VNXe 3300 performance follow up (EFDs and RR settings)


In my previous post about VNXe 3300 performance I presented results from the performance tests I had done with the VNXe 3300. I will use those results as a baseline for the new tests that I ran recently. In this post I compare the performance achieved with different Round Robin policy settings. I also had a chance to test the performance of EFD disks on the VNXe.

Round Robin settings

In the previous post all the tests were run with the default RR settings, which means that ESX sends 1000 commands down one path before switching to the next path. I observed that with the default RR settings I was only getting the bandwidth of one link on the four-port LACP trunk. I got some feedback from Ken advising me to change the RR IO operation limit from the default of 1000 to 1 in order to get two links' worth of bandwidth from the VNXe. So I wanted to test what kind of effect this change would have on performance.

Arnim van Lieshout has a really good post about configuring RR using PowerCLI, and I used his examples to change the IO operation limit from 1000 to 1. If you are not comfortable running the full PowerCLI scripts Arnim introduced in his post, here is how the RR settings for an individual device can be changed using the GUI and a couple of simple PowerCLI commands:

1. Change the datastore path selection policy to RR (from the vSphere client – select the host – Configuration – Storage Adapters – iSCSI sw adapter – right-click the device and select Manage Paths – for Path Selection choose Round Robin (VMware) and click Change)

2. Open PowerCLI and connect to the server

Connect-VIServer -Server [servername]

3. Retrieve esxcli instance

$esxcli = Get-EsxCli

4. Change the device IO Operation Limit to 1 and set the Limit Type to Iops. The [deviceidentifier] can be found in the vSphere client's iSCSI sw adapter view and is in the format naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.

$esxcli.nmp.roundrobin.setconfig($null,"[deviceidentifier]",1,"iops",$null)

5. Check that the changes were completed.

$esxcli.nmp.roundrobin.getconfig("[deviceidentifier]")
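
If you have many devices to change, running the commands one device at a time gets tedious. Below is a minimal PowerCLI sketch, built on the same esxcli calls used above, that loops over every device currently using the Round Robin policy and sets the IO operation limit to 1. The property names Device and PathSelectionPolicy are assumptions about the esxcli output object and may differ between PowerCLI versions, so verify them against Arnim's full scripts before using this on a production host.

# Sketch: apply the 1 IO round robin limit to every RR device on the host
$esxcli = Get-EsxCli
foreach ($dev in $esxcli.nmp.device.list()) {
    # Only touch devices that already use the Round Robin policy (VMW_PSP_RR)
    if ($dev.PathSelectionPolicy -eq "VMW_PSP_RR") {
        $esxcli.nmp.roundrobin.setconfig($null, $dev.Device, 1, "iops", $null)
    }
}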

Results 1

For these tests I used the same environment and Iometer settings that I described in my Hands-on with VNXe 3300 Part 6: Performance post.

Results 2

For these tests I used the same environment, except that instead of the virtual Win 2003 I used a virtual Win 2008 (1vCPU and 4GB memory), and the following Iometer settings (I picked these settings up from the VMware Community post Open unofficial storage performance thread):

Max Throughput-100%Read

  • 1 Worker
  • 8000000 sectors max disk size
  • 64 outstanding I/Os per target
  • 500 transactions per connection
  • 32KB transfer request size
  • 100% sequential distribution
  • 100% Read distribution
  • 5 minute run time

Max Throughput-50%Read

  • 1 Worker
  • 8000000 sectors max disk size
  • 64 outstanding I/Os per target
  • 500 transactions per connection
  • 32KB transfer request size
  • 100% sequential distribution
  • 50% read/write distribution
  • 5 minute run time

RealLife-60%Rand-65%Read

  • 1 Worker
  • 8000000 sectors max disk size
  • 64 outstanding I/Os per target
  • 500 transactions per connection
  • 8KB transfer request size
  • 40% sequential / 60% random distribution
  • 65% read / 35% write distribution
  • 5 minute run time

Random-8k-70%Read

  • 1 Worker
  • 8000000 sectors max disk size
  • 64 outstanding I/Os per target
  • 500 transactions per connection
  • 8KB transfer request size
  • 100% random distribution
  • 70% read / 30% write distribution
  • 5 minute run time

[Updated 11/29/11] After I had published this post, Andy Banta gave me a hint on Twitter:

"You might squeeze more by using a number like 5 to 10, skipping some of the path change cost."

So I ran a couple more tests, varying the IO operation limit between 5 and 10. With the 28 disk pool there was no big difference between a value of 1 and values of 5-10. With the EFDs the magic number seemed to be 6, and with that I managed to get 16 MBps and 1100 IOps more out of the disks with specific workloads. I added the new EFD results to the graphs.
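
For anyone who wants to repeat this kind of tuning, here is a hypothetical PowerCLI sketch for sweeping the IO operation limit on a single device. The [deviceidentifier] placeholder is the same naa identifier used earlier, and running the Iometer workload between settings remains a manual step.

# Sketch: step the round robin IO operation limit from 1 to 10 for one device
$esxcli = Get-EsxCli
foreach ($iops in 1..10) {
    $esxcli.nmp.roundrobin.setconfig($null, "[deviceidentifier]", $iops, "iops", $null)
    Write-Host "IO operation limit set to $iops - run the Iometer workload now and record the results"
    Read-Host "Press Enter to continue to the next value"
}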


Conclusions

Changing the RR IO operation limit from the default 1000 to 1 really makes a difference on the VNXe. With random workloads there is not much difference between the two settings, but with sequential workloads the difference is significant: sequential write IOps and throughput more than double with certain block sizes when using the 1 IO setting. If you have ESX hosts connected to a VNXe with an LACP trunk, I would recommend changing the RR IO operation limit from the default 1000 to a low value such as 1 or something between 5 and 10. As I already mentioned, Arnim has a really good post about configuring RR settings using PowerCLI. Another good post about multipathing is A “Multivendor Post” on using iSCSI with VMware vSphere by Chad Sakac.

Looking at the results it is obvious that the EFD disks perform much better than the SAS disks. On sequential workloads the 28 disk SAS pool's performance is about the same as the 5 disk EFD RG's, but on random workloads the EFDs perform about two times better than the SAS pool. There was no other load on the disks while these tests were run, so under additional load I would expect the EFDs to do much better on sequential workloads as well. Better performance doesn't come without a bigger price tag: EFD disks are still over 20 times more expensive per TB than SAS disks, but then again SAS disks are about 3 times more expensive per IO than EFD disks.

Now if only EFDs could be used as cache on VNXe.

Disclaimer

These results reflect the performance of the environment that the tests were run in. Results may vary depending on the hardware and how the environment is configured.


EFD vs. FC Pools


Our CX4 with FLARE 30 has been in production for about six months now and we decided to add some more FAST Cache to it. It currently has two mirrored 100GB EFDs configured as FAST Cache and we just got two new 100GB disks to be added to the cache. We have also been pondering whether we should add EFDs to the current pools for databases. Before adding the two new disks to the cache I wanted to run some performance tests on the EFDs. I also wanted to compare the EFD performance with the performance of the current pools that we have in production.

The focus of these tests was to see whether the EFDs would have the desired performance advantage over the current pools that we already have in use. As I mentioned, we already have 100GB of FAST Cache in use, and it is also enabled on the pools that I used for these tests.

I used Iometer to generate the load and to gather the results. In the past I have done Iometer tests against storage arrays that were not in any other use. In those cases I used the Iometer setup described in VMware's Recommendations for Aligning VMFS Partitions document. Using those settings now would have been time consuming and would also have generated a huge load on the production CX. Since I was only focusing on comparing a simulated database load on different disk configurations, I decided to run the tests with only one transfer request size.

While I was creating the disks for the test I decided to add a couple more disks and run some additional tests. I was curious to see how a properly aligned disk would really perform compared to an unaligned one, and also what kind of performance difference there was between VMFS and raw disks. Yes, I know that the VMware document I mentioned above already proves that an aligned disk performs better than an unaligned one; I just wanted to know what the case was in our environment.

Test environment [list edited 9/7/2011]

  • CX4-240 with 91GB FAST Cache
  • Dell PE M710HD Blade server
  • 2x Dell PowerConnect switches
  • Two 10G iSCSI NICs with a total of four paths between storage and ESXi. Round Robin path selection policy enabled for each LUN with two active I/O paths.
  • ESXi 4.1U1
  • Virtual Win 2003 SE SP2 (1vCPU and 2GB memory)

Disk Configurations

  • 15 FC Disk RAID5 Pool with FAST Cache enabled
    • 50GB LUN for VMFS partition
      • 20GB unaligned virtual disk (POOL_1_vmfs_u)
      • 20GB aligned virtual disk (POOL_1_vmfs_a)
    • 20GB LUN for unaligned RAW disk (POOL_1_raw_u)
    • 20GB LUN for aligned RAW disk (POOL_1_raw_a)
  • 25 FC Disk RAID5 Pool with FAST Cache enabled
    • 50GB LUN for VMFS partition
      • 20GB unaligned virtual disk (POOL_2_vmfs_u)
      • 20GB aligned virtual disk (POOL_2_vmfs_a)
    • 20GB LUN for unaligned RAW disk (POOL_2_raw_u)
    • 20GB LUN for aligned RAW disk (POOL_2_raw_a)
  • 2 EFD Disk RAID1 RAID Group
    • 50GB LUN for VMFS partition
      • 20GB unaligned virtual disk (EFD_vmfs_u)
      • 20GB aligned virtual disk (EFD_vmfs_a)
    • 20GB LUN for unaligned RAW disk (EFD_raw_u)
    • 20GB LUN for aligned RAW disk (EFD_raw_a)

Raw disks were configured to use physical compatibility mode on ESXi.
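
As an illustration, a raw device mapping in physical compatibility mode can be added to a VM with PowerCLI roughly as follows. The VM name and the naa device path are hypothetical placeholders; this is a sketch of the approach, not the exact commands used for these tests.

# Sketch: attach a raw device mapping in physical compatibility mode to a test VM
$vm = Get-VM -Name "iometer-test-vm"
New-HardDisk -VM $vm -DiskType RawPhysical -DeviceName "/vmfs/devices/disks/naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"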

Unaligned disks were configured using Windows Disk Management and formatted using default values.

Partitions on the aligned disks were created using the diskpart command 'create partition primary align=1024' and the partitions were formatted with a 32K allocation unit size.
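
For reference, the full sequence for one of the aligned disks would look roughly like the following. The disk number and drive letter are assumptions for this example; adjust them to match your own Disk Management layout.

diskpart
DISKPART> select disk 1
DISKPART> create partition primary align=1024
DISKPART> assign letter=E
DISKPART> exit
format E: /FS:NTFS /A:32K /Q

The /A:32K switch on the format command corresponds to the 32K allocation unit size mentioned above.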

Iometer configuration

  • 1 Worker
  • 8KB transfer request size
  • Read/write ratio of 66/34 and 100% random distribution
  • 8 outstanding I/Os per target
  • 4 minute run time
  • 60 sec ramp-up time

Results

Each Iometer test against a specific disk was repeated three times, and the results are the average of those three runs. Keep in mind that the array was serving over 100 production VMs during the tests, so these results are not absolute.

Conclusions

Comparing the results for unaligned and aligned disks, there are no huge differences, although the POOL_1_raw_u and POOL_2_vmfs_u results do jump out from the charts. I did three more test runs for those disks and still got the same results. This might have something to do with the production load that we have on the CX.

The performance differences between raw disks and disks on VMFS were also not major, but still noticeable, e.g. the difference in IOps between POOL_2_vmfs_a and POOL_2_raw_a is over 200, and the EFD raw disk also gives about 200 more IOps than the EFD VMFS disk.

Let's get to the point. The whole purpose of these tests was to compare FC pool and EFD performance, and if you haven't noticed from the graphs, the difference is HUGE! Do I even have to say more? I think the graphs have spoken. Those 5000+ IOps were achieved with only two EFDs. Think about having a whole array full of those.

After these tests my suggestion is to use VMFS datastores instead of raw disks. There are still some cases where you might need to use raw disks with virtual machines, e.g. when building a physical/virtual cluster. Aligning Windows Server disks is not a big issue anymore because Windows Server 2008 does it automatically, but if you still have old Windows Server 2003 installations I would suggest checking whether the disks are aligned. There is a Microsoft KB article that describes how to check disk alignment. If the server disks are not aligned you might want to start planning to move your data to aligned disks. As for the EFDs, the performance gain is self-evident. EFDs are still a bit expensive, but think about the price of the arrays and disks needed to deliver the same IOps that the EFDs can provide. In some cases you need to think more about the price per IO than the price per GB.
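
If you want a quick way to check alignment from within the guest, the following PowerShell sketch reads the partition starting offsets via WMI, which mirrors what the KB method does with wmic. The 4096-byte divisibility check is a rule-of-thumb assumption for this example; the KB article remains the authoritative reference.

# Sketch: list partition starting offsets and flag ones that are not 4KB-aligned
Get-WmiObject Win32_DiskPartition |
    Select-Object Name, Index, StartingOffset |
    ForEach-Object {
        $aligned = ($_.StartingOffset % 4096) -eq 0
        "{0} (offset {1} bytes) aligned: {2}" -f $_.Name, $_.StartingOffset, $aligned
    }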

Disclaimer

These results reflect the performance of the environment that the tests were run in. Results may vary depending on the hardware and how the environment is configured.

