Tag Archives: IOps

VNXe OE 2.4 and 2.5″ form factor disks

Once again I had a chance to play around with some shiny new hardware. And once again the hardware was a VNXe 3300, but this time it was something that I hadn’t seen before: 2.5” form factor with 46 600GB 10k disks. If you have read about the new RAID configurations in OE 2.4.0 you might figure out what kind of configuration I have in mind for this HW.

In this post I will go through some of the new features introduced in VNXe OE 2.4.0 and do some configuration comparisons between 3.5” and 2.5” form factors and also between VNXe and VNX. Of course I had to do some performance testing with the new RAID configurations as well, so I will present the results later in this post.

VNXe OE release notes

Customizable Dashboard

Along with the new OE came the ability to customize the UI dashboard. The look of the Unisphere UI on a new or upgraded VNXe is now similar to Unisphere Remote. You can customize the dashboard, create new tabs and add the desired view blocks to the tabs.

VNXe dashboard



Some of the operations are now run as background jobs, so you don’t have to wait for the operation to finish. The steps of each operation are also shown in more detail on the jobs page. The number of active jobs is also shown next to the alerts on the status bar, depending on which page you are on.


New RAID configurations

Now this is one of the enhancements that I’ve been waiting for, because VNXe can only utilize four RAID groups in a pool. With the previous OE this meant that a datastore in a 6+1 RAID 5 pool could only utilize 28 disks. Now with the 10+1 RAID 5 pool structure a datastore can utilize as many as 44 disks. This also means increased max IOps per datastore. With 3.5” form factor 15k disks the RAID 5 pool max IOps is increased from ~4900 to ~7700, and with 2.5” form factor 10k disks the RAID 5 pool max IOps is increased from ~3500 to ~5500. IOps is not the only thing to look at. The size of the pool matters too, and so does the rack space that the VNXe will use. While I was sizing the last VNXe that we ordered I made this comparison chart to compare the pool size, IOps and rack space with different disk form factors in VNX and VNXe.
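The disk counts and IOps figures above can be reproduced with some simple arithmetic. The sketch below is only an illustration: the per-disk IOps values (175 for 15k and 125 for 10k disks) are my assumption, picked because they happen to match the rounded pool numbers quoted above, and the function names are made up for this example.

```python
# Assumed per-disk IOps figures (not stated in the post); they are chosen
# because they reproduce the rounded pool numbers quoted above.
ASSUMED_DISK_IOPS = {"15k": 175, "10k": 125}

# VNXe limit mentioned in the post: four RAID groups per pool.
MAX_RAID_GROUPS_PER_POOL = 4


def pool_disks(data_disks: int, parity_disks: int = 1) -> int:
    """Number of disks in a pool built from the maximum number of RAID groups."""
    return MAX_RAID_GROUPS_PER_POOL * (data_disks + parity_disks)


def pool_iops(data_disks: int, disk_type: str) -> int:
    """Rough max IOps estimate: disk count times the assumed per-disk IOps."""
    return pool_disks(data_disks) * ASSUMED_DISK_IOPS[disk_type]


for layout, data_disks in (("6+1", 6), ("10+1", 10)):
    for disk_type in ("15k", "10k"):
        print(
            f"RAID 5 {layout}, {disk_type}: "
            f"{pool_disks(data_disks)} disks, ~{pool_iops(data_disks, disk_type)} IOps"
        )
```

This is a raw disk count estimate only; it ignores RAID write penalty, cache and SP limits.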


An interesting setup with the VNXe 3150 and 2.5” form factor disks is the 21TB and 5500 IOps packed in 4U of rack space. A VNXe 3300 with the same specs would take 5U and a VNX5300 would take 6U. Of course the SP performance is a bit different between these arrays, but so is the price.


I’ve already posted some performance test results from VNXe 3100 and 3300, so I added those results to the charts for comparison. I’ve also run some tests on VNX 5300 that I haven’t posted yet and added those results to the charts as well.







There is a significant difference in the max throughput between 1G and 10G modules on VNXe. Then again the real life test results are quite similar.


These results reflect the performance of the environment that the tests were run in. Results may vary depending on the hardware and how the environment is configured.

Hidden VNXe performance statistics revised

In my previous post covering the VNXe hidden statistics I explained where to find the “hidden” statistics files and how to extract the data into usable format. Now it seems that EMC has changed the statistics gathering interval from 5 minutes to 30 minutes. I started playing around with the new data and created a spreadsheet template that generates graphs for IOps and MB/s for the past 2 months, 1 month, 2 weeks, 1 week and 24 hours. In this post I will share the template and also explain how to use it.

Exporting data from SQLite database and importing it into the spreadsheet

The first two steps are explained in more detail in my previous post

  • Get stats_basic_summary.db and old.stats_basic_summary.db files from VNXe
  • Export data from those files to stats_basic_summary.txt and old.stats_basic_summary.txt files.
  • Download the spreadsheet template from here. (I have tested it with OpenOffice and Microsoft Excel and it works better with Excel. I was planning to use Google Docs but there is just too much data on the spreadsheet so it didn’t work.)
  • Import the stats_basic_summary.txt content into the spreadsheet’s stats_basic_summary.db sheet starting from A1, using the Delimited data type and tab + comma delimiters
  • Import the old.stats_basic_summary.txt content into the spreadsheet’s old.stats_basic_summary.db sheet starting from A1, using the Delimited data type and tab + comma delimiters

If you now go to the “2 months” sheet the graphs might look like this:

Removing zeros from the statistics

Sometimes VNXe seems to fail updating the statistics to the database and only zeros are added. When using the data without taking the zeros out it will produce graphs as shown above. Rows 43-2772 on the imported sheets are used for the statistics and it is important to find all zero rows to get the graphs working properly.

Data from the previous row should be copied to replace the zeros from column D onward. All the statistics values seem to be running counters, and for the graphs the previous value is subtracted from the later value. So replacing the zeros with the values from the previous row will make that particular timestamp show as zero on the graph.

After all the zeros are replaced in most cases the statistics and graphs will show the correct values.
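If you would rather clean the data before importing it, the zero replacement can be sketched in a few lines of Python. This is only an illustration of the manual method described above: the function name and the sample rows are made up, and it assumes that the counters start in column D (index 3) and that an all-zero row is always a failed update.

```python
def fill_zero_rows(rows, first_counter_col=3):
    """Replace all-zero counter rows with the counters from the previous row.

    rows is a list of lists in time order. Columns before first_counter_col
    (A-C in the spreadsheet) are left untouched.
    """
    cleaned = []
    for row in rows:
        counters = row[first_counter_col:]
        if cleaned and counters and all(value == 0 for value in counters):
            # Copy the previous row's counters so the delta for this
            # timestamp becomes zero instead of a huge negative spike.
            row = row[:first_counter_col] + cleaned[-1][first_counter_col:]
        cleaned.append(list(row))
    return cleaned


# Hypothetical sample data: id, two label columns, then two running counters.
sample = [
    [1, "x", "y", 100, 200],
    [2, "x", "y", 0, 0],      # failed update, only zeros were written
    [3, "x", "y", 160, 260],
]
for row in fill_zero_rows(sample):
    print(row)
```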

Performance counters reset during the data gathering

If the statistics and graphs are still not showing the correct values after removing the zeros from the data the issue might be that the performance counters were reset during the data gathering period. When this happens there might be a row of zeros before the reset.

To fix this the zero row should be replaced as described above and the row below the zeros should be deleted.


This method for gathering and presenting the statistics is not approved or confirmed by EMC. This is something I have found and it seems to work in the environments I work with. So the statistics might not be accurate.

Hidden VNXe performance statistics

The latest Operating Environment upgrades have already brought some improvements to the statistics that are shown through the Unisphere GUI. The first VNXe OE that I worked with was showing only CPU statistics. Then along with update 2.1.0 Network Activity and Volume Activity statistics became available. I was still hoping to get some more statistics. IOps and latency graphs would have been nice additions. So I did some digging and found out that there are actually lots of statistics parameters that VNXe gathers, but those are just stored in the database, maybe for support purposes.

Where is the data stored?

When logging in to the VNXe via SSH using service account and listing the content of the folder /EMC/backend/perf_stats you will see that there are several db-files in that folder.

Now when opening the file with notepad it is quite clear what kind of databases those are:

How to read the data?

Now that we know that the data is stored in SQLite database the next thing is to export the data to readable format. To do this SQLite shell is needed. SQLite is really simple to use, just download shell and run a couple of commands.

Opening the database, selecting the output file and exporting all the data can be done with just three commands:

Now all the content of the database is exported to stats_basic_summary.txt. Data can now be imported to spreadsheet or to another database.
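If you don’t want to use the SQLite shell, the same kind of export can be sketched with Python’s built-in sqlite3 module. This is not necessarily what the shell commands above produce line for line; it is a minimal sketch that writes the whole database as SQL text, and the function name is just something I made up for the example.

```python
import sqlite3


def export_database(db_path: str, out_path: str) -> int:
    """Dump every table of an SQLite database to a text file as SQL statements.

    Returns the number of statements written. Note that sqlite3.connect()
    creates an empty database if db_path does not exist, so check the path.
    """
    conn = sqlite3.connect(db_path)
    try:
        count = 0
        with open(out_path, "w", encoding="utf-8") as out:
            # iterdump() yields the schema and the rows as SQL statements.
            for statement in conn.iterdump():
                out.write(statement + "\n")
                count += 1
    finally:
        conn.close()
    return count
```

For example, `export_database("stats_basic_summary.db", "stats_basic_summary.txt")` would write the content of a copied database file to a text file in the current folder.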

What data is stored in the databases?

Actually there are a lot of parameters and data in those databases. Here are just a few of the parameters.

DART parameters in stats_basic_default.db:


DART parameters in stats_basic_summary.db:


FLARE_SP parameters in stats_basic_summary.db:


How can that data be used?

I take the StoreReadRequests parameter from stats_basic_default.db as an example. Some of the parameters have descriptions and this is one of those:

Total number of read requests on all DART volumes

Here is the format that the data is in after being imported to a spreadsheet:

There is a time stamp and also a value for StoreReadRequests. It seems that the number of read requests recorded during the five minute period is added to the old value and then inserted as a new entry in the database. So by subtracting the earlier value from the new one we get the total number of read requests for all DART volumes for that specific five minute period:

4267021177 – 4266973002 = 48175

Now if we divide that result by 300 (seconds) we get the average number of read requests per second on all DART volumes during that specific five minute period:

48175 / 300 = 160.58
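The same math for a whole series of samples can be sketched in Python. The function name is made up for this example, and the 300 second default is the five minute interval discussed above; with the 30 minute interval mentioned in the revised post the interval would be 1800 seconds.

```python
def per_second_rates(samples, interval_seconds=300):
    """Turn cumulative counter samples (in time order) into per-second averages.

    Each result is (later value - earlier value) / interval_seconds.
    """
    return [
        (later - earlier) / interval_seconds
        for earlier, later in zip(samples, samples[1:])
    ]


# The two StoreReadRequests values used in the example above.
rates = per_second_rates([4266973002, 4267021177])
print(round(rates[0], 2))  # 160.58
```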

With some spreadsheet magic it is easy to create a nice “requests per second” graph from the data:

How can I be sure that my theory is correct?

Well, NetBasicBytesIn and NetBasicBytesOut parameter values in the stats_basic_default.db are also growing with every time stamp. These are also defined in the database: Total Bytes DART received/sent from all NICs. So I used the same math to do a graph showing network statistics for the past 24 hours. I then compared that graph with the Unisphere’s network activity graph and those were matching.

The graph that I put together using the values from the database and the formula introduced earlier:

Unisphere network activity graph:


I really hope that EMC will bring more statistics to the GUI or introduce an easier way to export the data to a readable format. From what I’ve heard Clint Kitson from EMC has already written some scripts for pulling the stats from VNXe, but they are not yet published for customers. Digging into the databases is kind of a quick and dirty way to get more statistics out of the VNXe, but it seems to be working.

VNXe 3300 performance follow up (EFDs and RR settings)

In my previous post about VNXe 3300 performance I introduced results from the performance tests I had done with VNXe 3300. I will use those results as a comparison for the new tests that I ran recently. In this post I will compare the performance with different Round Robin policy settings. I also had a chance to test the performance of EFD disks on VNXe.

Round Robin settings

In the previous post all the tests were run with the default RR settings, which means that ESX would send 1000 commands through one path before changing the path. I observed that with the default RR settings I was only getting the bandwidth of one link on the four port LACP trunk. I got some feedback from Ken advising me to change the RR IO operation limit from the default 1000 to 1 to get two links’ worth of bandwidth from VNXe. So I wanted to test what kind of effect this change would have on performance.
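To get a feeling for why the IO operation limit matters, here is a very simplified Python model of round robin path selection. It is only a sketch under my own assumptions: it looks at one window of outstanding I/Os, switches paths strictly after the configured number of commands, and ignores everything else that ESX and the LACP hashing do. The function name is made up for this example.

```python
from collections import Counter


def distribute_ios(total_ios, path_count, io_operation_limit):
    """Count how many I/Os of a burst land on each path.

    Simplified model: the path changes after every io_operation_limit
    commands, cycling through the paths in order.
    """
    counts = Counter()
    for io_number in range(total_ios):
        path = (io_number // io_operation_limit) % path_count
        counts[path] += 1
    return [counts[path] for path in range(path_count)]


# A window of 64 outstanding I/Os over two paths.
print(distribute_ios(64, 2, 1000))  # [64, 0]
print(distribute_ios(64, 2, 1))     # [32, 32]
```

In this model the default limit keeps the whole window on a single path, while a limit of 1 spreads it over both paths at the same time.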

Arnim van Lieshout has a really good post about configuring RR using PowerCLI and I used his examples for changing the IO operation limit from 1000 to 1. If you are not confident running the full PowerCLI scripts Arnim introduced in his post, here is how the RR settings for an individual device can be changed using the GUI and a couple of simple PowerCLI commands:

1. Change datastore path selection policy to RR (from vSphere client – select host – configure – storage adapters – iSCSI sw adapter – right click the device and select manage paths – for path selection select Round Robin (VMware) and click change)

2. Open PowerCLI and connect to the server

Connect-VIServer -Server [servername]

3. Retrieve esxcli instance

$esxcli = Get-EsxCli

4. Change device IO Operation Limit to 1 and set Limit Type to Iops. [deviceidentifier] can be found from vSphere client’s iSCSI sw adapter view and is in format of naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.


5. Check that the changes were completed.


Results 1

For these tests I used the same environment and Iometer settings that I described in my Hands-on with VNXe 3300 Part 6: Performance post.

Results 2

For these tests I used the same environment except instead of virtual Win 2003 I used virtual Win 2008 (1vCPU and 4GB memory) and the following Iometer settings (I picked up these settings from VMware Community post Open unofficial storage performance thread):

Max Throughput-100%Read

  • 1 Worker
  • 8000000 sectors max disk size
  • 64 outstanding I/Os per target
  • 500 transactions per connection
  • 32KB transfer request size
  • 100% sequential distribution
  • 100% Read distribution
  • 5 minute run time

Max Throughput-50%Read

  • 1 Worker
  • 8000000 sectors max disk size
  • 64 outstanding I/Os per target
  • 500 transactions per connection
  • 32KB transfer request size
  • 100% sequential distribution
  • 50% read/write distribution
  • 5 minute run time


  • 1 Worker
  • 8000000 sectors max disk size
  • 64 outstanding I/Os per target
  • 500 transactions per connection
  • 8KB transfer request size
  • 40% sequential / 60% random distribution
  • 35% read / 65% write distribution
  • 5 minute run time


  • 1 Worker
  • 8000000 sectors max disk size
  • 64 outstanding I/Os per target
  • 500 transactions per connection
  • 8KB transfer request size
  • 100% random distribution
  • 30% read / 70% write distribution
  • 5 minute run time
[Updated 11/29/11] After I had published this post Andy Banta gave me a hint on Twitter:

“You might squeeze more by using a number like 5 to 10, skipping some of the path change cost.”

So I ran a couple more tests changing the IO operation limit to values between 5 and 10. With the 28 disk pool there was no big difference between the value 1 and the values 5-10. With the EFDs the magic number seemed to be 6, and with that I managed to get 16 MBps and 1100 IOps more out of the disks with specific workloads. I added the new EFD results to the graphs.


Changing the RR IO operation limit from the default 1000 IOs to 1 IO really makes a difference on VNXe. With random workloads there is not much difference between these two settings, but with sequential workloads the difference is significant. Sequential write IOps and throughput are more than double with certain block sizes when using the 1 IO setting. If you have ESX hosts connected to VNXe with an LACP trunk I would recommend changing the RR IO operation limit to a value between 5 and 10 (my original recommendation was 1). Like I already mentioned, Arnim has a really good post about configuring RR settings using PowerCLI. Another good post about multipathing is A “Multivendor Post” on using iSCSI with VMware vSphere by Chad Sakac.

Looking at the results it is obvious that EFD disks perform much better than SAS disks. With sequential workloads the 28 disk SAS pool’s performance is about the same as the 5 disk EFD RG’s. But with random workloads the EFDs’ performance is about two times better than the SAS pool’s. There was no other load on the disks while these tests were run, so under additional load I would expect the EFDs to perform much better with sequential workloads as well. Better performance doesn’t come without a bigger price tag. EFD disks are still over 20 times more expensive per TB than SAS disks, but then again SAS disks are about 3 times more expensive per IO than EFD disks.

Now if only EFDs could be used as cache on VNXe.


These results reflect the performance of the environment that the tests were run in. Results may vary depending on the hardware and how the environment is configured.
