Hidden VNXe performance statistics


The latest Operating Environment upgrades have already brought some improvements to the statistics shown through the Unisphere GUI. The first VNXe OE that I worked with showed only CPU statistics. Then, with update 2.1.0, Network Activity and Volume Activity statistics became available. I was still hoping for more: IOPS and latency graphs would have been nice additions. So I did some digging and found that the VNXe actually gathers lots of statistics parameters, but they are just stored in databases, maybe for support purposes.

Where is the data stored?

When you log in to the VNXe via SSH using the service account and list the contents of the folder /EMC/backend/perf_stats, you will see several db files in that folder.

Opening one of the files with Notepad makes it quite clear what kind of databases these are: the file header identifies them as SQLite.
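The same check can be scripted instead of eyeballing the file in an editor. SQLite 3 database files begin with the 16-byte string "SQLite format 3" followed by a NUL byte, so a minimal Python sketch only needs to read the first bytes of each file. The function names are illustrative, and the sketch assumes the db files have been copied from the VNXe to a local folder.

```python
from pathlib import Path

# Every SQLite 3 database file starts with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def is_sqlite_file(path):
    """Return True if the file starts with the SQLite 3 header string."""
    with open(path, "rb") as f:
        return f.read(16) == SQLITE_MAGIC


def find_sqlite_files(folder):
    """Return the sorted names of the .db files in folder that are SQLite databases."""
    return sorted(p.name for p in Path(folder).glob("*.db") if is_sqlite_file(p))


# Usage (assumes a local copy of the perf_stats folder, a hypothetical path):
# print(find_sqlite_files("perf_stats"))
```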

How to read the data?

Now that we know the data is stored in SQLite databases, the next step is to export it to a readable format. For this the SQLite shell is needed. SQLite is really simple to use: just download the shell and run a couple of commands.

Opening the database, selecting the output file, and exporting all the data takes only three commands in the SQLite shell.

Now all the content of the database has been exported to stats_basic_summary.txt. The data can now be imported into a spreadsheet or into another database.
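For those who would rather script the export, here is a minimal Python sketch using the standard sqlite3 module. It is an alternative to the shell procedure, not a reproduction of it: `iterdump()` writes the database out as SQL text, much like the shell's `.dump` command. The function name is illustrative, and the sketch assumes the db file has been copied to the machine running the script.

```python
import sqlite3


def export_database(db_path, out_path):
    """Write the full contents of a SQLite database to a text file as SQL statements."""
    conn = sqlite3.connect(db_path)
    try:
        with open(out_path, "w", encoding="utf-8") as out:
            # iterdump() yields the CREATE TABLE and INSERT statements one by one.
            for statement in conn.iterdump():
                out.write(statement + "\n")
    finally:
        conn.close()


# Usage, with the file names from the text:
# export_database("stats_basic_summary.db", "stats_basic_summary.txt")
```

The resulting text file contains SQL rather than delimited columns, so it suits loading into another database better than opening directly in a spreadsheet.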

What data is stored in the databases?

There are actually a lot of parameters and data in those databases. Here are just a few of the parameters.

DART parameters in stats_basic_default.db:

SysClockUnixms
NetBasicBytesIn
NetBasicBytesOut
NetInPackets
NetOutPackets
TCPInPackets
TCPOutPackets
UDPInPackets
UDPOutPackets
StoreReadBytes
StoreWriteBytes
StoreReadRequests
StoreWriteRequests

DART parameters in stats_basic_summary.db:

NetBasicBytesIn
NetBasicBytesOut
NetInPackets
NetOutPackets
TCPInPackets
TCPOutPackets
UDPInPackets
UDPOutPackets
StoreWriteBytes
StoreReadBytes
StoreReadRequests
StoreWriteRequests
KernelBufCacheHits
kernelBufCacheLookups
CifsActiveConnections
CifsTotalConnections
CifsBasicReadBytes
CifsBasicReadOpCount
CifsBasicWriteBytes
CifsBasicWriteOpCount
FsDnlcHits
FsDnlctotal
FsOfCachehits
FsOfCachetotal
NfsActiveConnections
NfsBasicReadBytes
NfsBasicReadOpCount
NfsBasicWriteBytes
NfsBasicWriteOpCount
iSCSIBasicReads
iSCSIReadBytes
iSCSIBasicWrites
iSCSIWriteBytes

FLARE_SP parameters in stats_basic_summary.db:

HardErrorCount
HighWaterMarkFlushOff
IdleFlushOn
LowWaterMarkFlushOff
writeCacheFlushes
writeCacheBlocksFlushed
ReadHitRatio
SPTimestamp
SumOfQueueLengths
arrivalsToNonzeroQueue
SumOfLUNBlkRead
SumOfLUNBlkWrite
SumOfLUNDiskRead
SumOfLUNDIskWrite
SumOfLUNDiskBlkRead
SumOfLUNDiskBlkWrite
SumOfFRUBlkRead
SumOfFRUBlkWrite
SumOfFRUReadCount
SumOfFRUWriteCount
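Since the lists above are only a subset, it can be useful to enumerate what a given database actually contains. The sketch below lists every table and its columns using SQLite's own catalog (`sqlite_master` and `PRAGMA table_info`). It makes no assumption about how the VNXe databases are laid out; whether the parameters appear as table names, column names, or row values is something to check in the output. The function name is illustrative.

```python
import sqlite3


def list_tables_and_columns(db_path):
    """Return {table_name: [column_name, ...]} for every table in the database."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        schema = {}
        for table in tables:
            # Quote the table name so unusual characters do not break the PRAGMA.
            quoted = '"' + table.replace('"', '""') + '"'
            # PRAGMA table_info returns one row per column; index 1 is the column name.
            columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            schema[table] = [col[1] for col in columns]
        return schema
    finally:
        conn.close()


# Usage (assumes a local copy of the file):
# for table, columns in list_tables_and_columns("stats_basic_summary.db").items():
#     print(table, columns)
```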

How can that data be used?

I'll take the StoreReadRequests parameter from stats_basic_default.db as an example. Some of the parameters have descriptions, and this is one of them:

Total number of read requests on all DART volumes

After the data has been imported into a spreadsheet, each row has a time stamp and a value for StoreReadRequests. It seems that the number of read requests recorded during each five-minute period is added to the previous value and then inserted as a new entry in the database. So by subtracting the earlier value from the newer one we get the total number of read requests on all DART volumes for that five-minute period:

4267021177 – 4266973002 = 48175

Now if we divide that result by 300 (seconds) we get the average number of read requests per second on all DART volumes during that five-minute period:

48175 / 300 = 160.58
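The arithmetic above can be applied to a whole series of samples with a short Python sketch. It assumes the values are cumulative counters, as the data suggests, and takes (time stamp in seconds, counter value) pairs. The function name and the input format are illustrative; the actual column layout depends on how the data was exported.

```python
def rates_per_second(samples):
    """Convert cumulative counter samples into average per-second rates.

    samples: list of (unix_seconds, counter_value) tuples, oldest first.
    Returns a list of (unix_seconds, rate) tuples, one per interval,
    stamped with the time of the later sample.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        delta = v1 - v0
        if elapsed <= 0 or delta < 0:
            # Assumption: a negative delta means the counter was reset or
            # wrapped, so the interval is skipped rather than guessed at.
            continue
        rates.append((t1, delta / elapsed))
    return rates


# The two values from the example above, taken 300 seconds apart.
# The time stamps 0 and 300 are placeholders, not real sample times.
example = [(0, 4266973002), (300, 4267021177)]
print(rates_per_second(example))
```

Using the real time stamps rather than a fixed 300 seconds keeps the rates correct even if a sample is missing or arrives late.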

With some spreadsheet magic it is easy to create a nice “requests per second” graph from the data.

How can I be sure that my theory is correct?

Well, the NetBasicBytesIn and NetBasicBytesOut values in stats_basic_default.db also grow with every time stamp. These parameters are also described in the database: total bytes DART received from, or sent to, all NICs. So I used the same math to build a graph of the network statistics for the past 24 hours. I then compared that graph with Unisphere’s network activity graph, and the two matched.

The graph that I put together using the values from the database and the formula introduced earlier:

Unisphere network activity graph:

Conclusions

I really hope that EMC will bring more statistics to the GUI or introduce an easier way to export the data to a readable format. From what I’ve heard, Clint Kitson from EMC has already written some scripts for pulling the stats from the VNXe, but they have not yet been published for customers. Digging into the databases is kind of a quick and dirty way to get more statistics out of the VNXe, but it seems to work.
