Tag Archives: VNXe

VNXe OE 2.4 and 2.5″ form factor disks


Once again I had a chance to play around with some shiny new hardware. And once again the hardware was a VNXe 3300, but this time with something I hadn’t seen before: a 2.5” form factor enclosure with 46 x 600GB 10k disks. If you have read about the new RAID configurations in OE 2.4.0, you can probably guess what kind of configuration I have in mind for this hardware.

In this post I will go through some of the new features introduced in VNXe OE 2.4.0, do some configuration comparisons between the 3.5” and 2.5” form factors, and also between VNXe and VNX. Of course I had to do some performance testing with the new RAID configurations as well, so I will present those results later in this post.

VNXe OE 2.4.0.20932 release notes

Customizable Dashboard

Along with the new OE came the ability to customize the Unisphere dashboard. The look of the Unisphere UI on a new or upgraded VNXe is now similar to Unisphere Remote. You can customize the dashboard, create new tabs and add the desired view blocks to those tabs.

[Screenshot: VNXe dashboard]

Jobs

Some operations are now run as background jobs, so you no longer have to wait for the operation to finish. The steps of each operation are also shown in more detail on the jobs page. The number of active jobs is shown next to the alerts on the status bar, depending on which page you are on.

[Screenshot: Jobs page]

New RAID configurations

Now this is one of the enhancements that I’ve been waiting for, because a VNXe pool can only utilize four RAID groups. With the previous OE this meant that a datastore in a 6+1 RAID 5 pool could only utilize 28 disks. With the new 10+1 RAID 5 pool structure, datastores can utilize as many as 44 disks. This also means a higher maximum IOPS per datastore: for a 3.5” form factor 15k disk RAID 5 pool the maximum IOPS increases from ~4900 to ~7700, and for a 2.5” form factor 10k disk RAID 5 pool from ~3500 to ~5500. IOPS is not the only thing to look at, though. The size of the pool matters too, not to forget the rack space that the VNXe will use. While I was sizing the last VNXe that we ordered I made this comparison chart of pool size, IOPS and rack space with different disk form factors in VNX and VNXe.
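
As a sanity check, the IOPS figures above line up with a simple back-of-the-envelope calculation. The sketch below is my own rough math, not an EMC sizing tool, and it assumes common rule-of-thumb values of roughly 175 IOPS per 15k disk and 125 IOPS per 10k disk:

```python
# Rough RAID 5 pool IOPS ceiling: a VNXe pool uses four RAID groups, and every
# disk in a group is counted. The per-disk figures are rule-of-thumb
# assumptions, not EMC-published numbers.
def pool_max_iops(disks_per_group, groups, iops_per_disk):
    return disks_per_group * groups * iops_per_disk

PER_DISK_IOPS = {'15k SAS (3.5")': 175, '10k SAS (2.5")': 125}  # assumed values

for disk_type, per_disk in PER_DISK_IOPS.items():
    old = pool_max_iops(7, 4, per_disk)   # 6+1 RAID 5 x 4 groups = 28 disks
    new = pool_max_iops(11, 4, per_disk)  # 10+1 RAID 5 x 4 groups = 44 disks
    print(f"{disk_type}: ~{old} -> ~{new} IOPS")
```

That reproduces the ~4900 to ~7700 and ~3500 to ~5500 figures quoted above.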

[Chart: pool size, IOPS and rack space comparison between VNX and VNXe configurations]

An interesting setup is the VNXe 3150 with 2.5” form factor disks: 21TB and 5500 IOPS packed into 4U of rack space. A VNXe 3300 with the same specs would take 5U and a VNX5300 would take 6U. Of course the SP performance differs a bit between these arrays, but so does the price.

Performance

I’ve already posted some performance test results from the VNXe 3100 and 3300, so I added those results to the charts for comparison. I’ve also run some tests on a VNX5300 that I haven’t posted yet and added those results to the charts as well.

[Charts: average MB/s, average IOPS and average latency for the tested configurations]

There is a significant difference in maximum throughput between the 1G and 10G modules on the VNXe. Then again, the real-life test results are quite similar.

Disclaimer

These results reflect the performance of the environment that the tests were run in. Results may vary depending on the hardware and how the environment is configured.


Hidden VNXe performance statistics revisited


In my previous post covering the VNXe hidden statistics I explained where to find the “hidden” statistics files and how to extract the data into a usable format. Now it seems that EMC has changed the statistics gathering interval from 5 minutes to 30 minutes. I started playing around with the new data and created a spreadsheet template that generates graphs for IOPS and MB/s for the past 2 months, 1 month, 2 weeks, 1 week and 24 hours. In this post I will share the template and also explain how to use it.

Exporting data from SQLite database and importing it into the spreadsheet

The first two steps are explained in more detail in my previous post:

  • Get stats_basic_summary.db and old.stats_basic_summary.db files from VNXe
  • Export data from those files to stats_basic_summary.txt and old.stats_basic_summary.txt files (a scripted alternative is sketched after this list).
  • Download the spreadsheet template from here. (I have tested it with OpenOffice and Microsoft Excel and it works better with Excel. I was planning to use Google Docs but there is just too much data on the spreadsheet so it didn’t work.)
  • Import stats_basic_summary.txt content into the spreadsheet’s stats_basic_summary.db sheet, starting from A1, using the Delimited data type with tab and comma delimiters

  • Import old.stats_basic_summary.txt content into the spreadsheet’s old.stats_basic_summary.db sheet, starting from A1, using the Delimited data type with tab and comma delimiters
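
If you prefer to script the export instead of doing it by hand, something along these lines should work. This is only a minimal sketch: it assumes Python’s standard sqlite3 module and that the statistics live in a table named stats_basic_summary, so check the actual table name with `SELECT name FROM sqlite_master;` before relying on it.

```python
import sqlite3

def dump_stats(db_path, out_path, table="stats_basic_summary"):
    """Dump every row of the statistics table into a tab-delimited text file
    that can then be imported into the spreadsheet template."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(f"SELECT * FROM {table}").fetchall()
    finally:
        con.close()
    with open(out_path, "w") as out:
        for row in rows:
            out.write("\t".join(str(col) for col in row) + "\n")

for db in ("stats_basic_summary.db", "old.stats_basic_summary.db"):
    dump_stats(db, db.replace(".db", ".txt"))
```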

If you now go to the “2 months” sheet the graphs might look like this:

Removing zeros from the statistics

Sometimes the VNXe seems to fail to update the statistics in the database and only zeros are added. Using the data without taking the zeros out will produce graphs like the ones shown above. Rows 43-2772 on the imported sheets are used for the statistics, and it is important to find all the zero rows to get the graphs working properly.

The data from the previous row should be copied to replace the zeros from column D onward. All the statistics values appear to be running counters, and for the graphs the previous value is subtracted from the latter one. So replacing the zeros with the values from the previous row will make that particular timestamp show as zero on the graph.

After all the zeros are replaced, in most cases the statistics and graphs will show the correct values.
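
Doing this by hand over a couple of thousand rows gets tedious, so here is a rough sketch of the same fix applied to the exported text file before importing it into the spreadsheet. It assumes a tab-delimited export where the counter values start in the fourth column (column D); adjust first_value_col and the delimiter to match your export.

```python
def fill_zero_rows(lines, first_value_col=3):
    """Replace all-zero counter rows with the values from the previous row,
    mirroring the manual 'copy the previous row from column D onward' fix."""
    fixed = []
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        values = cols[first_value_col:]
        if fixed and values and all(v.strip() in ("0", "0.0") for v in values):
            prev = fixed[-1].rstrip("\n").split("\t")
            cols[first_value_col:] = prev[first_value_col:]
        fixed.append("\t".join(cols) + "\n")
    return fixed

with open("stats_basic_summary.txt") as f:
    cleaned = fill_zero_rows(f.readlines())
with open("stats_basic_summary.cleaned.txt", "w") as f:
    f.writelines(cleaned)
```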

Performance counters reset during the data gathering

If the statistics and graphs are still not showing the correct values after removing the zeros from the data, the issue might be that the performance counters were reset during the data gathering period. When this happens there may be a row of zeros just before the reset.

To fix this, the zero row should be replaced as described above and the row below the zeros should be deleted.
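
If you are scripting the cleanup, this case can be handled the same way. The sketch below is a hypothetical companion to fill_zero_rows above; only run it (on the raw export, before filling the zeros) once you have confirmed the counters were actually reset, since it simply drops the first row following every all-zero row.

```python
def drop_rows_after_zeros(lines, first_value_col=3):
    """When the counters were reset, drop the first row following each
    all-zero row (the 'row below the zeros'). Run on the raw export,
    before fill_zero_rows above."""
    result, skip_next = [], False
    for line in lines:
        values = line.rstrip("\n").split("\t")[first_value_col:]
        is_zero = bool(values) and all(v.strip() in ("0", "0.0") for v in values)
        if skip_next and not is_zero:
            skip_next = False
            continue  # this is the row right after the zeros: delete it
        result.append(line)
        if is_zero:
            skip_next = True
    return result
```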

Disclaimer

This method for gathering and presenting the statistics is not approved or confirmed by EMC. This is something I have found and it seems to work in the environments I work with. So the statistics might not be accurate.


EMC World 2012: Hands-On Labs


Hands-On Labs (HOL) are always on my priority list when attending conferences or local EMC/VMware forums. Recorded breakout sessions can be viewed after the conference, but HOLs are not available afterwards, at least not yet. The HOL setup was similar to the one at the last VMworld: most of the HOLs were running on virtual appliances and were accessed using zero/thin clients.

VNXe Labs

There were two VNXe Hands-on labs available:

VNXe Unisphere Administrator

Remote Monitoring of Multiple VNXe Systems

I did the first HOL, where the objectives were to create a CIFS server/share and a generic iSCSI server/datastore and then connect them to a Windows VM. For someone who has been working with CIFS and generic iSCSI servers this might already be a familiar topic. But for someone who has only been working with vSphere datastores on the VNXe, this was a good introduction to the CIFS and generic iSCSI side of the VNXe.

While I was at the lab I had a quick chat with Mike Gore from EMC, who is responsible for the VNXe labs at EMC World. I asked him why there weren’t any VNXe labs focusing on the vSphere side, and he mentioned that those could be available at future events and that the current labs are more of an introduction to the VNXe.

Unisphere Analyzer: Evaluating FAST Cache and FAST VP on VNX

I’ve been working with CLARiiONs for the past 8 years, so Navisphere, Unisphere and also Analyzer have become very familiar to me. I still wanted to do this HOL and see if there was something that could help me in the future when digging into Analyzer statistics. It was a very good lab for refreshing my memory, and it also gave some new hints on what to look for in Analyzer.

ProSphere Storage Resource Management

This was the most interesting HOL that I took. I’ve been looking into ProSphere since it was released but never had a chance to test it in our environment. Like I mentioned earlier, I’ve been using Unisphere Analyzer to dig into CLARiiON performance statistics, but it is really hard to see the overall performance using Analyzer. ProSphere gives a great overall view of the environment, including host, storage path and storage performance. I’m definitely going to use this in the near future.

RecoverPoint

I’ve also been using MirrorView for several years now and wanted to see what RecoverPoint would offer compared to MirrorView. The answer is simple: a lot more. Of course, when comparing these two it is good to first evaluate the data protection needs. RecoverPoint might be overkill just to replicate one VMware datastore and would not be the most cost-efficient way to do it. But it was a very useful lab and gave me a good overview of RecoverPoint and what it could be used for.

One can spend several hours viewing demos and reading documents, but in my opinion hands-on experience is the best way to learn new things. So once again EMC succeeded in delivering a good number of very well executed hands-on labs. Big thanks to the vSpecialists and other crew members who made the HOLs possible. I hope I can attend more HOLs at future events.

Also check out Chad’s post about the HOLs.


VNXe 3150 highlights


The VNXe 3150 was announced at EMC World. Here are some highlights:

  • 2U, 25-drive array with 2.5″ drives
  • Max 100 drives
  • Supports flash and 3TB NL-SAS drives
  • 10Gb I/O modules available
  • Quad-core processor

The VNXe 3150 is expected to ship in the second half of 2012.


VNXe document updates


Along with the operating environment version 2.2 upgrade, several documents were added or updated on the EMC Support page. The documents can be found under Support by Product – VNXe Series – Documentation. Here are links to some of the documents:

VNXe Unisphere CLI User Guide

Using a VNXe System with VMware

Using a VNXe System with Microsoft Exchange

Using a VNXe System with Generic iSCSI Storage

Using a VNXe System with Microsoft Windows Hyper-V

Using an EMC VNXe System with CIFS Shared Folders

Using an EMC VNXe System with NFS Shared Folders

VNXe Security Configuration Guide

A couple of previously published useful documents:

White Paper: EMC VNXe High Availability

VNXe Service Commands

Check out the EMC Support page for other updated documents.


VNXe MR2 and Unisphere Remote


EMC released a new Operating Environment (MR2 – 2.2.0.17142) for the VNXe on March 16. At the same time a new product called Unisphere Remote was released. I have already upgraded several VNXes to the latest OE and also installed the Unisphere Remote appliance, so I decided to do a quick review of both.

New features and fixed bugs

It’s been about three months since I wrote about my hands-on experience with the VNXe 3100. In that post I covered some findings I made during the implementation. One big issue I found was that in certain circumstances datastores would be disconnected when changing the VNXe MTU settings. I reported this to EMC and was pleased to see in the MR2 release notes that this issue had been fixed. So about two months after I reported the bug, a new OE was released with the fix.

There are several other bugs fixed in the latest OE, but there are also some very interesting new features:

  • Unisphere Remote to manage multiple VNXes from a single management console
  • EMC Secure Remote Support (ESRS)
  • Enhanced iSCSI replication workflow
  • Extended VNXe Unisphere CLI functionality: Support for Hyper-V storage resources and CIFS without AD integration.
  • VNXe and VMware Management features: VAAI for NFS and more NFS features for vCenter.

Upgrading to the new OE

Upgrading the VNXe to the latest OE follows the same procedure that I have described in my earlier posts, and the upgrade takes about an hour. During the upgrade I could see high latency on the datastores when one SP was rebooted and its datastores moved over to the other SP. That is good to keep in mind when planning the upgrade.

The first place where the new OE can be spotted after the upgrade is the login page: the Unisphere version has changed from 1.6 to 1.7.

When logging in for the first time after the upgrade, a post-upgrade configuration wizard is shown to help configure the new features. I skipped the wizard and started exploring the UI.

The first new feature can be found under Settings > More configurations.

Also related to ESRS, some new System Information fields have been added under Settings > Management Settings, where detailed information about the device can be entered to help EMC identify the device and contact the administrator.

Unisphere Remote

MR2 also changed the Settings > Management Settings page: a new Network tab has been added, and the Unisphere Remote configuration, another of the new MR2 features, can now be found there.

To start using Unisphere Remote, a virtual appliance is also needed; it can be downloaded from the EMC support site. After the virtual appliance is deployed, Unisphere Remote is ready to be used.

It looks and feels just like a VNXe and is very easy to get around in. The only thing missing now is the VNXes themselves. I already showed the Unisphere Remote configuration on the VNXe side, and the information needed for the VNXe can be found under Settings on Unisphere Remote.

Adding the IP address, Server Hash and Challenge Phrase on the VNXe is the only configuration required on the VNXe side. A couple of minutes after adding these settings the VNXe will be visible in Unisphere Remote.

The Dashboard page is fully customizable by moving the current widgets or adding new ones. New tabs can also be added and customized by adding widgets to them.

By default only five devices are shown on the Dashboard, but that can be changed from the widget settings to 10 or 20, depending on the widget.

So what is the benefit of Unisphere Remote? It basically gives a quick overview of your VNXes, showing the most/least utilized VNXe by CPU and also by capacity, depending on how the dashboard is configured. Unisphere Remote only gathers data that can be viewed from a single UI; the devices connected to it cannot be managed from the Remote. However, there are links to open an individual VNXe’s Unisphere from the Remote. So with a proper LDAP configuration, VNXe management through Unisphere Remote becomes seamless, which is a huge benefit when managing tens or hundreds of VNXes.

Conclusions

After running the latest OE in production for a couple of days now, I’m very happy with it, even though I haven’t yet run tests as intensive as I have with the other versions. EMC seems to have put a lot of effort into this version. They have fixed some major issues and also added some great new features.

Of the new features I listed above, I’ve mostly been concentrating on Unisphere Remote, and after using it for only a couple of days I can already see its benefit. Just by looking at the description of ESRS I can see the benefit of that as well.


VNXe 3100 performance


Earlier this year I installed a VNXe 3100 and have now done some testing with it. I have already covered VNXe 3300 performance in a couple of my previous posts: Hands-on with VNXe 3300 Part 6: Performance and VNXe 3300 performance follow up (EFDs and RR settings). The 3100 has fewer disks than the 3300, less memory and only two I/O ports. So I wanted to see how the 3100 would perform compared to the 3300. I ran the same Iometer tests that I ran on the 3300, and in this post I will compare the results to the ones I presented in the previous posts. The environment is a bit different, so I will quickly describe it before presenting the results.

Test environment

  • EMC VNXe 3100 (21 x 600GB SAS drives)
  • Dell PE 2900 server
  • HP ProCurve 2510G
  • Two 1Gb iSCSI NICs
  • ESXi 4.1U1 / ESXi 5.0
  • Virtual Win 2008 R2 (1vCPU and 4GB memory)

Test results

I ran the tests on both ESXi 4.1 and ESXi 5.0, but the iSCSI results were very similar so I used the average of both. The NFS results had some differences, so I will present the results for 4 and 5 separately. I also ran the tests with and without LAG, and with the default RR settings changed. The VNXe was configured with one 20-disk pool and a 100GB datastore provisioned to the ESXi servers. The tests were run on a 20GB virtual disk on the 100GB datastore.

[update] My main focus in these tests has been on iSCSI because that is what we are planning to use. I only ran quick tests with generic NFS and not with the NFS that is configured under Storage – VMware. After Paul’s comment I ran a couple of tests on the “VMware NFS” and added “ESXi 4 VMware NFS” to the test results:

Conclusions

With default settings the performance of the 3300 and the 3100 is fairly similar. The 3300 gives better throughput when the IO operation limit is changed from the default 1000 to 1. The differences in the physical configurations might also have an effect on this. With a random workload the performance is similar even when the default settings are changed. Of course the real difference would be seen when both were under heavy load; during the tests only the test server was running on the VNXes.

For NFS I didn’t have comparable results from the 3300. I ran different tests on the 3300 and those results weren’t good either. The odd thing is that ESXi 4 and ESXi 5 gave quite different results when running the tests on NFS.

Looking at these and the previous results, I would still stick with iSCSI on the VNXe. As for the performance of the 3100, it is surprisingly close to its bigger sibling, the 3300.

[update] Looking at the new test results, NFS performs as well as iSCSI. With the modified RR settings iSCSI gets better maximum throughput, but then again with random workloads NFS seems to perform better. So the type of NFS storage provisioned to the ESX hosts makes a difference. Now comes the question: NFS or iSCSI? Performance-wise either one is a good choice. But which one suits your environment better?

Disclaimer

These results reflect the performance of the environment that the tests were run in. Results may vary depending on the hardware and how the environment is configured.


Ask The Expert wrap up


It has now been almost two weeks since the EMC Ask the Expert: VNXe front-end Networks with VMware event ended. We had a couple of meetings beforehand where we discussed and planned the event, but we really didn’t know what to expect from it. Matt and I were committed to answering the questions during the two weeks, so it was a bit different from a normal community thread. Looking at the number of views the discussion got, we know that it was a success: during the two weeks that the event was active we had more than 2300 views on the page, and several people asked us questions and opinions. As a summary, Matt and I wrote a document that covers the main concerns discussed during the event. In this document we look into the VNXe HA configurations and link aggregation, and also do a quick overview of the ESX-side configurations:

Ask the Expert Wrap ups – for the community, by the community

I was really excited when I was asked to participate in a great event like this. Thank you Mark, Matt and Sean, it was great working with you guys!


EMC Ask The Expert


You may have already visited the EMC Support Community Ask The Expert Forum page or read posts about it by Matthew Brender, Mark Browne or Sean Thulin. The EMC Ask The Expert Series is basically an engagement between customers, partners and EMC employees, or whoever wants to participate. The series consists of several topics, and there are also several ways to take part (e.g. online webinar, forum conversation).

Like Matt, Mark and Sean have already mentioned in their posts, the first Ask The Expert event started on January 16 and is running till January 26. The first event is about VNXe network configurations and troubleshooting. Matthew and I have already been answering questions for a bit over a week and will continue until the end of this week. Just as I was writing this post we passed 1500 views on the topic.

How is this different from any other EMC support forum topic?

Both Matt and I are committed to monitoring and answering this Ask The Expert topic for a period of two weeks. We both get email alerts whenever someone posts on the topic, and we will try to answer the questions during the same day. Matt will be answering as an EMC employee and I will be answering as a customer.

The topic is about VNXe networking, but that doesn’t mean you can’t ask questions about other VNXe subjects. The topic is scoped to keep the thread fairly short; if questions other than networking ones are raised, we will start a new topic on the forum and continue the conversation in that thread.

There are still four full days to take advantage of my and Matt’s knowledge about the VNXe. The event ends on Friday, but that doesn’t mean we won’t answer any VNXe related questions in the forums anymore. It means that after Friday you might not get your questions answered as quickly as you would during this event, while both of us are committed to interacting with this topic.

I would encourage anyone to ask questions or raise concerns about any VNXe topic on the EMC support forums. If you don’t have an ECN (EMC Community Network) account, I would recommend creating one and interacting if you are working with EMC products. If you are an EMC customer and have a Powerlink account, you can log in to ECN using that account.

If you have a question about VNXe and for some reason don’t want to post it on the ECN forum, just leave a comment on this post and I will raise the question in the Ask The Expert thread. We are also monitoring the #EMCAskTheExpert tag on Twitter and will pick questions from there too.


Changing MTU on VNXe disconnects datastores from ESXi


While testing the VNXe 3100 (OE 2.1.3.16008) I found a problem when changing the MTU settings for a link aggregate. With a specific combination of configurations, changing the MTU causes the ESXi host (4.1.0, 502767) to lose all iSCSI datastores, and even after changing the settings back the datastores are still not visible on the ESXi. The VNXe also can’t provision new datastores to the ESXi while this problem is occurring. There are a couple of workarounds for this, but no official fix is available to avoid this kind of situation.

How did I find it?

After the initial configuration I created a link aggregate from two ports, set the MTU to 9000 and also created one iSCSI server on SP A with two IP addresses. I then configured the ESXi to use MTU 9000 as well. Datastore creation on the VNXe side went through successfully, but on the ESXi side I could see an error that the VMFS volume couldn’t be created.

I could see the LUN under the iSCSI adapter, but manually creating a VMFS datastore also failed. I then realized that I hadn’t configured jumbo frames on the switch and decided to change the ESXi and VNXe MTUs back to 1500. After I changed the VNXe MTU, the LUN disappeared from the ESXi. Manually changing the datastore access settings from the VNXe didn’t help either; I just couldn’t get the ESXi to see the LUN anymore. I then tried to provision a new datastore to the ESX but got this error:

Ok, so I deleted the datastore and the iSCSI server, recreated the iSCSI server and provisioned a new datastore for the ESXi without any problems. I had a suspicion that the MTU change had caused the problem and tried it again. I changed the link aggregate MTU on the VNXe from 1500 to 9000, and after that was done the datastore disappeared from the ESXi. Changing the MTU back to 1500 didn’t help; the datastore and LUN were not visible on the ESX. Creating a new datastore also gave the same error as before: the datastore was created on the VNXe but was not accessible from the ESXi. Deleting and recreating the datastores and iSCSI servers resolved the issue again.

What is the cause of this problem?

So it seemed that the MTU change was causing the problem. I started testing different scenarios and found out that the problem was the combination of the MTU change and the iSCSI server having two IP addresses. Here are some scenarios that I tested (sorry about the rough grammar, I tried to keep the descriptions short):

Link aggregation MTU 1500 and iSCSI server with two IP addresses. Provisioned storage works on ESXi. Changing the VNXe link aggregation MTU to 9000 makes ESXi lose the connection to the datastore. Changing the VNXe MTU back to 1500 doesn’t help; ESXi still can’t see the datastore. Trying to provision a new datastore to ESXi results in an error. Removing the other IP address doesn’t resolve the problem.

Link aggregation MTU 1500 and iSCSI server with two IP addresses. Provisioned storage works on ESXi. Removing the other IP from the iSCSI server and changing the MTU to 9000: the datastore is still visible and accessible from the ESXi side. Changing the MTU back to 1500: the datastore is still visible and accessible. Datastore provisioning to ESXi is successful. After adding a second IP address back to the iSCSI server, ESXi loses the connection to the datastore. Provisioning a new datastore to ESXi results in an error. Removing the other IP address again doesn’t resolve the problem.

Link aggregation MTU 1500 and iSCSI server with one IP address. Provisioned storage works on ESX. Changing the MTU to 9000: the datastore is still visible and accessible from the ESXi side. Changing the MTU back to 1500: the datastore is still visible and accessible. Datastore provisioning to ESXi is successful. After adding another IP address to the iSCSI server, ESX loses the connection to the datastore. Provisioning a new datastore to ESXi results in an error. Removing the other IP doesn’t resolve the problem.

Link aggregation MTU 1500 and two iSCSI servers on one SP, both configured with one IP. One datastore on each iSCSI server (there is also an issue getting the datastore on the second iSCSI server provisioned, see my previous post). After adding a second IP to the first iSCSI server, both datastores are still accessible from ESXi. When changing the MTU to 9000, ESX loses the connection to both datastores. Changing the MTU back to 1500 doesn’t help; both datastores are still not visible on ESXi. I also get the same error as previously when trying to provision new storage.

I also tested different combinations with iSCSI servers on different SPs: if the SP A iSCSI server has two IP addresses and the SP B iSCSI server has only one IP and the MTU is changed, the datastores on the SP B iSCSI server are not affected.

How to fix this?

Currently there is no official fix for this. I have reported the problem to EMC support, demonstrated the issue to an EMC support technician and uploaded all the logs, so they are working on finding the root cause.

Apparently, when an iSCSI server has two IP addresses and the MTU is changed, the iSCSI server goes into some kind of “lockdown” mode and doesn’t allow any connections to be initiated. As I already described, the VNXe can be returned to an operational state by removing all datastores and iSCSI servers and recreating them. Of course this is not an option when there is production data on the datastores.

An EMC support technician showed me a quicker and less radical workaround to get the array back to an operational state: restarting the iSCSI service on the VNXe. CAUTION: Restarting the iSCSI service will disconnect all provisioned datastores from the hosts. Connections to the datastores will be re-established after the iSCSI service is restarted, but this will cause all running VMs to crash.

The easiest way to restart the iSCSI service is to enable the iSNS server from the iSCSI server settings, give it an IP address and apply the changes. After the changes are applied, the iSNS server can be disabled. This triggers the iSCSI service to restart, and all datastores that were disconnected are again visible and usable on the ESXi.

Conclusions

After this finding I would suggest not configuring iSCSI servers with two IP addresses. If an MTU change can do this much damage, what about other changes?

If you have iSCSI servers with two IP addresses, I would advise not to change the MTU even during a planned service break. If for some reason the change is mandatory, contact EMC support before doing it. If you have arrays affected by this issue, I would encourage you to contact EMC support before trying to restart the iSCSI service.

Once again I have to give credit to EMC support. They have some great people working there.

