Tag Archives: EMC

VNXe MR2 and Unisphere Remote


EMC released a new Operating Environment (MR2 – 2.2.0.17142) for the VNXe on March 16. At the same time a new product called Unisphere Remote was released. I have already upgraded several VNXes to the latest OE, installed the Unisphere Remote appliance, and decided to do a quick review of both.

New features and fixed bugs

It’s about three months since I wrote about my hands-on experience with the VNXe 3100. In that post I covered some issues that I found during the implementation. One big issue was that in certain circumstances datastores would be disconnected when changing the VNXe MTU settings. I reported this to EMC and was pleased to see in the MR2 release notes that this issue had been fixed. So about two months after I reported the bug, a new OE was released with the fix.

There are several other bugs fixed on the latest OE but there are also some very interesting new features:

  • Unisphere Remote to manage multiple VNXes from single management console
  • EMC Secure Remote Support (ESRS)
  • Enhanced iSCSI replication workflow
  • Extended VNXe Unisphere CLI functionality: Support for Hyper-V storage resources and CIFS without AD integration.
  • VNXe and VMware Management features: VAAI for NFS and more NFS features for vCenter.

Upgrading to the new OE

Upgrading the VNXe to the latest OE follows the same process that I have described in my earlier posts and takes about an hour. During the upgrade I could see high latency on the datastores when an SP was rebooted and its datastores moved over to the other SP. Good to keep that in mind when planning the upgrade.

The first place where the new OE can be spotted after the upgrade is the login page: the Unisphere version has changed from 1.6 to 1.7.

When logging in for the first time after the upgrade, a post-upgrade configuration wizard is shown that helps configure the new features. I skipped the wizard and started exploring the UI.

The first new feature can be found under Settings > More configurations.

Also relating to ESRS, some new System Information fields have been added under Settings > Management Settings where detailed information about the device can be entered to help EMC identify the device and contact the administrator.

Unisphere Remote

MR2 also changed the Settings > Management Settings page: a new Network tab has been added where the Unisphere Remote configuration can now be found. This is another of the new features in the latest MR2.

To start using Unisphere Remote a virtual appliance is also needed. It can be downloaded from the EMC support site. After the virtual appliance is deployed, Unisphere Remote is ready to be used.
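If you prefer to deploy the appliance from PowerCLI instead of the vSphere client, something along these lines should work. This is only a sketch; the OVF path, names and datastore below are placeholders rather than the actual EMC file names:

# Deploy the Unisphere Remote appliance from an OVF (paths and names are examples)
$vmhost = Get-VMHost -Name "esxi01.example.local"
$datastore = Get-Datastore -Name "datastore1"
Import-VApp -Source "C:\Downloads\UnisphereRemote.ovf" -Name "UnisphereRemote" -VMHost $vmhost -Datastore $datastore

After the import, power on the VM and give it its network settings as usual.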

It looks and feels just like a VNXe and it’s very easy to get around in. The only thing missing now are the VNXes. I already introduced the configuration on the VNXe side, and the information it needs can be found under Settings on Unisphere Remote.

Adding the IP address, Server Hash and Challenge Phrase on the VNXe is the only configuration required on the VNXe side. A couple of minutes after adding these settings the VNXe will be visible on Unisphere Remote.

The Dashboard page is fully customizable by moving the current widgets or adding new widgets. New tabs can also be added and customized by adding widgets to them.

By default only five devices are shown on the Dashboard, but that can be changed in the widget settings to 10 or 20, depending on the widget.

What is the benefit of Unisphere Remote then? It basically gives a quick overview of your VNXes, showing the most/least utilized VNXe by CPU and by capacity, depending on how the dashboard is configured. Unisphere Remote only gathers data to be viewed from a single UI; the devices connected to it cannot be managed from the Remote. However, there are links to open an individual VNXe’s Unisphere from the Remote. So with a proper LDAP configuration, VNXe management through Unisphere Remote will be seamless, which is a huge benefit when managing tens or hundreds of VNXes.

Conclusions

After running the latest OE in production for a couple of days I’m very happy with it, even though I haven’t yet run tests as intensive as I have with the other versions. EMC seems to have put a lot of effort into this version. They have fixed some major issues and also added some great new features.

Of the new features listed above I’ve mostly been concentrating on Unisphere Remote, and after using it for only a couple of days I can already see its benefit. Just by looking at the description of ESRS I can see the benefit of that as well.


VNXe 3100 performance


Earlier this year I installed a VNXe 3100 and have now done some testing with it. I have already covered VNXe 3300 performance in a couple of previous posts: Hands-on with VNXe 3300 Part 6: Performance and VNXe 3300 performance follow up (EFDs and RR settings). The 3100 has fewer disks than the 3300, less memory, and only two I/O ports, so I wanted to see how the 3100 would perform compared to the 3300. I ran the same Iometer tests that I ran on the 3300. In this post I will compare those results to the ones that I introduced in the previous posts. The environment is a bit different so I will quickly describe it before presenting the results.

Test environment

  • EMC VNXe 3100 (21 x 600GB SAS drives)
  • Dell PE 2900 server
  • HP ProCurve 2510G
  • Two 1Gb iSCSI NICs
  • ESXi 4.1U1 / ESXi 5.0
  • Virtual Win 2008 R2 (1vCPU and 4GB memory)

Test results

I ran the tests on both ESXi 4.1 and ESXi 5.0 but the iSCSI results were very similar so I used the average of both. NFS results had some differences so I will present the results for both 4 and 5 separately. I also did the tests with and without LAG and also when changing the default RR settings. VNXe was configured with one 20 disk pool with 100GB datastore provisioned to ESXi servers. The tests were run on 20GB virtual disk on the 100GB datastore.

[update] My main focus in these tests has been on iSCSI because that is what we are planning to use. I only ran quick tests with the generic NFS and not with the one that is configured under Storage – VMware. After Paul’s comment I ran a couple of tests on the “VMware NFS” and added “ESXi 4 VMware NFS” to the test results:

Conclusions

With default settings the performance of the 3300 and the 3100 is fairly similar. The 3300 gives better throughput when the IO operation limit is changed from the default 1000 to 1. The differences in the physical configurations might also have an effect on this. With random workloads the performance is similar even when the default settings are changed. Of course the real difference would be seen when both were under heavy load; during the tests only the test server was running on the VNXes.

For NFS I didn’t have comparable results from the 3300. I ran different tests on the 3300 and those results weren’t good either. The odd thing is that ESXi 4 and ESXi 5 gave quite different results when running the tests on NFS.

Looking at these and the previous results I would still stick with iSCSI on the VNXe. As for the performance of the 3100, it is surprisingly close to that of its bigger sibling, the 3300.

[update] Looking at the new test results, NFS performs as well as iSCSI. With the modified RR settings iSCSI gets better maximum throughput, but then again with random workloads NFS seems to perform better. So the type of NFS storage provisioned to the ESX hosts makes a difference. Now comes the question: NFS or iSCSI? Performance-wise either one is a good choice. But which one suits your environment better?

Disclaimer

These results reflect the performance of the environment that the tests were run in. Results may vary depending on the hardware and how the environment is configured.


Ask The Expert wrap up


It has now been almost two weeks since the EMC Ask the Expert: VNXe front-end Networks with VMware event ended. We had a couple of meetings beforehand where we discussed and planned the event, but we really didn’t know what to expect from it. Matt and I were committed to answering the questions during the two weeks, so it was a bit different from a normal community thread. Looking at the number of views the discussion got, we know that it was a success. During the two weeks that the event was active we had more than 2300 views on the page. We had several people asking questions and opinions from us. As a summary, Matt and I wrote a document that covers the main concerns discussed during the event. In the document we look into the VNXe HA configurations and link aggregation, and also do a quick overview of the ESX side configurations:

Ask the Expert Wrap ups – for the community, by the community

I was really excited when I was asked to participate in a great event like this. Thank you Mark, Matt and Sean, it was great working with you guys!


EMC Ask The Expert


You may have already visited the EMC Support Community Ask The Expert Forum page or read posts about it by Matthew Brender, Mark Browne or Sean Thulin. The EMC Ask The Expert Series is basically an engagement between customers, partners, EMC employees, and whoever else wants to participate. The series consists of several topics and there are several ways to take part (e.g. online webinar, forum conversation).

Like Matt, Mark and Sean have already mentioned in their posts, the first Ask The Expert event started on January 16 and runs until January 26. The first event is about VNXe network configurations and troubleshooting. Matthew and I have already been answering questions for a bit over a week and will continue until the end of this week. Just as I was writing this post we passed 1500 views on the topic.

How is this different from any other EMC support forum topic?

Both Matt and I are committed to monitor and answer this Ask The Expert topic for the period of two weeks. We will both get email alerts whenever someone posts on the topic and we will try to answer the questions during the same day. Matt will be answering as an EMC employee and I will be answering as a customer.

The topic is about VNXe networking, but that doesn’t mean you can’t ask questions about other VNXe subjects. The topic is scoped this way to keep the thread fairly short. If non-networking questions are raised we will start a new topic on the forum and continue the conversation in that thread.

There are still four full days to take advantage of my and Matt’s knowledge about the VNXe. The event ends on Friday, but that doesn’t mean we will stop answering VNXe-related questions in the forums. It just means that after Friday you might not get your questions answered as quickly as during the event, while both of us are committed to interacting with the topic.

I would encourage anyone to ask questions or raise concerns about any VNXe topic on the EMC support forums. If you don’t have an ECN (EMC Community Network) account I would recommend creating one and interacting if you work with EMC products. If you are an EMC customer and have a Powerlink account you can log in to ECN using that account:

If you have a question about VNXe and for some reason don’t want to post it on the ECN forum just leave a comment on this post and I will address the question on Ask The Expert thread. We are also monitoring #EMCAskTheExpert tag on Twitter and will pick questions from there too.


Changing MTU on VNXe disconnects datastores from ESXi


While testing the VNXe 3100 (OE 2.1.3.16008) I found a problem when changing the MTU settings for a link aggregate. With a specific combination of configurations, changing the MTU causes the ESXi (4.1.0, 502767) to lose all iSCSI datastores, and even after changing the settings back the datastores are still not visible on ESXi. The VNXe also can’t provision new datastores to ESXi while this problem is occurring. There are a couple of workarounds, but no official fix is available yet to avoid this kind of situation.

How did I find it?

After the initial configuration I created a link aggregate from two ports, set the MTU to 9000 and also created one iSCSI server on SP A with two IP addresses. I then configured the ESXi to use MTU 9000 as well. Datastore creation on the VNXe side went through successfully, but on the ESXi side I could see an error that the VMFS volume couldn’t be created.

I could see the LUN under the iSCSI adapter, but manually creating a VMFS datastore also failed. I then realized that I hadn’t configured jumbo frames on the switch and decided to change the ESXi and VNXe MTUs back to 1500. After I changed the VNXe MTU the LUN disappeared from ESXi. Manually changing the datastore access settings on the VNXe didn’t help either. I just couldn’t get the ESXi to see the LUN anymore. I then tried to provision a new datastore to the ESX but got this error:

Ok, so I deleted the datastore and the iSCSI server, then recreated the iSCSI server and provisioned a new datastore for the ESXi without any problems. I had a suspicion that the MTU change caused the problem and tried it again. I changed the link aggregation MTU on the VNXe from 1500 to 9000 and after that was done the datastore disappeared from ESXi. Changing the MTU back to 1500 didn’t help; the datastore and LUN were not visible on ESX. Creating a new datastore also gave the same error as before: the datastore was created on the VNXe but was not accessible from ESXi. Deleting and recreating the datastores and iSCSI servers resolved the issue again.
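As a side note, if you are repeating this kind of check yourself, a rescan and LUN listing from PowerCLI saves a lot of clicking in the vSphere client. A minimal sketch, assuming a vCenter and host of your own (the names below are placeholders) and that the VNXe LUNs report “EMC” as the vendor:

Connect-VIServer -Server vcenter.example.local
$vmhost = Get-VMHost -Name "esxi01.example.local"
# Rescan all HBAs and VMFS volumes, then list the EMC LUNs the host can see
Get-VMHostStorage -VMHost $vmhost -RescanAllHba -RescanVmfs | Out-Null
Get-ScsiLun -VmHost $vmhost -LunType disk |
  Where-Object { $_.Vendor -match "EMC" } |
  Select-Object CanonicalName, CapacityMB, MultipathPolicy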

What is the cause of this problem?

So it seemed that the MTU change was causing the problem. I started testing with different scenarios and found out that the problem was the combination of the MTU change and also the iSCSI server having two IP addresses. Here are some scenarios that I tested (sorry about the rough grammar, tried to keep the descriptions short):

Link aggregation MTU 1500 and iSCSI server with two IP addresses. Provisioned storage works on ESXi. Changing VNXe link aggregation MTU to 9000 and ESXi loses connection to the datastore. Changing VNXe MTU back to 1500 and ESXi still can’t see the datastore. Trying to provision a new datastore to ESXi results in an error. Removing the other IP address doesn’t resolve the problem.

Link aggregation MTU 1500 and iSCSI server with two IP addresses. Provisioned storage works on ESXi. Removing the other IP from the iSCSI server and changing MTU to 9000. Datastore is still visible and accessible from the ESXi side. Changing MTU back to 1500 and the datastore is still visible and accessible from ESXi. Datastore provisioning to ESXi is successful. After adding another IP address to the iSCSI server, ESX loses the connection to the datastore. Provisioning a new datastore to ESXi results in an error. Removing the other IP address doesn’t resolve the problem.

Link aggregation MTU 1500 and iSCSI server with one IP address. Provisioned storage works on ESX. Change MTU to 9000. Datastore is still visible and accessible from the ESXi side. Changing MTU back to 1500 and the datastore is still visible and accessible from ESXi. Datastore provisioning to ESXi is successful. After adding another IP address to the iSCSI server, ESX loses the connection to the datastore. Provisioning a new datastore to ESXi results in an error. Removing the other IP doesn’t resolve the problem.

Link aggregation MTU 1500 and two iSCSI servers on one SP both configured with one IP. One datastore on both iSCSI servers (there is also an issue getting the datastore on the other iSCSI server provisioned, see my previous post). Adding a second IP for the first iSCSI server and both datastores are still accessible from ESXi. When changing MTU to 9000 ESX loses connection to both datastores. Changing MTU back to 1500 and both datastores are still not visible on ESXi. Also getting the same error as previously when trying to provision new storage.

I also tested different combinations with iSCSI servers on different SPs: if the SP A iSCSI server has two IP addresses and the SP B iSCSI server has only one IP, changing the MTU does not affect the datastores on the SP B iSCSI server.

How to fix this?

Currently there is no official fix for this. I have reported the problem to EMC support, demonstrated the issue to an EMC support technician and uploaded all the logs, so they are working on finding the root cause.

Apparently when an iSCSI server has two IP addresses and the MTU is changed, the iSCSI server goes into some kind of “lockdown” mode and doesn’t allow any connections to be initiated. As I already described, the VNXe can be returned to an operational state by removing all datastores and iSCSI servers and recreating them. Of course this is not an option when there is production data on the datastores.

An EMC support technician showed me a quicker and less radical workaround to get the array back to an operational state: restarting the iSCSI service on the VNXe. CAUTION: Restarting the iSCSI service will disconnect all provisioned datastores from the hosts. The connection to the datastores will be re-established after the iSCSI service has restarted, but this will cause all running VMs to crash.

The easiest way to restart the iSCSI service is to enable the iSNS server in the iSCSI server settings, give it an IP address and apply the changes. After the changes are applied the iSNS server can be disabled again. This triggers the iSCSI service to restart, and all datastores that were disconnected become visible and usable on ESXi again.

Conclusions

After this finding I would suggest not configuring iSCSI servers with two IP addresses. If an MTU change can do this much damage, what about other changes?

If you have iSCSI servers configured with two IP addresses I would advise against changing the MTU, even during a planned service break. If for some reason the change is mandatory, contact EMC support before doing it. If you have arrays affected by this issue I would encourage you to contact EMC support before trying to restart the iSCSI service.

Once again I have to give credit to EMC support. They have some great people working there.


Hands-on with VNXe 3100


On the Friday before Christmas we ordered a VNXe 3100 with 21 x 600GB SAS drives and it was delivered in less than two weeks. Exactly two weeks after the order was placed we had the first virtual machine running on it.

Last year I wrote a seven-post series about my hands-on experience with the VNXe 3300. This is the first VNXe 3100 that I’m working with and also my first VNXe unboxing. With the 3300s I relied on my colleagues to do all the physical installation because I was 5000 miles away from the datacenter. With this one I did everything by myself, from unboxing to installing the first VM.

My previous posts are still valid, so in this post I’ll concentrate on the differences between the 3300 and the 3100. Will Huber has really good posts on unboxing and configuring the VNXe 3100. During the installation I also found a couple of problems with the latest OE (2.1.3.16008). I will describe one of the issues in this post; the bigger one gets a separate post.

I will also do a follow up post about the performance differences between 3300 and 3100 when I get all the tests done. I’m also planning to do some tests with FusionIO IOturbine and I will post those results when I get the card and the tests done.

Initial setup

The VNXe and the additional DAE came in two pretty heavy boxes. Which box to open first? Well, the box that you need to open first tells you so:

So like a kid on Christmas day I opened the boxes, and the first thing I saw was a big poster explaining the installation procedure. The rack rails are quick and easy to install. The arrays are quite heavy, but I managed to lift them onto the rack by myself.

After doing all the cabling it was time to power on the VNXe. Before doing this you need to decide how you are going to do the initial configuration (assigning the management IP). In my previous post I mentioned that there are two options for doing it using the VNXe ConnectionUtility: auto discovery or manual configuration. With the manual configuration the VNXe ConnectionUtility basically creates a text file on a USB stick that is inserted into the VNXe before the first boot. A faster way is to skip the download and installation of the 57 MB package and create the file manually. So get a USB stick, create an IW_CONF.txt file on it, and add the following content, replacing the bracketed variables with your own values:

TYPE=CONFIGURE
PROTO_VERSION=1
FRIENDLYNAME=[VNXENAME]
MGMTADDRESSA1=[IP ADDRESS]
MGMTMASK1=[NETMASK]
GATEWAY=[GATEWAY]
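For illustration, a filled-in file could look like the following; the name and addresses are just example values, so substitute your own:

TYPE=CONFIGURE
PROTO_VERSION=1
FRIENDLYNAME=vnxe3100-01
MGMTADDRESSA1=192.168.1.50
MGMTMASK1=255.255.255.0
GATEWAY=192.168.1.1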

After that just insert the USB into the VNXe and power it on. The whole process of unboxing, cabling and powering on took me about one and a half hours.

While the VNXe was starting up I downloaded the latest Operating Environment (2.1.3.16008) so that I was ready to run the upgrade once the system was up and running. After the first login the ‘Unisphere Configuration Wizard’ shows up and you need to go through several steps. I skipped some of those (creating an iSCSI server, creating a storage pool, licensing) and started the upgrade process (see my previous post).

After the upgrade was done I logged back in and saw the prompt about the license. I clicked the “obtain license” button, a new browser window opened, and by following the instructions I got the license file. I’ve heard complaints about IE and the licensing page not working; the issue might be the browser’s popup blocker. The license page also states that the popup blocker should be disabled.

After this, a quick LACP trunk configuration on the switch and the iSCSI configuration on the ESXi side were all that was left before I was ready to provision some storage and do some testing.
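For reference, the ESXi side of the jumbo frame configuration can also be done with a couple of PowerCLI commands. This is only a rough sketch under the assumption that the iSCSI VMkernel ports live on a standard vSwitch called vSwitch1 and on port groups whose names start with iSCSI; the host name is a placeholder, jumbo frames must of course also be enabled on the physical switch, and on older ESX/ESXi versions the VMkernel port may need to be recreated to change its MTU:

$vmhost = Get-VMHost -Name "esxi01.example.local"
# Raise the MTU of the iSCSI vSwitch to 9000
Get-VirtualSwitch -VMHost $vmhost -Name "vSwitch1" |
  Set-VirtualSwitch -Mtu 9000 -Confirm:$false
# Raise the MTU of the iSCSI VMkernel ports as well
Get-VMHostNetworkAdapter -VMHost $vmhost -VMKernel |
  Where-Object { $_.PortGroupName -like "iSCSI*" } |
  Set-VMHostNetworkAdapter -Mtu 9000 -Confirm:$false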

Issues that I found out

During the testing I found an issue with the MTU settings when using iSCSI: a problem that causes datastores to be disconnected from ESXi. Even after reverting to the original MTU settings the datastores can’t be reconnected to ESXi and new datastores can’t be created. I will describe this in a separate post.

The other issue that I found was more cosmetic. When there are two iSCSI servers on the same SP and the first VMware storage is provisioned on the second iSCSI server, Unisphere gives the following error:

For some reason the VNXe can’t initiate the VMFS datastore creation process on ESXi. But the LUN is still visible on ESXi and the VMFS datastore can be created manually. So it’s not a big issue, but still annoying.

Conclusions

There seem to be small improvements in the latest Operating Environment (2.1.3.16008). Provisioning storage to an ESX server feels a bit faster, and the VMFS datastore is now created every time when there is only one iSCSI server on the SP. In the previous OE the VMFS datastore was created only about 50% of the time when storage was provisioned to ESXi.

In previous posts I have mentioned how easy and simple the VNXe is to configure, and the 3100 is no different from the 3300 in that respect. Overall the VNXe 3100 seems to be a really good product considering its fairly low price. A quick look at the performance tests shows quite similar results to the ones that I got from the 3300. I will do a separate post about the performance comparison of these two VNXes.

It is good to keep in mind, though, the difference between marketing material and reality. The VNXe 3300 is advertised as having 12GB of memory per SP, but in reality it has only 256MB of read cache and 1GB of write cache. The 3100 is advertised as having 8GB of memory, but it has only 128MB of read cache and 768MB of write cache.


VNXe 3300 performance follow up (EFDs and RR settings)


In my previous post about VNXe 3300 performance I presented the results from the performance tests I had done with the VNXe 3300. I will use those results as a comparison for the new tests that I ran recently. In this post I will compare the performance with different Round Robin policy settings. I also had a chance to test the performance of EFD disks on the VNXe.

Round Robin settings

In the previous post all the tests were run with the default RR settings, which means that ESX sends 1000 commands down one path before changing the path. I observed that with the default RR settings I was only getting the bandwidth of one link on the four-port LACP trunk. I got some feedback from Ken advising me to change the default RR IO operation limit setting from 1000 to 1 to get two links’ worth of bandwidth from the VNXe. So I wanted to test what kind of effect this change would have on performance.

Arnim van Lieshout has a really good post about configuring RR using PowerCLI and I used his examples for changing the IO operation limit from 1000 to 1. If you are not confident running the full PowerCLI scripts Arnim introduced in his post, here is how the RR settings for an individual device can be changed using the GUI and a couple of simple PowerCLI commands:

1. Change the datastore path selection policy to RR (in the vSphere client: select the host – Configuration – Storage Adapters – iSCSI sw adapter – right-click the device and select Manage Paths – for Path Selection select Round Robin (VMware) and click Change).

2. Open PowerCLI and connect to the server

Connect-VIServer -Server [servername]

3. Retrieve esxcli instance

$esxcli = Get-EsxCli

4. Change the device IO Operation Limit to 1 and set the Limit Type to Iops. The [deviceidentifier] can be found in the vSphere client’s iSCSI sw adapter view and is in the format naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.

$esxcli.nmp.roundrobin.setconfig($null,"[deviceidentifier]",1,"iops",$null)

5. Check that the changes were completed.

$esxcli.nmp.roundrobin.getconfig("[deviceidentifier]")
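Putting the steps above together, the same change can be applied to every EMC device on a host in one go. The sketch below is just an example along those lines; the vCenter and host names are placeholders, the vendor filter assumes the VNXe LUNs report “EMC” as the vendor, and per the update further down in this post a limit of 5-10 may be worth trying instead of 1:

Connect-VIServer -Server vcenter.example.local
$vmhost = Get-VMHost -Name "esxi01.example.local"
$esxcli = Get-EsxCli -VMHost $vmhost
Get-ScsiLun -VmHost $vmhost -LunType disk |
  Where-Object { $_.Vendor -match "EMC" } |
  ForEach-Object {
    # Set the path selection policy to Round Robin
    Set-ScsiLun -ScsiLun $_ -MultipathPolicy RoundRobin | Out-Null
    # Change the IO operation limit from the default 1000 to 1
    $esxcli.nmp.roundrobin.setconfig($null, $_.CanonicalName, 1, "iops", $null)
  }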

Results 1

For these tests I used the same environment and Iometer settings that I described in my Hands-on with VNXe 3300 Part 6: Performance post.

Results 2

For these tests I used the same environment, except that instead of a virtual Win 2003 I used a virtual Win 2008 (1 vCPU and 4GB memory), and the following Iometer settings (I picked up these settings from the VMware Community post Open unofficial storage performance thread):

Max Throughput-100%Read

  • 1 Worker
  • 8000000 sectors max disk size
  • 64 outstanding I/Os per target
  • 500 transactions per connection
  • 32KB transfer request size
  • 100% sequential distribution
  • 100% Read distribution
  • 5 minute run time

Max Throughput-50%Read

  • 1 Worker
  • 8000000 sectors max disk size
  • 64 outstanding I/Os per target
  • 500 transactions per connection
  • 32KB transfer request size
  • 100% sequential distribution
  • 50% read/write distribution
  • 5 minute run time

RealLife-60%Rand-65%Read

  • 1 Worker
  • 8000000 sectors max disk size
  • 64 outstanding I/Os per target
  • 500 transactions per connection
  • 8KB transfer request size
  • 40% sequential / 60% random distribution
  • 65% read / 35% write distribution
  • 5 minute run time

Random-8k-70%Read

  • 1 Worker
  • 8000000 sectors max disk size
  • 64 outstanding I/Os per target
  • 500 transactions per connection
  • 8KB transfer request size
  • 100% random distribution
  • 70% read / 30% write distribution
  • 5 minute run time
[Updated 11/29/11] After I had published this post, Andy Banta gave me a hint on Twitter: “You might squeeze more by using a number like 5 to 10, skipping some of the path change cost.” So I ran a couple more tests, changing the IO operation limit to values between 5 and 10. With the 28-disk pool there was no big difference between the values 1 and 5-10. With the EFDs the magic number seemed to be 6, and with that I managed to get 16 MBps and 1100 IOps more out of the disks with specific workloads. I added the new EFD results to the graphs.


Conclusions

Changing the RR IO operation limit from the default of 1000 IOs to 1 IO really makes a difference on the VNXe. With random workloads there is not much difference between the two settings, but with sequential workloads the difference is significant: sequential write IOps and throughput more than double with certain block sizes when using the 1 IO setting. If you have ESX hosts connected to a VNXe with an LACP trunk I would recommend changing the RR IO operation limit to a small value (1, or 5-10 per the update above). Like I already mentioned, Arnim has a really good post about configuring RR settings using PowerCLI. Another good post about multipathing is A “Multivendor Post” on using iSCSI with VMware vSphere by Chad Sakac.

Looking at the results it is obvious that EFD disks perform much better than SAS disks. With sequential workloads the 28-disk SAS pool’s performance is about the same as the 5-disk EFD RG’s, but with random workloads the EFDs’ performance is about two times better than the SAS pool’s. There was no other load on the disks while these tests were run, so under additional load I would expect the EFDs to perform much better on sequential workloads as well. Better performance doesn’t come without a bigger price tag: EFD disks are still over 20 times more expensive per TB than SAS disks, but then again SAS disks are about 3 times more expensive per IO than EFDs.

Now if only EFDs could be used as cache on VNXe.

Disclaimer

These results reflect the performance of the environment that the tests were run in. Results may vary depending on the hardware and how the environment is configured.

