VMware Metro Storage Cluster
VMware Metro Storage Cluster (vMSC) allows a vSphere cluster under a single vCenter to stretch across two data centers in geographically dispersed locations. Normally, in vSphere 5.5 and below at least, vCenter would be deployed in Linked Mode so that two vCenters can be managed as one. With vMSC, however, it’s possible to have one vCenter manage all resources across both sites and leverage the underlying stretched storage and networking infrastructure. I’ve written previous posts on NetApp MetroCluster describing how a stretched storage cluster is spread across two separate data centers. I’d also recommend reading a previous post on vMSC by Paul Meehan over on www.virtualizationsoftware.com. The idea behind this post is to provide the VMware view for the MetroCluster posts and to give a better idea of how MetroCluster storage links into virtualization environments.
The main benefit of a stretched cluster is that it enables workload and resource balancing across data centers. This helps companies reach almost zero RTO and RPO and ensures uptime of critical systems, as workloads can be migrated easily using vMotion and Storage vMotion. One thing to keep in mind regarding vMSC: it’s not really sold as a disaster recovery solution but rather as a disaster avoidance solution when linked with the underlying storage. Some of the other benefits of a stretched cluster are:
- Workload mobility
- Cross-site automated load balancing
- Enhanced downtime avoidance
- Disaster avoidance
- System uptime and high availability
There are a number of storage vendors that provide the back-end storage required for a vMSC to work. I won’t go into the entire list but you can find out more on the VMware Compatibility Guide site. The one that I have experience with is NetApp MetroCluster, but I know of others from EMC and Hitachi at least. So what components make up a vMSC? It comes down to two things: an extended layer 2 network across the data centers so that vMotions can take place with ease, and a resilient storage platform connected to ESXi via VMFS or NFS datastores. VMware vCenter itself does need some configuration changes, but it’s nothing outside the scope of what a regular VMware admin can implement. A view of what a vMSC looks like is below. The networking and storage components have been simplified.
From this you can see that you have essentially double the infrastructure that you would have under normal circumstances as you have to account for failover requirements.
Virtualization host requirements:
- Need enough capacity on both sites to ensure that in the event of a failure on one site all the VMs and resources can be picked up by the remaining site.
- It is recommended to regularly perform capacity management and forecasting of the environment’s needs to ensure sufficient capacity exists.
- The downside to this is that when you buy new hosts you should buy two, one for each site. While it’s not necessary to have both sides balanced with the same number of ESXi hosts, it is strongly encouraged, and as the admin of that infrastructure it will make your life easier. I would also be surprised if a company that splashed out for the existing infrastructure balked at having to purchase double the number of ESXi hosts when further compute resources are needed.
- vSphere HA admission control and DRS affinity rules need to be configured to cater for the needs of vMSC. This is discussed further below.
- Storage connectivity using Fibre Channel, iSCSI, SVD and FCoE is supported.
- The maximum supported network latency between sites for the ESXi management networks is 10ms round-trip time (RTT).
- The maximum supported latency for synchronous storage replication links is 5ms RTT.
- A minimum of 622Mbps network bandwidth, configured with redundant links, is required for the ESXi vMotion network.
- The storage requirements are slightly more complex. A VMware vMSC requires what is in effect a single storage subsystem that spans both sites. In this design, a given datastore must be accessible (able to be read from and written to) simultaneously from both sites.
- All disk writes are committed synchronously at the two locations to ensure that data is always consistent regardless of the location from which it is being read. This storage architecture requires significant bandwidth and very low latency between the sites involved in the cluster.
- Stretched layer 2 network across the two data centers. In most cases this will be enabled via OTV.
- Resilient paths for networking to ensure the loss of one leg doesn’t impact the vCenter services.
- Networking rules in place to allow hosts on each site to access storage on the opposite site.
- Resilient switches on each site (four in total across the two sites) to ensure a switch failure doesn’t impact availability.
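The latency and bandwidth limits in the list above lend themselves to a quick sanity check before a design goes any further. The following is only a minimal sketch: the thresholds are the figures quoted in this post, while the `InterSiteLink` class, its field names and the `check_link` function are hypothetical and the measured values would have to come from your own network tests.

```python
from dataclasses import dataclass
from typing import List

# Limits as quoted in the requirements list in this post.
MAX_MGMT_RTT_MS = 10.0      # ESXi management network, round-trip time
MAX_STORAGE_RTT_MS = 5.0    # synchronous storage replication links, round-trip time
MIN_VMOTION_MBPS = 622.0    # ESXi vMotion network bandwidth


@dataclass
class InterSiteLink:
    """Measured characteristics of the links between the two sites (hypothetical model)."""
    mgmt_rtt_ms: float
    storage_rtt_ms: float
    vmotion_mbps: float
    vmotion_redundant: bool


def check_link(link: InterSiteLink) -> List[str]:
    """Return a list of requirement violations; an empty list means all checks pass."""
    problems = []
    if link.mgmt_rtt_ms > MAX_MGMT_RTT_MS:
        problems.append(f"management RTT {link.mgmt_rtt_ms}ms exceeds {MAX_MGMT_RTT_MS}ms")
    if link.storage_rtt_ms > MAX_STORAGE_RTT_MS:
        problems.append(f"storage RTT {link.storage_rtt_ms}ms exceeds {MAX_STORAGE_RTT_MS}ms")
    if link.vmotion_mbps < MIN_VMOTION_MBPS:
        problems.append(f"vMotion bandwidth {link.vmotion_mbps}Mbps is below {MIN_VMOTION_MBPS}Mbps")
    if not link.vmotion_redundant:
        problems.append("vMotion network is not configured with redundant links")
    return problems
```

Calling `check_link(InterSiteLink(4.0, 2.5, 1000.0, True))` returns an empty list, while a link that misses every limit returns one message per violation.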
The networking element is often overlooked but is one of the more complex components of vMSC. For the storage, the network requirement is usually dark fiber between sites, and for the front-end network used by vCenter the requirement is OTV (Overlay Transport Virtualization), which provides layer-2 encapsulation over layer-3. So what is OTV? OTV is an overlay protocol that encapsulates layer-2 segments into layer-3 packets so that two sites can communicate on the same subnets and IPs as if they were on the same site. OTV can be implemented using high-end switches such as the Cisco Nexus 7000. If you’re running a FlexPod environment this is the recommended switch, and for most people it will also act as their core switch. Using the Nexus 7000 you can utilise the concept of VDCs (Virtual Device Contexts) to provide flexible separation of hardware resources and to separate the data and control planes. Creating a new VDC for OTV ensures that only OTV traffic is handled by that VDC and there is no bleeding across VDCs.
Each site has an OTV VDC on each of the Nexus 7000 switches, and VLANs that need to travel across sites need to be configured within the OTV VDC. Normal network activity on both sites runs in the LAN VDC. When traffic needs to cross sites, such as a vMotion from one site to the other, the request is routed via the OTV VDC, which encapsulates the packets and transfers them to the OTV VDC on the other site. That VDC in turn de-encapsulates the traffic and routes it back through the LAN VDC on that site to complete the request.
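To make the encapsulation idea a bit more concrete, here is a toy model of the path described above. It is purely illustrative and is not how OTV is implemented on a Nexus switch: the class names, fields and addresses are all made up for the example, and real OTV involves a control plane and header formats that are ignored here.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class EthernetFrame:
    """A layer-2 frame as seen in the LAN VDC (simplified)."""
    src_mac: str
    dst_mac: str
    vlan: int
    payload: bytes


@dataclass(frozen=True)
class IpPacket:
    """A layer-3 packet carrying a whole layer-2 frame between sites."""
    src_ip: str
    dst_ip: str
    inner: EthernetFrame


@dataclass(frozen=True)
class OtvEdge:
    """Hypothetical stand-in for the OTV VDC on one site."""
    local_ip: str
    remote_ip: str
    extended_vlans: FrozenSet[int]

    def encapsulate(self, frame: EthernetFrame) -> IpPacket:
        # Only VLANs configured in the OTV VDC are allowed to cross sites.
        if frame.vlan not in self.extended_vlans:
            raise ValueError(f"VLAN {frame.vlan} is not extended across sites")
        return IpPacket(self.local_ip, self.remote_ip, frame)

    def decapsulate(self, packet: IpPacket) -> EthernetFrame:
        # The receiving site unwraps the frame and hands it back to its LAN VDC.
        if packet.dst_ip != self.local_ip:
            raise ValueError("packet is not addressed to this OTV edge")
        return packet.inner


# Example addresses and VLAN IDs are invented for illustration.
site1 = OtvEdge("10.1.0.1", "10.2.0.1", frozenset({100}))
site2 = OtvEdge("10.2.0.1", "10.1.0.1", frozenset({100}))
frame = EthernetFrame("00:50:56:aa:00:01", "00:50:56:bb:00:02", 100, b"vMotion data")
received = site2.decapsulate(site1.encapsulate(frame))
```

The frame that arrives on the second site is identical to the one that left the first, which is why hosts on both sites can behave as if they share one layer-2 segment.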
vSphere HA configuration
vSphere HA is what keeps the VMs alive in the event of an ESXi host failure. For vMSC, VMware recommends enabling vSphere HA Admission Control and setting it to 50% for CPU and 50% for RAM. This offers the most flexibility and reduces operational overheads. Coming from environments where vMSC was not in place, this seemed like a bit of overkill, but it means that even if new hosts are added to the environment you don’t need to change the admission control settings to match.
It is also recommended to enable vSphere HA datastore heartbeating. The minimum number of heartbeat datastores is 2 and the maximum is 5. It’s critical for datastore heartbeating to be enabled for vMSC, as the front-end link could be lost while the back-end storage network is still available, so everything is kept alive.
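A bit of arithmetic shows why the 50% figure doesn’t need to change as the cluster grows. The sketch below assumes identical hosts and treats a whole-site loss as the failure to plan for; the function names are hypothetical and this is not how vSphere itself calculates admission control.

```python
def failover_share_percent(hosts_site_a: int, hosts_site_b: int) -> float:
    """Percentage of cluster capacity that must stay free to survive losing the larger site.

    Assumes every host contributes the same capacity.
    """
    total = hosts_site_a + hosts_site_b
    if total == 0:
        raise ValueError("cluster has no hosts")
    return 100.0 * max(hosts_site_a, hosts_site_b) / total


def survives_site_loss(used_percent: float, hosts_site_a: int, hosts_site_b: int) -> bool:
    """True if the current utilisation still fits on the site that remains."""
    return used_percent <= 100.0 - failover_share_percent(hosts_site_a, hosts_site_b)


# Balanced sites always need 50% reserved, however many host pairs are added.
balanced = [failover_share_percent(n, n) for n in (2, 4, 8)]

# An unbalanced cluster (6 hosts vs 4) would need more than 50% reserved.
unbalanced = failover_share_percent(6, 4)
```

With balanced sites the reservation stays at 50% whether there are 4 hosts or 16, which is the operational saving. It also shows one reason buying hosts in pairs is encouraged: an unbalanced cluster needs a bigger reservation to survive losing its larger site.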
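The reasoning behind datastore heartbeating can be sketched as a small decision function. This is a simplified model under my own assumptions, not the actual vSphere HA algorithm: the `HostState` names and the functions are hypothetical, and the real product has additional checks that are left out here.

```python
from enum import Enum


class HostState(Enum):
    HEALTHY = "healthy"
    PARTITIONED_OR_ISOLATED = "partitioned or isolated"
    FAILED = "failed"


def classify_host(network_heartbeat: bool, datastore_heartbeat: bool) -> HostState:
    """Decide what has happened to a host based on the two heartbeat channels."""
    if network_heartbeat:
        return HostState.HEALTHY
    if datastore_heartbeat:
        # Front-end link lost but the host still reaches storage: it is alive.
        return HostState.PARTITIONED_OR_ISOLATED
    return HostState.FAILED


def should_restart_vms(state: HostState) -> bool:
    """Only restart VMs elsewhere when the host is considered failed."""
    return state is HostState.FAILED


def valid_heartbeat_datastore_count(count: int) -> bool:
    """The post quotes a minimum of 2 and a maximum of 5 heartbeat datastores."""
    return 2 <= count <= 5
```

Without the second channel, a lost front-end link would be indistinguishable from a dead host, and VMs that are still running could be treated as failed.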
VMware DRS configuration
VMware DRS affinity rules enable a logical separation of virtual machines to ensure that storage and network traffic are load balanced. They also help keep VMs in the site where they experience the most use. For example, applications on VMs that are primarily used in Site 1 can be configured to have affinity to that site. In the event of a failure they will be managed by HA, and when services have been resumed the VMs automatically move back to Site 1 based on the DRS affinity rules. Keeping the VM near to where the primary reads/writes are required also improves storage performance.
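The behaviour described above (prefer the home site, fail over when it is down, move back when it returns) can be modelled as a soft preference. The snippet is a rough sketch with a hypothetical `choose_site` function and made-up site names; it is not the DRS placement algorithm.

```python
from typing import Dict, Optional


def choose_site(preferred: str, site_up: Dict[str, bool]) -> Optional[str]:
    """Pick where a VM should run given its preferred site and site availability.

    The preference is soft: if the preferred site is down, any available
    site is used instead, and the VM returns once the preferred site is back.
    """
    if site_up.get(preferred, False):
        return preferred
    for site, up in site_up.items():
        if up:
            return site
    return None  # no site available


# Normal operation, Site 1 failure, and Site 1 recovery for a VM with affinity to Site 1.
normal = choose_site("site1", {"site1": True, "site2": True})
during_failure = choose_site("site1", {"site1": False, "site2": True})
after_recovery = choose_site("site1", {"site1": True, "site2": True})
```

Because the preference is soft, the VM can still be restarted on the other site during a failure and drifts back to Site 1 afterwards.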
VMware Metro Storage Cluster fits a very specific use case and requires some very complex infrastructure to be in place to operate successfully. It enables fantastic system availability, and while the underlying storage and networking infrastructures are complex, the vCenter view appears no different to what the admin is used to and no extra skillset is required for the operations team to manage the infrastructure. Using VAAI plug-ins on the ESXi hosts offloads some of the heavy lifting to the storage, and using VSC (Virtual Storage Console from NetApp) means the admin can also manage provisioning and potentially backups via the vCenter console.