There aren’t many tools available specifically for MetroCluster, but I’ve added the ones I found below. If anyone knows of any others, please let me know and I’ll update this post.
The FMC_DC can be downloaded from here -> http://mysupport.netapp.com/NOW/download/tools/FMC_DC/. It will require a NetApp NOW account.
The FMC_DC is the Fabric Metro Cluster Data Collector which can be configured to gather information on all components (controllers, switches, bridges etc.) of the MetroCluster infrastructure. Once the components have been added a health check can be run. This health check appears as a card on the application and will show whether the components are healthy or need further investigation.
I’d recommend having a look over this document to get started with FMC_DC.
While the FMC_DC doesn’t provide any management features, it does provide peace of mind that all components are configured so that failover can be successful. If you’re doing a DR test, I’d definitely recommend using it.
These are some of the things to look out for with MetroCluster and can be considered best practices and recommendations.
One very important configuration change on MetroCluster controllers is to disable the change_fsid option straight away. If it is not disabled, the file system IDs (FSIDs) of all volumes and aggregates will change during failover, so volumes and LUNs can no longer be referenced until clients remount them or the LUNs are brought back online. This is really critical for LUNs.
To avoid the FSID change in the case of a site takeover, you can set the change_fsid option to off (the default is on). Setting this option to off has the following results if a site takeover is initiated by the cf forcetakeover -d command:
- Data ONTAP refrains from changing the FSIDs of volumes and aggregates.
- Users can continue to access their volumes after site takeover without remounting.
- LUNs remain online.
If you don’t disable the change_fsid option in MetroCluster configurations, the following happens when the cf forcetakeover -d command is run:
- Data ONTAP changes the file system IDs (FSIDs) of volumes and aggregates because ownership changes.
- Because of the FSID change, clients must remount their volumes if a takeover occurs.
- If using Logical Units (LUNs), the LUNs must also be brought back online after the takeover.
options cf.takeover.change_fsid off
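As a pre-DR-test sanity check, the setting above could be verified from a script. The following is only a minimal sketch: it assumes you have already captured the text printed by running `options cf.takeover.change_fsid` on each controller, and that the output is a line containing the option name followed by its value. The function names are hypothetical and not part of any NetApp tool.

```python
OPTION_NAME = "cf.takeover.change_fsid"


def change_fsid_setting(options_output):
    """Return the option value ('on' or 'off') found in the captured text.

    Assumes each relevant line looks like: 'cf.takeover.change_fsid   off'.
    Returns None if the option is not present in the text.
    """
    for line in options_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == OPTION_NAME:
            return parts[1].lower()
    return None


def safe_for_site_takeover(options_output):
    """True only when change_fsid is explicitly set to 'off'."""
    return change_fsid_setting(options_output) == "off"


if __name__ == "__main__":
    # Hypothetical captured output from one controller.
    captured = "cf.takeover.change_fsid      on\n"
    if not safe_for_site_takeover(captured):
        print("change_fsid is not off: FSIDs will change on cf forcetakeover -d")
```

Treating a missing option as unsafe is a deliberate choice here: if the output cannot be parsed, it is better to check the controller by hand than to assume the option is off.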
MetroCluster RC file
What is Fabric-Attached MetroCluster?
A Fabric-Attached MetroCluster configuration, which can be implemented for distances greater than 500 meters, connects the two storage nodes by using four Brocade or Cisco Fibre Channel switches in a dual-fabric configuration for redundancy. Each site has two Fibre Channel switches, each of which is connected through an inter-switch link to a partner switch at the other site.
The inter-switch links are fibre connections which extend the storage fabric path so that it provides a greater distance between nodes than other HA pair solutions. By using four switches instead of two, redundancy is in place to avoid single points of failure in the switches and their connections.
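To make the redundancy argument concrete, here is a small sketch that models the dual-fabric layout described above and checks that losing any one switch still leaves a complete fabric between the sites. The switch and fabric names are made up for illustration, and the model is deliberately simplified: a fabric counts as usable only if both of its switches are up.

```python
# Hypothetical names: each fabric is one switch per site joined by an inter-switch link.
FABRICS = {
    "fabric_1": ("siteA_sw1", "siteB_sw1"),
    "fabric_2": ("siteA_sw2", "siteB_sw2"),
}


def surviving_fabrics(failed_switches):
    """Return the names of fabrics that have no failed switch."""
    failed = set(failed_switches)
    return [name for name, switches in FABRICS.items() if not failed & set(switches)]


def tolerates_any_single_switch_failure():
    """True if at least one fabric survives whichever single switch fails."""
    all_switches = [sw for pair in FABRICS.values() for sw in pair]
    return all(surviving_fabrics({sw}) for sw in all_switches)


if __name__ == "__main__":
    print(surviving_fabrics({"siteA_sw1"}))
    print(tolerates_any_single_switch_failure())
```

With only two switches (a single fabric), the same check would fail, which is the point of using four.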
The advantages of a fabric-attached MetroCluster configuration over a stretch MetroCluster configuration include the following:
- Increased disaster protection via nodes being in separate geographical locations
- Disk shelves and nodes are not connected directly to each other, but are connected to a fabric with multiple data routes ensuring no single point of failure.
The disadvantage is that there’s more cabling and there are more components involved, in the form of fibre switches.
Fabric MetroCluster requirements