During a recent upgrade I found that one of the ESXi hosts just would not update using Update Manager. The error I was seeing was “Cannot run upgrade script on host”.
After a bit of searching I found this article, which relates to an ESXi 5.1 to 5.5 upgrade, but the steps worked well to fix the issue I was seeing.
In order to fix the issue I performed the following steps:
Step 1: Disable HA for the cluster
Step 2: Go to vCenter Networking. Select the distributed vswitch and then select the hosts tab. From here, right-click on the host you need to reboot and select Remove from vSphere Distributed Switch
Click Yes to remove the host from the switch.
Step 3: Remove the host from the cluster
Step 4: Put the host into maintenance mode and then reboot it.
Step 5: Connect via SSH to the ESXi host and run the following commands to uninstall the FDM agent:
cp /opt/vmware/uninstallers/VMware-fdm-uninstall.sh /tmp
chmod +x /tmp/VMware-fdm-uninstall.sh
/tmp/VMware-fdm-uninstall.sh
Step 6: Reboot the host
Step 7: Add the ESXi host back to the cluster
Step 8: Re-add the host to the Distributed vSwitch. Go to Networking -> select the distributed vswitch. Right-click and select Manage Hosts.
Select the host
Select vnics for Uplinks to be managed by the switch
Step 9: Turn vSphere HA back on for the cluster the host resides on.
Step 10: Run the upgrade again from Update Manager and this time it will work.
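Before adding the host back to the cluster (Step 7), it can be worth confirming over SSH that the FDM agent really is gone. The following is a minimal sketch, not part of the original article. It assumes the HA agent shows up in the VIB list with "fdm" in its name, and `fdm_present` is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Succeeds (exit 0) if the VIB listing on stdin still contains an FDM entry.
# Assumption: the HA agent VIB has "fdm" somewhere in its name.
fdm_present() {
    grep -qi 'fdm'
}

# Only query the host when esxcli is available, i.e. when run on ESXi itself.
if command -v esxcli >/dev/null 2>&1; then
    if esxcli software vib list | fdm_present; then
        echo "FDM agent VIB still installed; run the uninstaller again"
    else
        echo "FDM agent VIB not found; host can be added back to the cluster"
    fi
fi
```

Keeping the check in a small function that reads from stdin means it can also be tried against saved `esxcli software vib list` output from another host.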
ESXi upgrade preparation
With Cisco UCS you really need to make sure that your ESXi hosts are running the correct driver version. If you’re running NFS or FCoE storage into your ESXi hosts as either datastores or RDM disks then it’s critical that you have the right fnic and enic drivers. Even if you use the Cisco Custom image for ESXi upgrades, the enic and fnic drivers may not be correct according to the compatibility matrix. I’ve had this issue in the past: NFS datastores intermittently went offline on a Dev ESXi host, and the resolution was to upgrade the enic driver, which handles Ethernet storage connectivity.
The best place to go is VMware’s compatibility site for IO drivers, which comes under System/Servers. To find out which drivers you currently have, you will need to check the driver versions on the ESXi hosts. This can be done by following KB1027206. Using the values for the Vendor ID, Device ID, Sub-Vendor ID and Sub-Device ID it’s possible to pinpoint the interoperability with your respective hardware. In my case I have both VIC1340 and VIC1240 in the mix, so I had to go through the process twice. Primarily you’ll be using the ‘ethtool -i’ command to find the driver version.
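As a rough sketch of how the four IDs can be pulled out on the host, the snippet below splits each vmnic line of `vmkchdev -l` output into Vendor ID, Device ID, Sub-Vendor ID and Sub-Device ID. It assumes the usual line layout of PCI address, `VID:DID`, `SVID:SSID`, owner and device name. The sample line in the comment is made up for illustration, and `pci_ids` is a hypothetical helper name.

```shell
#!/bin/sh
# Split lines of `vmkchdev -l` output into the four PCI identifiers.
# Assumed line shape (illustrative sample, not taken from a real host):
#   0000:06:00.0 1137:0043 1137:0085 vmkernel vmnic0
pci_ids() {
    awk '{
        split($2, d, ":")   # d[1] = Vendor ID,     d[2] = Device ID
        split($3, s, ":")   # s[1] = Sub-Vendor ID, s[2] = Sub-Device ID
        printf "VID=%s DID=%s SVID=%s SSID=%s NIC=%s\n", d[1], d[2], s[1], s[2], $5
    }'
}

# Only query the host when vmkchdev is available, i.e. when run on ESXi itself.
if command -v vmkchdev >/dev/null 2>&1; then
    vmkchdev -l | grep vmnic | pci_ids
fi
```

The four values printed for each vmnic are the ones to enter into the compatibility search. With two different VIC models in the same environment, expect two distinct sets of IDs.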
For example, you can check the UCS VIC 1240 for FCoE CNAs on ESXi 5.5 Update 3 here.
In this image you can see the version of the enic driver I’m running: 126.96.36.199, which doesn’t match the firmware version that will be installed as part of the Cisco Custom ISO image. This shows that the enic driver version will need to be upgraded as part of the process.
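The same comparison can be scripted on the host. The sketch below reads the `version:` field from `ethtool -i` output and compares it with the version listed in the compatibility matrix. `REQUIRED_ENIC_VERSION` and `NIC` are hypothetical variables you would set yourself, and the check is a plain string comparison, not a true version ordering.

```shell
#!/bin/sh
# Extract the "version:" field from `ethtool -i` output supplied on stdin.
driver_version() {
    awk -F': ' '$1 == "version" { print $2 }'
}

# Succeeds (exit 0) when the installed version ($1) differs from the required one ($2).
needs_driver_update() {
    [ "$1" != "$2" ]
}

# Hypothetical inputs: set REQUIRED_ENIC_VERSION to the value from the
# compatibility matrix and NIC to the vmnic to inspect (defaults to vmnic0).
if command -v ethtool >/dev/null 2>&1 && [ -n "$REQUIRED_ENIC_VERSION" ]; then
    installed=$(ethtool -i "${NIC:-vmnic0}" | driver_version)
    if needs_driver_update "$installed" "$REQUIRED_ENIC_VERSION"; then
        echo "enic driver $installed differs from required $REQUIRED_ENIC_VERSION"
    else
        echo "enic driver $installed matches the compatibility matrix"
    fi
fi
```

A string comparison only says that the versions differ, not which one is newer, so it is best treated as a prompt to check the matrix by hand.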