I recently had to manually invoke a hot spare in a VNX 5200, but in Unisphere the option was greyed out.
On the CLI the command wasn’t supported. Now what?
According to EMC KB 184890 (https://support.emc.com/kb/184890) the proper command is now:
naviseccli -h [ip of one SP] copytodisk [source-disk] [hot spare]
Using the “getdisk” command will show you that the rebuild has actually started.
Bear in mind that disks are addressed in the format “Bus_Enclosure_Disk”, so for example 1_2_3 means disk 3 (the 4th disk, since numbering starts at 0) in enclosure 2 on bus 1.
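The addressing scheme is easy to get wrong, so here’s a minimal sketch (my own illustration, not an EMC tool) that splits a Bus_Enclosure_Disk string into its parts:

```python
def parse_disk_address(address: str) -> dict:
    """Split a "Bus_Enclosure_Disk" string into its components.

    Note: the disk number is zero-based, so disk 3 is the 4th
    physical disk in the enclosure.
    """
    bus, enclosure, disk = (int(part) for part in address.split("_"))
    return {"bus": bus, "enclosure": enclosure, "disk": disk}

# 1_2_3 -> disk 3 in enclosure 2 on bus 1
print(parse_disk_address("1_2_3"))
```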
In Unisphere you can actually see the progress of the rebuild:
A while ago I talked about Hot Spares and how they are picked when a rebuild is necessary. It was almost 2 years ago and you can read it here.
Since then the rebuild / equalize technology has changed! Well, not for existing systems, but the new VNX family, aka VNX2, does things a bit differently.
In the old days when a drive failed, a suitable Hot Spare would kick in and the unprotected LUNs (the ones residing on the failed drive) would be rebuilt onto the Hot Spare. Later, once the rebuild was done and the failed drive had been replaced, the data on the Hot Spare would be copied to the replacement drive. This was called equalizing.
In the VNX2 (with MCx) this last step doesn’t exist anymore. That means the Hot Spare that was used to contain the rebuilt data is no longer a Hot Spare: it has become a regular drive! And the replacement drive will now be a new Hot Spare. When configuring a new VNX2 you’ll see rules about Hot Spares, but you don’t even need to configure Hot Spares anymore. Just make sure you have some unconfigured drives and you’re good. Your VNX2 will make sure they’re used as Hot Spares from then on.
If I remember correctly the DMX4 had a similar feature back in 2008, but it has now trickled down to the midrange platform as well.
How does an EMC Clariion or VNX decide which Hot Spare will be used for any failed drive?
First of all, not the entire failed drive is rebuilt, only the LUNs that reside on it. Furthermore, all LUNs on the failed drive are rebuilt to the same Hot Spare, so a single failed drive is replaced by a single Hot Spare. If for example a 600GB drive fails with only 100GB worth of LUNs on it, in theory a 146GB drive could be invoked to rebuild the data. However, it’s the location of the last LUN block on the failed drive that determines how large the Hot Spare needs to be. If on a 600GB drive the last block of the last LUN sits at the 350GB mark, while the total disk space used by all LUNs on that drive is only 100GB, the 146GB and 300GB Hot Spares aren’t valid choices, since the last block address (350GB) is beyond the 300GB mark. So valid Hot Spares would be 400GB or larger.
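To make the selection rule concrete, here’s a small sketch (my own illustration; sizes in GB, and the list of spare capacities is just an example) that filters valid Hot Spares by the last LUN block address rather than by used capacity:

```python
def valid_hot_spares(last_block_gb: int, spare_sizes_gb: list) -> list:
    """Return the Hot Spare capacities that can hold the rebuilt data.

    A spare is only valid if its capacity reaches at least as far as
    the last LUN block on the failed drive; the total used capacity
    (which may be much smaller) is irrelevant.
    """
    return [size for size in spare_sizes_gb if size >= last_block_gb]

# 600GB drive, last LUN block at the 350GB mark, only 100GB used:
print(valid_hot_spares(350, [146, 300, 400, 600]))  # [400, 600]
```

Even though only 100GB of data needs rebuilding, the 146GB and 300GB spares are rejected because they can’t address the 350GB block location.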