Accelerating your storage array by using SSD technology

It has been around for quite a few years already. It became available to the general public about 12 years ago and was commonly seen in digital cameras: FLASH storage! At first the devices couldn’t store more than a few MB and prices were high, but over time capacity went up, prices went down, and the first SSD drives (should we say “drives”?) were born. They were still expensive, but very usable in the computer industry. Heavily used databases in particular could be accelerated with SSD, because there is no rotational latency and average access latency is in the sub-millisecond range instead of multiple milliseconds! The common problem over the last few years was durability, but currently SSD technology is just as reliable as the old rotating disks.


And then there are these 2 different technologies: SLC and MLC. What’s all that about? In short, and I hope you can follow this short explanation, an SLC device stores a single bit in a single cell and an MLC device stores multiple bits in a single cell. The disadvantage of MLC compared to SLC is that to write 1 bit in an MLC cell the whole cell has to be reread and rewritten, which of course causes more wear than a write in an SLC device. The big advantage of MLC over SLC is that MLCs can store more data on less surface and they’re way cheaper than SLCs. An MLC device that is in use requires something like the TRIM command to stay fast, where an SLC doesn’t need this “trick”.

In storage arrays only SLCs are being used at the moment, which explains why these 100 or 200GB SSD “disks” are so expensive (and more reliable).

But how do you use SSD technology in these SAN attached storage arrays? The first implementations, about three years ago, used SSD devices as a replacement for spinning disks, which made putting a 1TB database on SSD a terribly expensive choice just to increase the number of IOps (Input/Output operations per second) on that database. A little more than a year ago EMC announced the use of SSD as cache, and many people are now considering its use: prices went down and the storage array software was improved, so for a 1TB database with a 50GB hot spot you’d only need to cache this 50GB hot spot to gain the same performance!
The advantage of using SSD as a cache is that it doesn’t matter which block in the array is heavily used: once the LUN it resides on is EMC FAST Cache enabled, that hot spot block can be cached on SSD.
I’ve gathered some real life data from an EMC CX4-240 Clariion with FAST Cache enabled and I’d like to share how I came to the conclusion that FAST Cache is here to stay. After you’ve seen the graphs my guess is that you’ll want to buy SSD as well. I’m convinced!
First of all I gathered a week’s worth of NAR files. NARs are created when you turn on Analyzer Data Logging. I chose a 1 minute interval and I chose to store 7 days of archives. In Unisphere go to “monitoring”, “analyzer” and then “data logging”.
Once your array has gathered enough data to analyze, you can retrieve the NAR files from your array to a local drive. The command I use is:

  • naviseccli -h <ip-address> analyzer -archive -path z:\archive -all -o

Now you have a lot of NAR files sitting in the z:\archive folder. Check that you have the right files and remove any unwanted ones. Each file has a timestamp in its name, so they’re easy to spot. Next, move all SPB files to a folder named SPB and all SPA files to a folder named SPA. Once you have only the SPA files left (or SPB, they should contain roughly the same data) you need to merge them into 1 large NAR file, since you need to load this large NAR file into analyzer later and 1 file is easier to handle than dozens of small ones. The command I use to merge the files is this:

  • naviseccli analyzer -archivemerge -data <file1> <file2> -out <fileM1> -overwrite n
  • naviseccli analyzer -archivemerge -data <fileM1> <file3> -out <fileM2> -overwrite n
  • naviseccli analyzer -archivemerge -data <fileM2> <file4> -out <fileM3> -overwrite n
  • and so on

Repeat this step for each small NAR file you gathered and make sure the previous output file (<fileMx>) is the first input file of each following command, so the output grows with every small input file you add to it.
After the last command has run, you’re left with 1 large NAR file containing all data from all small NAR files.
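
Typing the merge chain by hand gets tedious with a week of 1 minute archives, so here is a minimal Python sketch that only builds the command lines in the right order. It uses the same naviseccli arguments as shown above; the function name, the "fileM" output prefix default and the example file names are my own assumptions for illustration, and nothing is executed against the array.

```python
from typing import List


def build_merge_commands(nar_files: List[str], prefix: str = "fileM") -> List[List[str]]:
    """Build the chain of merge commands for a list of NAR files.

    The output of each step becomes the first input of the next step,
    mirroring the manual commands in the post. The prefix for the
    intermediate output names is a hypothetical default.
    """
    if len(nar_files) < 2:
        raise ValueError("need at least two NAR files to merge")

    commands = []
    current = nar_files[0]
    for step, next_file in enumerate(nar_files[1:], start=1):
        out = f"{prefix}{step}"
        commands.append([
            "naviseccli", "analyzer", "-archivemerge",
            "-data", current, next_file,
            "-out", out, "-overwrite", "n",
        ])
        current = out  # the merged file feeds the next merge
    return commands


if __name__ == "__main__":
    # Hypothetical file names, just to show the resulting chain.
    for cmd in build_merge_commands(["a.nar", "b.nar", "c.nar"]):
        print(" ".join(cmd))
```

The last command in the returned list writes the final large NAR file. You could paste the printed lines into a batch file, or hand each list to subprocess.run once you have checked them.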

Now in Unisphere go to “monitoring”, “analyzer” again, choose “customize charts” and select the “advanced” box (general tab) as well as the “performance detail” box (archive tab). Make sure “initially check all tree objects” is unchecked.
Now choose “open archive” and select the big NAR you just created. After it’s loaded, choose the dates and times you want to evaluate (I’d use the whole file) and press “ok”.
Now there’s a trick we’re going to use to combine 2 graphs into one: make sure you have MS Excel nearby! First we need to see how much I/O is being handled by the array: choose both SPs in the SP tab (make sure nothing else is checked) and select “write throughput (IO/s)”. Click on the little clipboard icon at the top of this screen (this copies the data to the clipboard, not the graph image) and start Excel. Paste the clipboard into an empty sheet and make sure the timestamps are in the columns, not the rows. In each column add a SUM of the SPA data and SPB data. You might need to spread the data out over multiple columns instead of keeping all data in 1 column.
Back in analyzer, deselect the choices you’ve just made (SPA/B and write throughput) and in the “storage pool” tab select all LUNs as well as “FAST Cache write hits/s”. Click on the little clipboard icon at the top of this screen again. Paste this clipboard into a new empty sheet and again make sure the timestamps are in the columns, not the rows. In each column add a SUM of all the cells containing LUN data.
Copy both rows containing the SUM data we just created next to each other in a separate sheet, select both and create a graph. Remember this graph represents the written data!

So you need to repeat this step for the read data.
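
If you’d rather not do the SUM step in Excel, the same per-timestamp total can be sketched in Python. This is only an illustration under assumptions of mine: the pasted data is taken to be tab-separated, with the timestamps in the first row and one row per object (SP or LUN) below it, the object name in the first cell. The real clipboard layout from analyzer may differ, so check it before relying on this.

```python
from typing import Dict


def sum_per_timestamp(pasted: str) -> Dict[str, float]:
    """Sum all object rows per timestamp column.

    Assumed layout (hypothetical): tab-separated text, first row holds a
    label followed by the timestamps, every other row holds an object
    name followed by one value per timestamp. Empty cells count as 0.
    """
    lines = [ln for ln in pasted.splitlines() if ln.strip()]
    timestamps = lines[0].split("\t")[1:]
    totals = [0.0] * len(timestamps)

    for line in lines[1:]:
        cells = line.split("\t")
        # Skip the object name, ignore any cells beyond the last timestamp.
        for i, cell in enumerate(cells[1:len(timestamps) + 1]):
            if cell.strip():
                totals[i] += float(cell)

    return dict(zip(timestamps, totals))


if __name__ == "__main__":
    # Made-up numbers, only to show the shape of the input and output.
    sample = "Object\t10:00\t10:01\nSP A\t100\t150\nSP B\t50\t\n"
    print(sum_per_timestamp(sample))
```

Run it once on the SP data and once on the LUN FAST Cache hits data and you have the same 2 rows of totals the Excel steps produce.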

Blue is the total number of IOps the array is handling, red is what the SSDs are handling. The first graph shows the writes, the second graph shows the reads:

 

If you want to know exactly how effectively the SSDs are performing, you can calculate the percentage of SSD IOps compared to the total of SPA and SPB combined and graph those 2 together in a new graph. Again, do this for writes and reads separately, the first graph being the writes and the second being the reads:
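
The percentage itself is simple arithmetic: FAST Cache hits divided by the combined SPA + SPB throughput, times 100, for every sample. A minimal Python sketch, assuming you already have the 2 rows of totals as plain lists of numbers (the function name and the sample values are made up for illustration):

```python
from typing import List


def cache_hit_percentages(total_iops: List[float], cache_hits: List[float]) -> List[float]:
    """Return the share of IOps handled by FAST Cache, per sample, in percent.

    total_iops holds SPA + SPB throughput per timestamp, cache_hits holds
    the summed FAST Cache hits/s for the same timestamps. Samples with no
    I/O at all are reported as 0.0 to avoid dividing by zero.
    """
    if len(total_iops) != len(cache_hits):
        raise ValueError("both series must cover the same timestamps")

    percentages = []
    for total, hits in zip(total_iops, cache_hits):
        percentages.append(100.0 * hits / total if total > 0 else 0.0)
    return percentages


if __name__ == "__main__":
    # Made-up sample values, not measurements from the array.
    print(cache_hit_percentages([1000.0, 0.0, 500.0], [400.0, 0.0, 100.0]))
```

Use it once with the write totals and once with the read totals to get the 2 percentage series.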

As you can see the writes are being absorbed very well and the reads are quite a bit behind on that figure. I must admit that I’m not stressing the SSDs that far just yet (I’ve only enabled FAST Cache for about 15 or 20 LUNs), but the general idea should be clear.

About 40% of the writes in my real life config are being handled by SSD, while the remaining 60% actually go to spinning disks. This means the load on these spinning disks has gone down and the system can be hammered harder, so you can expect more performance out of it. Depending on your needs you should consider buying some SSDs to get just that extra performance. For reads the gain is smaller, since the chance of a truly random read being served from SSD is quite small: the SSD cache is most likely far smaller than the capacity of the spinning disks.

  1. 😀 Cool post. I’ll be following your blog with interest.

  2. Nice to see your first post Rob… finally 😉

  3. Jo Verstappen

    Rob, very nice one.

  4. Good to see your blog, it feels worth reading.
    The questions that came to mind immediately: what is the random versus sequential IO ratio you typically see from this array? Also, small sequential IO gets promoted to FAST Cache, and LUNs with this IO pattern are candidates for keeping FAST Cache disabled (EMC best practice?). How much does this factor apply to your test?

    • SKT,

      sorry for the very late response, but I promise to follow the messages more carefully from now on.

      The test I ran was a real life implementation, so the IO was random small block. Candidates for disabling FAST Cache are really small LUNs or LUNs that have little or no activity. On the other hand, you could argue about both, because FAST Cache would not cost that many resources and both types of LUNs will benefit from it anyway. Performance is a tricky subject.
      Nowadays I play with it to see the effects, and if there’s not much effect I disable it again to free FAST Cache resources for other LUNs that might be in more need of speeding up IO.
