Update on Unitrends

A higher-up at Unitrends read my article here about Veeam vs Unitrends and sent me an email asking how they could improve their product. I have to say it was quite unexpected, and sometimes I'm surprised at the readers I get on here – I'm just some nerdy IT guy, and I'm not always aware of the readership I have. I mostly think of this blog as documenting my experience with products.

Either way, I told him a few of my gripes, and he thanked me and let me know that some fixes were in the works. A couple of the things they highlighted to me were better error reporting in the "Failure" emails and better uniqueness criteria when scanning for VMs – the latter will help a LOT with cloning and replication because there won't be the duplicate UUID error that we've seen in the past.

Overall, I really appreciate a company that is willing to listen to people – especially someone like me who works for a smaller company with a smaller environment. I work in the field and I understand that most issues are only issues when someone reports them. In the past, though, reported issues seemed to just go into a queue somewhere and that was that – you never really heard about them again or saw anything happen with them. It's nice to know that Unitrends listens to their customers and cares how their product performs. Even though I have had my troubles with it, they seem to genuinely care about whether your system works for you and want to make improvements so that it works in everyone's environment – even those of us with smaller ones.

I'm not sure if I complained about this last time, but I had a small gripe about the CPU on our particular box being under constant load due to deduplication – at the time, I thought this was a problem. I did a little power monitoring and found that the box only draws about 320 watts on average (CPU utilization rose during backup periods – the little spikes you see in the graph – and power usage went up slightly with it, dropping back down during quieter times), which in our environment is totally acceptable in exchange for more space. If this appliance were for something else, it might be an issue, but since its sole purpose is backing up and archiving data, that higher CPU utilization is not a problem for me.

In the meantime, we have decided to extend our service agreement and move to a newer physical appliance as well, so we will see how that goes. I'm sure you'll be hearing more about this in the future.

8Gb FC, 16Gb FC, 10GbE iSCSI or 1GbE iSCSI. Which is right for your storage area network?

This is something I've had a lot of firsthand experience with, and it's something I've taken quite a bit of time to look into as well. The answer to this question is the basis of this write-up, and if you don't want to read much further, I'll just say this: it really depends on your environment. There are pros and cons to each of these, and we'll hit each of them.

First off, if you are considering upgrading your main storage array, chances are you are also going to be looking at an entirely new storage networking infrastructure. The reason is that things are evolving pretty fast in the storage area network world. Seemingly affordable 10Gb Ethernet is making a dent, 16Gb Fibre Channel has hit the market (though not exactly affordable for a business our size), and storage arrays featuring each of these are a possibility.

My first major suggestion: do some heavy monitoring of your current environment. See where your peaks are and where your low times are. See how much storage bandwidth you are currently using. Watch your disk queues and see if there are reads or writes just sitting in the pipe waiting to get served up.

Let's talk about bottlenecks for a second. Bottlenecks can happen at three main places – the server, the switch, or the array – and most happen at either the network or the storage array. Sometimes they are caused by network misconfiguration, and sometimes by the disks in the storage array not being able to keep up with how fast you are requesting reads and writes. These bottlenecks can also be misinterpreted by monitoring.


I'll give you a quick inside tip: getting relatively high IOPS does NOT depend on the speed of your storage area network. Of course, you do need bandwidth for sustained transfer speeds (if you are doing large reads and writes), but if your traffic is bursty and requires relatively high I/O in short bursts (SQL Server comes to mind), you don't need a lot of bandwidth. What you need is fast response time and fast I/O. That being said, how do you get high I/O over, let's say, 1GbE? It's all in how the array handles your I/O.
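To put some rough numbers on that (my own back-of-the-napkin math, not anything from a vendor):

```python
# Back-of-the-napkin math: bandwidth needed by a bursty IOPS workload.
# All numbers here are illustrative, not vendor figures.
iops = 5000          # burst I/O operations per second
block_size_kb = 8    # SQL Server does a lot of 8 KB page-sized I/O

throughput_mb_s = iops * block_size_kb / 1024
print(f"{iops} IOPS at {block_size_kb} KB = {throughput_mb_s:.0f} MB/s")
# -> 5000 IOPS at 8 KB = 39 MB/s

# A single 1GbE link delivers roughly 110-115 MB/s after overhead,
# so even this burst fits comfortably -- latency, not bandwidth,
# is the limiting factor at small block sizes.
```

In other words, a burst of small I/Os that feels punishing to the array barely registers on the wire.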

Let's say, for example, that you are disk bound (meaning the disks in your array just can't keep up with the reads and writes – a fairly common issue among spinning-drive arrays, unless you have enough spindles, or a fat read / write cache, to keep up). Your servers are pushing out writes and requesting reads faster than the disks can serve them. In that case, your storage bandwidth is not the issue – it's the actual storage array that is having trouble filling that bandwidth. Monitoring may still interpret this as bandwidth lag, because you'll see your storage network being hit kind of hard while it waits on reads and writes.

Monitoring is essential, but it's also VERY important to know how to interpret your monitoring – you need to watch multiple places (your network, your servers, and your storage) and be able to interpret all that data together. Most Ethernet and Fibre Channel switches support SNMP, which can be used to monitor specific ports. ESXi has many types of monitoring you can use, from things like VMTurbo to Operations Manager. Using SNMP and a graphing program like LogicMonitor or Observium, you can really drill down to the port level and see which servers are using a lot of bandwidth and / or storage.
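If you want to roll your own port-level polling, here's a minimal sketch using Python and pysnmp – the switch address and community string are placeholders you'd swap for your own, and the 64-bit ifHCInOctets counter is standard IF-MIB stuff:

```python
import time
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

# IF-MIB ifHCInOctets (64-bit inbound byte counter), interface index 1.
OID = '1.3.6.1.2.1.31.1.1.1.6.1'

def poll_octets(host='192.0.2.10', community='public'):
    """Read the inbound octet counter for one switch port."""
    error_ind, error_status, _, var_binds = next(getCmd(
        SnmpEngine(), CommunityData(community),
        UdpTransportTarget((host, 161)), ContextData(),
        ObjectType(ObjectIdentity(OID))))
    if error_ind or error_status:
        raise RuntimeError(str(error_ind or error_status))
    return int(var_binds[0][1])

# Two samples ten seconds apart give a rough utilization rate.
a = poll_octets()
time.sleep(10)
b = poll_octets()
print(f"~{(b - a) * 8 / 10 / 1e6:.1f} Mbit/s inbound on port 1")
```

The graphing tools mentioned above do exactly this on a schedule, just with nicer pictures.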

When you get a nice array on the back end, you'll be shocked at how little bandwidth it actually takes to get pretty high I/O, but monitoring, when selecting a storage area network, is your best friend. You need to know your current environment and not just listen to the salesperson. The salesperson is trying to sell you something; you are trying to make the best purchase for your environment. You need to know as much about your environment as possible so that you don't spend a ton of money on things you will underutilize.

There does need to be a happy medium between growth potential and your current bandwidth, though. If you are planning on growing to more servers and more I/O, plan accordingly.

Deduplication vs Compression

When we were in the process of looking at storage devices (SANs), there was one question I had to think through logically: whether to have data deduplicated or compressed. Both are gaining popularity, on storage devices as well as in operating systems.

For anybody unfamiliar with compression: in a nutshell, it means encoding data in a way that uses fewer bits than the original. There is more to say about compression (lossy vs. lossless, for instance), but I would suggest looking that up, as other people have covered it quite a bit better than I can here. Compression can also be done two ways – compressing a file after it has been written, or in-line (realtime) compression as the data arrives (which is what most storage boxes do).
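Here's a trivial lossless example using Python's built-in zlib (the sample data is obviously contrived to compress well):

```python
import zlib

# Highly repetitive data compresses extremely well...
data = b"the same block of data " * 1000
packed = zlib.compress(data, 6)
print(len(data), "->", len(packed), "bytes")

# ...and "lossless" means the original comes back bit-for-bit.
assert zlib.decompress(packed) == data
```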

Deduplication eliminates duplicate copies of repeating data, usually at the block level. There is inline deduplication and "post-processing" deduplication. Inline deduplication happens as the data is being written – and with most storage appliances, I have found that this requires quite a bit of memory. Some claim 1GB of RAM per 1TB of data; I think that grossly underestimates it, though it really depends on your appliance. Think about how inline deduplication works: before a block is even written, the system checks it against an index of existing blocks (typically by hash), and if the block is a duplicate, it only writes a reference to the full block that contains the data. This means that if you have lots of similar data, you could potentially save a LOT of space.
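To make that concrete, here's a toy block-level dedupe in Python – real appliances use far more careful fingerprinting and collision handling, but the idea is the same:

```python
import hashlib

BLOCK = 4096
store, index, refs = [], {}, []   # unique blocks, hash -> block id, logical refs

def write(data: bytes):
    """Write data, storing each 4 KB block only once."""
    for i in range(0, len(data), BLOCK):
        chunk = data[i:i + BLOCK]
        h = hashlib.sha256(chunk).hexdigest()
        if h not in index:            # new block: actually store it
            index[h] = len(store)
            store.append(chunk)
        refs.append(index[h])         # duplicate: just write a reference

write(b"A" * BLOCK * 3 + b"B" * BLOCK)  # four logical blocks...
print(len(refs), "logical blocks,", len(store), "stored")  # ...two stored
```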

So on to the nitty gritty. I've found that deduplication is excellent for data that just sits there – in other words, data that is not being accessed with much frequency. Good candidates I've used deduplication on are backup volumes, VDI deployments (which I'll get to in a moment), and other storage with lots of potentially duplicate blocks (most dedupe is block level, so it's duplicate blocks, not duplicate files, that matter).

Most deduplication appliances do inline deduplication. Some are post-process (such as Windows Server's Deduplication feature, which actually works pretty well), but it really depends on your usage. Post-process requires more space, because the data sits there in full until either a schedule kicks off the dedupe or the system calms down enough to do it in the background. Background or scheduled post-process dedupe also generally creates a lot of disk activity (thrashing) – see the image below for the overnight dedupe processing:


Compression puts a bit more stress on the processor, and perhaps memory as well, since the processor is what re-encodes the data.

So… on to VDI and deduplication. At face value it seems like a great idea, and I have tried it thinking it would be – but for some reason, both on Nexenta and on Windows Server (a Windows iSCSI target with deduplication enabled), it absolutely brings the volumes to a crawl. At the time, I didn't get to research it much, because the complaints were coming in and I was in a hurry to move the VMs onto a volume that was not deduped, so perhaps this needs more investigation.

All in all, I'm a bigger fan of compression, because I'd rather the disks not be hit so hard. Today's processors can handle most types of compression without much problem (unless you're using gzip level 8 or 9), and decompression of well-written compression algorithms can happen at RAM speed on most multi-core systems (which can be pretty fast).
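If you're curious what that tradeoff looks like on your own hardware, here's a quick stdlib-only sanity check (the sample data is synthetic, and absolute numbers depend entirely on your CPU):

```python
import time, zlib

# Synthetic, semi-repetitive sample: ~9 MB of fake log lines.
data = b"".join(b"2014-06-01 12:00:%02d INFO request served OK\n" % (i % 60)
                for i in range(200000))

for level in (1, 6, 9):
    t0 = time.perf_counter()
    packed = zlib.compress(data, level)
    dt = time.perf_counter() - t0
    print(f"level {level}: {len(packed) / len(data):.1%} of original, "
          f"{len(data) / dt / 1e6:.0f} MB/s compress")
```

You'll generally see the higher levels squeeze out a little more space for a lot more CPU time, which is exactly the balance LZ4-class algorithms are tuned around.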

This is an image of Nexenta using LZ4 compression – it’s not being hit too hard (this particular storage box has about 15 VMs on it that are at least fairly active).


Nimble Storage – An In Depth Review – Conclusions

All in all, after about 7 months since our cutover to our SAN, we have absolutely zero regrets. The thing is still as fast as it was on day one, despite us pushing quite a few IOPS and quite a lot of data through it. We are not the busiest shop in the world by far, but we do use our storage quite a bit for both reads and writes.

Just a little info about our particular environment: we currently have 27 volumes and about 60 VMs on our Nimble CS240. Several virtual and physical instances of SQL Server keep their databases, logs, backups, etc. on this SAN and are hit constantly (production databases with certain Real Estate data), we have about 14 million photos on one volume, and our cache hit ratio sits right around 85%. We could probably use an upgrade to a CS240 x2 (and probably will – doubling the cache from 640GB to 1.2TB) sometime in the near future.

Our server infrastructure is HP blades in a C3000 chassis with HP / Cisco-branded GbE2c blade switches (all four interconnects), with the uplinks in an LACP configuration to a pair of stacked Dell PowerConnect switches. We followed Nimble's Network Configuration Guidelines almost to the letter. We use multipathing in ESXi and Windows, and it works just as expected.


These are the uplinks from the 4x Interconnect Switches on the Blade Enclosure to the Dell Switches.

Everything is using jumbo frames, and iSCSI is on its own separate subnet. The GUI is amazingly simple – I would think that even an inexperienced person could set it up and create and map volumes.


Overall, we are extremely happy with it. It certainly was not the cheapest option out there, but it has been well, well worth it in my opinion. We replaced 18U worth of SAN equipment (not to mention the fibre switches and fibre infrastructure we removed) – I've heard of other companies replacing half racks or more of old equipment. Our power usage is down to roughly a twelfth of what our previous controllers / enclosures / switches drew.


Nimble Storage – An In Depth Review – Post 5

Hardware and network considerations: The first thing you need to know is that Nimble is 100% iSCSI based. There is no Fibre Channel option; it's either 1Gb or 10Gb Ethernet. At the time we were looking into this, we were using Fibre Channel on our Compellent, and I had always assumed that Fibre Channel was superior to Ethernet for storage because of the latencies involved.

I've found over time that this is not always the case. There is always a bottleneck somewhere in storage networking: the disk array may be too slow to perform all the writes requested, the network may be saturated with read or write requests, the server doing the writing may be queuing disk writes, or the storage device may simply have a hard time keeping up with all the I/O being asked of it. Any number of things can cause a slowdown.

I was very curious as to why Nimble would have such a fast storage device (which they claimed, and which turned out to be true) and not offer something like 8Gb or 16Gb Fibre Channel – to me, it felt like they were shooting themselves in the foot, especially by offering a model with four single-gigabit NICs (for a total bandwidth of 4Gbit/s, or about 475 MB/s). After our chat with the engineer, we found that it's really not all about storage bandwidth; it's about the number of operations happening. For example, if you have 3,000 operations happening in a given second and your storage device does not have the capacity to handle that, it will begin to saturate and utilization will go up. Our network utilization is never that high on the Nimble because of how fast it processes those transactions, or I/Os.

There are a few things you REALLY need to do to take full advantage of that, though, and the big one is multipathing. Multipathing makes a HUGE difference in performance. Nimble has a couple of "best practices" that make it super efficient and very fast even over 1Gb Ethernet; if you use 10Gb Ethernet, I don't think you'll have to worry about network bottlenecking at all. There is also a really handy connection manager coming with the GA of Nimble OS 2.0.

Here is what I would suggest: you need a good switch, whether 1Gb or 10Gb (probably L2 at the least – if you just get a dumb switch, you're not going to get the performance you want). Look for something newer with support for jumbo frames, unicast storm control, STP / RSTP, buffered ports, etc. Netgear has some pretty good cheap ones, Dell has some decent ones, HP has some good ones, and of course, Cisco has some good ones. Don't cheap out on the switch.

Nimble Storage – An In Depth Review – Post 4


Disk structure and CASL: The system itself is set up in a somewhat different way – all spinning disks are in RAID 6 (dual parity) with one spare (meaning you could technically lose up to 2 disks before you really had to start sweating bullets). The SSDs are not in any kind of array, and that bears repeating: they are not in any type of array. Putting them in any type of protected RAID array (meaning anything other than RAID 0) means you start to lose capacity. Each SSD is individually used to the fullest of its capacity for nothing but cached reads. So the question arises: "what if your SSD dies?" Well… not much, to be honest. The space that was used for cache is lost, along with the data in that cache, but it's a read-only cache – the data is already on the spinning disk; the cache copy is just a copy. Data deemed worthy of cache will be re-pushed to cache as its worthiness is noted.
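Conceptually, a read-only cache looks something like this toy sketch (a couple of dicts standing in for disk and SSD – note that every write lands on disk first, so losing the "SSD" loses nothing):

```python
disk, ssd_cache = {}, {}   # block_id -> data

def write(block_id, data):
    disk[block_id] = data              # writes always land on disk
    ssd_cache.pop(block_id, None)      # invalidate any stale cached copy

def read(block_id):
    if block_id in ssd_cache:          # cache hit: fast path
        return ssd_cache[block_id]
    data = disk[block_id]              # cache miss: read from disk...
    ssd_cache[block_id] = data         # ...and promote "hot" data to cache
    return data

# Simulate losing the SSD: nothing is lost, reads just slow down.
write(1, b"payload")
read(1)
ssd_cache.clear()
assert read(1) == b"payload"
```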

So, on to the data write operations. I made up a little flow chart that shows just how data is written to disk and in what order.
(Flow chart: the Nimble write path.)

A few other little tidbits on writing data – compression is decent. I personally think LZ4 is a decent compression algorithm. In the past I've used several in SAN infrastructure, mainly LZJB (pretty close to LZ4), the standard gzip levels, and LZMA (much heavier than LZ4, but with better ratios). LZ4 strikes a pretty good balance between compression and CPU usage – obviously, the higher the compression, the more CPU it takes to compress the data. On things like Nexenta, I prefer the gzips over LZJB – mainly because the CPU usage is not out of line (except when you use gzip level 8+, when it starts getting a little high).

According to some Nimble documentation, compression happens at about 300 MB/s per CPU core. LZ4 (like most Lempel-Ziv based compression) has awesome decompression speed – decompression can happen at RAM speed in most cases (1 GB/s+). Compression is optional, though it is enabled by default. We have shut it off on several of our volumes – namely the Veeam volume (Veeam compresses so well already that we weren't gaining anything by having it on) and our media server (photos are already compressed, so there wasn't much to gain there either). For things like VMs and SQL Server, I think you can expect a 1.5 – 2.5x compression ratio – SQL Server especially compresses well (2.25x+).
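Just to show what those ratios mean in practice (illustrative numbers, not our actual volumes):

```python
# What a compression ratio means for usable capacity (made-up figures).
raw_tb = 8.0
for name, ratio in [("VM volumes", 1.5), ("SQL Server", 2.25)]:
    print(f"{name}: {raw_tb} TB raw -> ~{raw_tb * ratio:.1f} TB effective at {ratio}x")
```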

You'd think that with compression you might notice a performance hit while compressing and decompressing data like that, but we have not noticed any kind of performance hit – only huge improvements.

Next are the read operations – once again, a little flow chart:

(Flow chart: the Nimble read path.)

So, let's talk about cache for a second (accelerated reads). Any "hot" data is written to SSD cache, and the system serves that hot data from cache, so it responds really quickly to changes. Reads are obviously a lot faster coming from SSD than HDD – RAID 6 reads aren't horrible, especially sequential ones, but SSD is still generally much faster: in the realm of 6 – 10 milliseconds on HDD vs. about 200 microseconds (0.2 milliseconds) on SSD.
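Those latency numbers are exactly what drive the IOPS difference – here's a rough queue-depth-1 calculation using the figures above:

```python
# Rough per-device IOPS at queue depth 1: one op completes per latency period.
hdd_latency_s = 0.008    # ~8 ms average on a spinning disk
ssd_latency_s = 0.0002   # ~200 microseconds on an MLC SSD

print(f"HDD: ~{1 / hdd_latency_s:.0f} IOPS per spindle")   # ~125
print(f"SSD: ~{1 / ssd_latency_s:.0f} IOPS per device")    # ~5000
# Deeper queues and striping raise both figures, but the ~40x gap
# persists -- which is why a high cache hit ratio matters so much.
```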

I may have previously mentioned that these use standard MLC SSDs (multi-level cell solid state drives). Generally speaking, in the enterprise, the "best" SSDs go in this order: SLC > eMLC > MLC > TLC, with TLC the worst by quite a long shot. SLCs, generally speaking, can endure far more writes than their MLC counterparts – we're talking 100,000+ writes per cell. If you want more info on why that is, read this. Honestly, when the engineer told us it was just plain MLC, I thought: wow, this product seems so genius, and then they throw MLC in there… what for? I asked that question while he was there, and he gave me some pretty good answers – the main one being that it all comes down to how you use it. Since MLC drives store 2 bits per cell, the cells can potentially wear out faster, so Nimble converts random writes into sequential writes, which minimizes "write amplification" (basically spammy writes to cells) and thereby prolongs the life of the SSD. Data on the SSDs is compressed and "indexed" with metadata, which also speeds up cached reads.
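The random-to-sequential trick is essentially log-structured writing. Here's a minimal sketch of the idea (the names and stripe size are mine, not Nimble's):

```python
# Toy log-structured writer: random block writes are buffered (think NVRAM),
# then flushed as one large sequential stripe, so the flash never sees
# small scattered in-place writes. Stripe size here is arbitrary.
STRIPE_BLOCKS = 8
buffer, log, log_index = [], [], {}   # pending writes, sequential log, block -> log pos

def write(block_id, data):
    buffer.append((block_id, data))
    if len(buffer) >= STRIPE_BLOCKS:
        flush()

def flush():
    # One sequential append instead of eight random in-place writes.
    for block_id, data in buffer:
        log_index[block_id] = len(log)   # latest version of the block wins
        log.append(data)
    buffer.clear()

for i in (42, 7, 99, 7, 3, 58, 12, 42):   # "random" block addresses
    write(i, b"x")
print(len(log), "blocks appended sequentially,", len(log_index), "live")
```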

Next post, I'll get into networking and such.

Nimble Storage – An In Depth Review – Post 3


The decision: We kept coming back to Nimble as the comparison for everything. We would say, "Oh, we really love this about Tegile… but Nimble does it this way," or "Well, we really like how EMC does this, but Nimble does it this way…". Eventually it got to the point where going with Nimble was something we really needed to do.

And now… why we chose Nimble. Here it goes – I'll do my best to write what I remember and what I have seen. When we first started looking around for a new SAN, a friend came to mind – a friend who had previously worked at a different SAN company. I called him for some unbiased advice, mainly wanting to know if he had an opinion on what was better: 10Gb iSCSI or 8Gb Fibre Channel. He suggested we do a little meeting and have a chat. Prior to this, I had only heard bits about Nimble Storage. He came in and told us that both Fibre Channel and iSCSI have their place depending on the environment, then went on to his sales pitch. My friend is not a sales guy; he is a SAN engineer. He didn't sell the product to us – the product sold the product to us.

He basically showed us the entire SAN, from the chassis all the way down to the file-level functions.

So… first things first. The Nimble CS series is actually stashed in a slightly modified Supermicro 6036ST-6LR chassis. The enclosure has 16 disk bays (4x SSD, 12x HDD), with the solid state drives in the middle of the enclosure. Why did they choose that? One thing you're going to learn here is that there is a reason for everything Nimble does. Spinning drives can create (and suffer from) vibration, so placing them closer to the rack-mount screws at the sides gives them more stability and support, while the vibration-immune SSDs take the middle bays. I have never field tested this, but it seems like a valid claim to me.


Look familiar? The Nexsan NST series and the Tegile Zebi series are both housed in the same chassis. I'm not sure how they position their SSDs – when we met with them, they did not make a point of telling us, nor did they show us when we saw their physical products.

Anyway, the chassis has two hot-swappable controllers in the back in an Active / Passive configuration. Most SANs are Active / Active, so for those of you who are uncomfortable with this, here is how the controller works in a nutshell. As data is written, it first goes to NVRAM, which is mirrored across controllers – this means that if Controller A drops out in the middle of a write, Controller B already has the exact same data in NVRAM, and nothing is lost. The failovers are absolutely seamless; we have done several test failovers and there is zero interruption. The controllers are housed in the back of the unit and slide out, and they can be upgraded at any time should your infrastructure need it (i.e., you can go from a CS240 to a CS440 just by swapping controllers). This can be done hot.
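Conceptually, the mirrored-NVRAM write path looks something like this (a toy sketch with my own names, not Nimble's actual implementation):

```python
class Controller:
    def __init__(self):
        self.nvram = {}   # block_id -> data

active, passive = Controller(), Controller()

def acked_write(block_id, data):
    # The write is acknowledged only once BOTH controllers hold it,
    # so a mid-write failover loses nothing.
    active.nvram[block_id] = data
    passive.nvram[block_id] = data
    return "ack"

acked_write(1, b"payload")
# Simulate controller A dying: B already has identical NVRAM contents.
assert passive.nvram[1] == b"payload"
```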

The CASL architecture: Before going on to another post and getting into this in depth, if you have time, I would suggest watching this video:

You might think it’s all just marketing propaganda, but it seriously is revolutionary.

Nimble Storage – An In Depth Review – Post 2


Nexsan: The Nexsan array we looked at is actually housed in the same chassis as the Nimble array, so the look and configuration are quite similar – though there is a really large piece missing: CASL, which I'll get to in post 3. It's a hybrid array (meaning it leverages standard spindle disks with SSDs as cache). This is very similar to Tegile, except there are user-defined RAID levels (only 5 and 6), and it's quite accommodating: you can choose how many SSDs you want as cache and how many standard disks you want.

The good: It’s extremely flexible. Multi-protocol (CIFS, NFS, iSCSI, FC).

The bad: The caching seems extremely similar to Tegile's. Data is cached as it is read or written – this provides fast reads and writes at the expense of the lifetime of the SSD. Because it's user-configurable, I would think most people would elect for standard MLC SSDs rather than eMLC SSDs, and they will most likely run into problems because of that (especially in write-intensive environments). No compression or deduplication to speak of.

Dell (Compellent): We looked at upgrading our controllers. This provided us a way to keep our existing disks and infrastructure, only with newer controllers.

The good: We wouldn’t have had to rip out the entire infrastructure.

The bad: Price. The price is a massive con – so massive that the controllers alone cost 3/4 of the way to an all-new SAN. No SSD caching to speak of. Tiered storage burns tons of I/O on reads and writes for data progression. Licensing costs. Power usage is off the hook (right now our old Compellent array is estimated to be drawing 2,500 – 3,000 watts: two controllers at 500 watts apiece and three disk enclosures at 500 watts apiece – and this is literally not a joke, we monitor our power usage). Disk failures are rampant due to the drives being thrashed by data reads and writes.

Nimbus Data: Nimbus is all-flash storage built entirely on SSDs. It's insanely fast.

The good: Crazy high I/O.

The bad: The price is totally out of range, and this is absolute overkill for what we need. We are pushing about 3,500 – 5,000 burst IOPS; the Nimbus array is pushing 1 million IOPS. I will also say this: I tend to despise companies that hire really gorgeous salespeople who don't know the product, and this is exactly what Nimbus does. I know this blog is called Dorkfolio and that I am a bit of a nerd, but just because some gorgeous girl tries to sell me something doesn't mean I'll buy it. Nimbus employs some of the most attractive girls I've ever seen at a SAN booth, and it's all to lure in the nerds. I buy a product based on whether the product sells itself to me, not based on who is trying to sell it.

We thought about building our own SAN based on Nexenta: Nexenta is a SAN OS built on a Solaris-derived platform around ZFS. It's a pretty robust SAN solution. It takes a bit of work to set up, but once it's set up, it would work… or so I thought.

The good: We could build it for a lot less than we could buy from a SAN vendor. RAID-Z.

The bad: No support other than the person who designed and built it. Reporting is sketchy at best (i.e., if a drive fails, you're not going to know about it immediately). SSD cache is used as fast spindle disk (once again, like Tegile). ZFS is 95% stable, and in a production environment, 95% is not good enough.


Yes, this is really replacing 12U of Compellent Equipment.

Nimble Storage – An In Depth Review – Post 1


Recently I have had the pleasure of working with a Nimble CS240 array. Needless to say, I am quite pleased with it, and I'm going to do a pretty thorough write-up here for those on the fence looking for a new SAN or looking to do a "rip and replace". I'm going to go in depth on everything we went through to make this decision, why we made it, what we did, and most importantly, how it is working for us.

1. History

It's a bit of a long story, but it was coming time to upgrade our SQL Server from 2008 R2 to SQL Server 2012. We have (had) a Compellent SC30 in our production environment. The Compellent is a tiered storage device that relies purely on spindle count and drive speed to push out its performance. There can be several tiers utilized for different scenarios – for example, Tier 1 is generally 10k or 15k SAS drives in RAID 10, while Tier 3 would be something more along the lines of 7200 RPM SATA drives in RAID 5. The bread and butter is that you can perform reads from Tier 3 and writes to Tier 1, and then at night (preferably), data progression moves the newly written data to Tier 3 for reading. Generally speaking, this seems like a pretty good system, except for one major flaw: disk thrashing. About 8 months ago, we suddenly had several drives fail (and by several I mean about 5 of our 36). Tier 1, Tier 3 – both were failing in very short proximity to each other, almost to the point of sweating bullets. Thankfully, Dell (who owns Compellent now) has very good support for these, and we had our replacement drives the next day.


In any event, this SAN used all of these drives for random reads and random writes, meaning that when data was written, it went wherever the controller could find free space to put that sector. This SAN was extremely multipurpose: it housed SQL Server databases, VMware virtual machines, physical servers connected to LUNs, etc. All of these servers were performing reads and writes against the controller, and since the data was randomly written, it was randomly read as well.

The image on the left shows data "fragmentation" – or rather, what fragmentation would look like on a physical drive platter. In order to read File A, the spindle head has to go from the innermost part of the drive out to the outermost part, back in, etc., and because of that, those hard drives work intensely hard just to read and write data.

In any event, with a sour taste in our mouths from the dying drives, we attempted to create a SQL Server 2012 box with a LUN. Needless to say, the SC30 was too old to accommodate this due to its age and 32-bit disk structure. So… time to upgrade controllers – but they were very expensive.

So, we started looking around at some newer technology.

2. Application Usage

We are about 95% virtual and about 5% physical (our only physical servers are SQL Server boxes); however, I would think that SQL Server accounts for at least 50% of our I/O (probably 75% read, 25% write). We utilize VMware ESXi 5.1 (at the moment), SQL Server 2008 R2 (soon to be 2012), and some media volumes for images. We needed at least 7 TB of usable capacity and expected to grow by at least another 4 TB in the very near future.

3. Research

We met with a lot of vendors – companies like Tegile, Nexsan, Dell, Nimbus (insanely expensive for a full SSD SAN, but hey, it was fun), and a few other companies. Let me lay out the major contenders and why they were contenders.

Tegile: We really liked Tegile and had a good, long meeting with them. Their big things seemed to be SSD cache, compression, and deduplication, all of which were attractive to us because we needed the read performance (SQL, anybody?) and because deduplication and compression could potentially save us a lot of space. It utilizes 2x onboard controllers in an Active / Active configuration.

The good: Variable compression (gzip, LZJB), deduplication, SSD cache, price. I'm not sure if I'm allowed to post pricing here, but I'll say, very generally, that the quote came in around $40,000 for a 22 TB array.

The bad: They basically treat their SSDs like regular hard drives, more or less following the procedure below:

Tegile Read and Writes

As you can see, there are a lot of writes to the SSD. That's somewhat alright because they utilize eMLC-based solid state drives, but it still uses the SSDs as "fast spindle cache" – and quite frankly, they are not fast spindle cache; they are SSDs. After a lot of thought and research, we decided against it, both because this is basically the same system we were leaving (drive thrashing) and because of how we felt about all those writes to the SSD cache.