The Windows 8.1 Upgrade Experience (for Enterprise)

I have to admit, I've tried to be on Microsoft's side through this whole Windows 8 thing. I've been using it for quite a while, and though there are some things I don't particularly care for, I really don't find it to be that horrifyingly bad an OS. That being said, the upgrade process to Windows 8.1 for an Enterprise user (or anyone who activated with a MAK or KMS key) is pathetic.

First off, Windows 8.1 was presented as a "free" upgrade to Windows 8. Free it is, but actually performing the upgrade is not as easy as running Windows Update. Microsoft says the update is available in the Windows Store… unless, of course, you are running Windows 8 Enterprise or you activated with a MAK or KMS key. So, after waiting for it to show up in the store, which it never did, I went on the prowl to find out where the update actually was.

Come to find out, it's exactly what I just mentioned: the only way to do the upgrade is an in-place upgrade using the ISO. I checked the VLSC for an ISO and was surprised to find that it's not there (at least not 8.1 Enterprise). So much for that problem-free upgrade.

I finally found an ISO, completed the upgrade – and now the system needs a new product key. This is a total fail in my opinion and a pathetic release cycle.

 

Nimble Storage – An In Depth Review – Post 3


The Decision: We kept coming back to Nimble as the comparison point for everything. We would say "Oh, we really love this about Tegile… but Nimble does it this way," or "We really like how EMC does this, but Nimble does it this way…". Eventually it got to the point where choosing Nimble was something we really needed to do.

And now… why we chose Nimble: Here it goes. I'll do my best to write what I remember and what I have seen. When we first started looking around for a new SAN, a friend came to mind – a friend who previously worked at a different SAN company. I called him for some unbiased advice, mainly wanting to know if he had an opinion on what was better – 10Gb iSCSI or 8Gb Fibre Channel. He suggested we have a little meeting and a chat. Prior to this, I had only heard bits and pieces about Nimble Storage. He came in and told us that both Fibre Channel and iSCSI have their place depending on the environment, but then went on to his pitch. My friend is not a sales guy, he is a SAN engineer. He didn't sell the product to us; the product sold the product to us.

He basically showed us the entire SAN, from the chassis all the way down to the file-level functions.

So… first things first. The Nimble CS series is actually housed in a slightly modified Supermicro 6036ST-6LR chassis. The enclosure has 16 drive bays (4x SSD, 12x HDD), with the solid state drives in the middle of the enclosure. Why did they choose that? One thing you're going to learn here is that there is a reason for everything Nimble does. They put the SSDs in the middle because the spinning drives can create vibration, and vibration is no good for a hard drive – keeping the hard drives closer to the screws and rack-mount rails at the sides gives them more stability and support. I suppose in theory it would help with drive vibration; I have obviously never field tested this, but to me it seems like a valid claim.

[Image: Supermicro 6036ST chassis]

Look familiar? The Nexsan NST series and the Tegile Zebi series are both housed in the same chassis. I'm not sure how they position their SSDs – they didn't make a point of telling us when we met with them, nor did they show us when we saw their physical products.

Anyway, the chassis has two hot-swappable controllers in the back, running in an Active / Passive configuration. Most SANs are Active / Active, so for those of you who are uncomfortable with this, here is how the controllers work in a nutshell. As data is written, it first lands in NVRAM, which is mirrored across the controllers – this means that if Controller A drops out in the middle of a write, Controller B already has the exact same data in NVRAM and nothing is lost. The failovers are absolutely seamless; we have done several test failovers and there is zero interruption. The controllers slide out of the back of the unit and can be upgraded at any time should your infrastructure need it (i.e., you can go from a CS240 to a CS440 just by swapping controllers), and this can be done hot.
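For anyone who wants to picture that write path, here is a minimal Python sketch of the idea as I understand it (a toy model, not Nimble's actual code): a write is only acknowledged once it sits in both controllers' NVRAM, which is why a failover loses nothing.

```python
# Toy model of the mirrored-NVRAM behaviour described above (not Nimble's actual code).
# A write is acknowledged only after it sits in BOTH controllers' NVRAM, so the
# standby controller can take over without losing any acknowledged data.

class Controller:
    def __init__(self, name):
        self.name = name
        self.nvram = {}  # block_id -> data staged for de-staging to disk

    def stage(self, block_id, data):
        self.nvram[block_id] = data


class ToyArray:
    def __init__(self):
        self.active = Controller("A")
        self.standby = Controller("B")

    def write(self, block_id, data):
        # Stage the write in the active controller's NVRAM...
        self.active.stage(block_id, data)
        # ...and mirror it to the standby before acknowledging the host.
        self.standby.stage(block_id, data)
        return "ACK"

    def failover(self):
        # The standby already holds every acknowledged write, so promotion is seamless.
        self.active, self.standby = self.standby, self.active
        return self.active.name


array = ToyArray()
array.write(42, b"payload")
print(array.failover())          # "B" takes over
print(42 in array.active.nvram)  # True: nothing acknowledged was lost
```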

The CASL architecture: Before I go into this in depth in another post, if you have time, I would suggest watching this video:

You might think it’s all just marketing propaganda, but it seriously is revolutionary.

Nimble Storage – An In Depth Review – Post 2


Nexsan: The Nexsan array we looked at is actually housed in the same chassis as the Nimble array, so the look and configuration are quite simple – though there is a really large piece missing, CASL, which I'll get to in post 3. It's a hybrid array (meaning it leverages standard spindle disks with SSDs as cache). This is very similar to Tegile, except that there are user-defined RAID levels (only 5 and 6), and it's quite accommodating: you can choose how many SSDs you want as cache and how many standard disks you want.

The good: It’s extremely flexible. Multi-protocol (CIFS, NFS, iSCSI, FC).

The bad: The caching seems extremely similar to Tegile's. Data is cached as it is read or written – this provides fast reads and writes at the expense of the SSDs' lifetime. Because it's user-configurable, I suspect most people would elect for standard MLC SSDs rather than eMLC SSDs, and they will most likely run into problems because of that (especially in write-intensive environments). No compression or deduplication to speak of.
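To put some rough numbers on the MLC vs. eMLC concern, here is a back-of-the-envelope endurance estimate in Python. The P/E cycle counts, write amplification, and daily cache-write volume are illustrative assumptions, not vendor figures.

```python
# Back-of-the-envelope SSD endurance estimate. The P/E cycle counts, write
# amplification, and daily cache-write volume below are illustrative assumptions,
# not vendor specifications.

def years_of_life(capacity_gb, pe_cycles, daily_writes_gb, write_amplification=2.0):
    """Rough drive lifetime: total writable data divided by daily write volume."""
    total_writes_gb = capacity_gb * pe_cycles / write_amplification
    return total_writes_gb / daily_writes_gb / 365

DAILY_CACHE_WRITES_GB = 2000  # assumed: every host write also lands in the SSD cache

print("MLC  (~3k P/E):  %.1f years" % years_of_life(400, 3_000, DAILY_CACHE_WRITES_GB))
print("eMLC (~30k P/E): %.1f years" % years_of_life(400, 30_000, DAILY_CACHE_WRITES_GB))
```

With those assumed numbers the MLC drive wears out in under a year while the eMLC drive lasts several, which is the gist of the worry in a write-heavy environment.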

Dell (Compellent): We looked at upgrading our controllers. This would have let us keep our existing disks and infrastructure, just with newer controllers.

The good: We wouldn’t have had to rip out the entire infrastructure.

The bad: Price. The price is a massive con – so massive that the controllers alone were 3/4 of the way to an all-new SAN. No SSD caching to speak of. Tiered storage burns tons of I/O on reads and writes for data progression. Licensing costs. Power usage is off the hook – right now our old Compellent array is estimated to be drawing between 2,500 and 3,000 watts (2x controllers at 500 watts apiece and 3x disk enclosures at 500 watts apiece, and this is literally not a joke, we monitor our power usage). Disk failures are rampant due to the drives being thrashed for data reads and writes.
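For the curious, the quick math behind that wattage claim looks like this (the electricity rate is my own assumption; plug in your own):

```python
# Sanity check on the Compellent power figures quoted above.
# The $/kWh rate is an assumption for illustration only.

controllers = 2 * 500  # 2 controllers at roughly 500 W each
enclosures = 3 * 500   # 3 disk enclosures at roughly 500 W each
total_watts = controllers + enclosures

kwh_per_year = total_watts / 1000 * 24 * 365
cost_per_year = kwh_per_year * 0.10  # assumed $0.10 per kWh

print(f"Draw:   {total_watts} W")                # 2500 W
print(f"Energy: {kwh_per_year:,.0f} kWh/year")   # about 21,900 kWh
print(f"Cost:   ${cost_per_year:,.0f}/year")     # about $2,190 at the assumed rate
```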

Nimbus Data: Nimbus is a full-on all-flash array built on SSDs. It's insanely fast.

The good: Crazy high I/O.

The bad: The price is totally out of range, and this is absolute overkill for what we need. We push about 3,500 – 5,000 IOPS at burst; the Nimbus array pushes 1 million IOPS. I will say this as well – I tend to despise companies that hire really gorgeous girls to sell you products (who really don't know jack), and this is exactly what Nimbus does. I know this blog is called Dorkfolio and that I am a bit of a nerd, but just because I am a nerd and some gorgeous girl tries to sell me something doesn't mean I'll buy it. Nimbus employs some of the most attractive girls I've ever seen at a SAN booth, and it's all to lure in the nerds. I buy a product based on whether the product sells itself to me, not based on whether you have an extremely attractive girl trying to sell it to me.

We thought about building our own SAN based on Nexenta: Nexenta is a SAN OS built on Solaris and ZFS. It's a pretty robust SAN solution. It takes a bit of work to set up, but once it's set up, it would work… or so I thought.

The good: We could build it for a lot less than we could buy a comparable array from a SAN vendor. RAID-Z.

The bad: No support other than the person who designed and built it. Reporting is sketchy at best (i.e., if a drive fails, you're not going to know about it immediately). The SSD cache is used as a fast spindle disk (once again, like Tegile). ZFS is 95% stable, and in a production environment, 95% is not good enough.

[Image: Nimble CS array size comparison]

Yes, this is really replacing 12U of Compellent Equipment.

Nimble Storage – An In Depth Review – Post 1


Recently I have had the pleasure of working with a Nimble CS240 array. Needless to say, I am quite pleased with it, and I'm going to do a pretty thorough write-up here for those on the fence looking for a new SAN or looking to do a "rip and replace". I'm going to go in depth on everything we went through to make this decision, why we made it, what we did, and most importantly, how it is working for us.

1. History

It's a bit of a long story, but it was coming time to upgrade our SQL Server from 2008 R2 to SQL Server 2012. We had, in our production environment, a Compellent SC30. The Compellent is a tiered storage device that relies purely on spindle count and drive speed to push out its performance. There can be several tiers utilized for different scenarios – for example, Tier 1 is generally 10k or 15k SAS drives in RAID 10, while Tier 3 would be something more along the lines of 7200 RPM SATA drives in RAID 5. The bread and butter is that writes go to Tier 1 and reads can be served from Tier 3, and then at night (preferably) data progression moves the newly written data down to Tier 3 for reading. Generally speaking this seems like a pretty good system, except for one major flaw: disk thrashing. You see, about 8 months ago we suddenly had several (and by several I mean about 5 of our 36) drives fail. Tier 1, Tier 3, both were failing in very close succession – almost to the point of sweating bullets. Thankfully, Dell (who owns Compellent now) has very good support, and we had our replacement drives the next day.
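If it helps to see that behaviour as code, here is a toy Python sketch of the write and data-progression flow described above. It is not Compellent's actual algorithm, and the 24-hour demotion policy is just an assumed example.

```python
# Toy sketch of the tiering behaviour described above (not Compellent's actual
# algorithm). New writes land on Tier 1 (fast RAID 10 disks), and a nightly
# "data progression" job demotes blocks that have not been rewritten recently
# down to Tier 3.

import time

PROGRESSION_AGE = 24 * 3600  # assumed policy: demote blocks older than a day

tier1 = {}  # block_id -> (data, last_write_time)
tier3 = {}  # block_id -> data

def write(block_id, data):
    tier1[block_id] = (data, time.time())  # all writes hit the fast tier

def read(block_id):
    if block_id in tier1:
        return tier1[block_id][0]
    return tier3[block_id]  # older data is served from Tier 3

def nightly_progression(now=None):
    now = time.time() if now is None else now
    for block_id, (data, written_at) in list(tier1.items()):
        if now - written_at > PROGRESSION_AGE:
            tier3[block_id] = data  # move cold data down...
            del tier1[block_id]     # ...and free up Tier 1 space
```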

[Image: hard drive platter]

In any event, this SAN used all these drives for random reads and random writes, which means that when data is written to the drives, it is written wherever the controller can find some free space to put that sector. This SAN was extremely multipurpose. It housed SQL Server databases, VMware Virtual Machines, Physical Servers connected to LUNs, etc… All these servers are performing reads and writes against the controller, and since the data is randomly written, it’s randomly read as well.

The image above shows data "fragmentation", or rather, what fragmentation would look like on a physical drive platter – in order to read File A, the drive head has to go from the innermost part of the platter out to the outermost part, back in, and so on. Because of that, those hard drives are working intensely hard just to read and write data.
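Here is a quick, purely illustrative way to see why that hurts: compare the total head travel for a file whose blocks are scattered across the platter against one laid out sequentially (the track numbers are made up).

```python
# Purely illustrative comparison of head travel (in "tracks") when reading a
# fragmented file versus a contiguous one. The numbers are made up.

import random

TRACKS = 10_000
BLOCKS = 100

scattered = [random.randrange(TRACKS) for _ in range(BLOCKS)]  # fragmented file
sequential = list(range(5_000, 5_000 + BLOCKS))                # contiguous file

def head_travel(positions):
    """Sum of seek distances when reading the blocks in file order."""
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))

print("Fragmented file:", head_travel(scattered), "tracks of seeking")
print("Sequential file:", head_travel(sequential), "tracks of seeking")
```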

In any event, with a sour taste in our mouths from dying drives, we attempted to create a SQL Server 2012 box with a LUN. Needless to say, the SC30 was too old to accommodate this due to its age and 32-bit disk structure. So… time to upgrade controllers – but they were very expensive.

So, we started looking around at some newer technology.

2. Application Usage

We are about 95% virtual and about 5% physical (our only physical servers are SQL Server boxes); however, I would estimate that SQL Server accounts for at least 50% of our I/O (and it's probably 75% read, 25% write). We utilize VMware ESXi 5.1 (at the moment), SQL Server 2008 R2 (soon to be 2012), and some media volumes for images. We needed at least 7 TB of usable capacity and were expecting to grow by at least another 4 TB in the very near future.
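The rough capacity math looked something like this; the 25% headroom factor is my own assumption, not a vendor recommendation.

```python
# Rough capacity target behind the numbers above. The 25% headroom factor is an
# assumption, not a vendor recommendation.

current_tb = 7    # usable capacity needed today
growth_tb = 4     # expected near-term growth
headroom = 1.25   # assumed 25% free-space cushion

required_usable_tb = (current_tb + growth_tb) * headroom
print(f"Target usable capacity: {required_usable_tb:.1f} TB")  # about 13.8 TB
```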

3. Research

We met with a lot of vendors – companies like Tegile, Nexsan, Dell, Nimbus (insanely expensive for a full SSD SAN, but hey, it was fun), and a few other companies. Let me lay out the major contenders and why they were contenders.

Tegile: We really liked Tegile and had a good, long meeting with them. Their big things were SSD cache, compression, and deduplication, all of which seemed attractive to us because we needed the read performance (SQL, anybody?) and because deduplication and compression could potentially save us a lot of space. It uses 2x onboard controllers in an Active / Active configuration.

The good: Variable compression (gzip, lzjb), deduplication, SSD cache, price. I'm not sure if I'm allowed to post pricing here, but I'll say, very generally, that the quote came in around $40,000 for a 22 TB array.

The bad: They basically treat their SSDs like regular hard drives, more or less following the procedure below:

[Image: Tegile read and write path]

As you can see, there are a lot of writes to the SSDs. That's alright because they use eMLC-based solid state drives, but it still treats the SSDs as "fast spindle cache" – and quite frankly, they are not fast spindle cache, they are SSDs. After a lot of thought and research, we decided against it, both because it was basically the same system we were leaving (drive thrashing) and because of how we felt about all those writes to the SSD cache.

Hyper-V Virtual Machine Slow Network Transfers

I found a strange issue today affecting Hyper-V when you use a Broadcom NIC as the vSwitch NIC: network transfers from within a VM to other parts of the network are crazy slow.

This is a “bug” in the Broadcom driver and has to do with a network feature on Broadcom NICs.

To fix it, just disable "Virtual Machine Queues" in the Advanced configuration dialog – do this on the dedicated physical NIC, NOT the vEthernet adapter.

[Image: Broadcom advanced properties dialog]
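If you would rather script it than click through the driver dialog, the same change can (as far as I know) also be made with PowerShell's NetAdapter cmdlets. Here is a rough Python wrapper; the adapter name is a placeholder, so run Get-NetAdapterVmq first to find the right NIC.

```python
# Scripted version of the fix above, shelling out to PowerShell's NetAdapter
# cmdlets (available on Windows Server 2012 / Windows 8 and later). The adapter
# name is a placeholder; run Get-NetAdapterVmq to find the Broadcom NIC that
# backs your vSwitch.

import subprocess

ADAPTER = "Ethernet 2"  # placeholder: the physical NIC bound to the vSwitch

def powershell(command):
    return subprocess.run(
        ["powershell", "-NoProfile", "-Command", command],
        capture_output=True, text=True, check=True,
    ).stdout

print(powershell("Get-NetAdapterVmq"))                  # inspect current VMQ state
powershell(f"Disable-NetAdapterVmq -Name '{ADAPTER}'")  # turn VMQ off on the physical NIC
```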

Hyper-V… 2 Months Later

I have to say this much so far: I will not be encouraging my immediate boss, my co-workers, or anyone on the interwebs to use Hyper-V 2012 (or any other version) in a production environment, for a few reasons:

1. Linux Guests

Obviously I am a Linux enthusiast. I don’t believe that it is right for everyone, but I do believe in it, and I do believe that it is a very solid server platform. That being said, Microsoft has not accepted this fact quite yet. Integration with a Linux guest under Hyper-V is flaky at best. I have had more kernel panics using stock CentOS than I ever have on any other hypervisor, or even physical machines. This is with their integration software installed as well. I do not use Debian / Ubuntu as production machines, so I can’t speak for those, but Red Hat / Oracle / CentOS may be “officially supported”, but they do not work nearly up to snuff. I honestly have been afraid to update some of the servers I put on our Hyper-V server for that reason.

2. Cannot hot-add hardware

And by hardware, I don't mean hot-adding a CPU or a gig of memory… In VMware this is a given – sometimes you need to hot-add a NIC or a hard drive. This is not the case in Hyper-V – and if it is possible, it's buried somewhere deep in there.

3.  Runs on top of Windows

This could be a pro or a con.

Anyway, there are far more things to mention, but the Linux kernel panics are happening far too often for me to endorse this as an acceptable alternative to VMware – you might as well just run VirtualBox on the machine and share the VMs through RDP. I think that would be a more stable solution than Hyper-V.

Pardon the Hiatus

I am sorry it has been so long since the last post and the last kernel update – I’ll get back on it. I’ve been extremely busy with school and work lately. I have a few posts and topics in my head though and I’ll get on them quite soon. A lot has been happening around here and I’m really excited to share it all.

Hyper-V 2012 and my thoughts so far…

The company I work for is thinking about moving from VMware to Hyper-V for several reasons (cost being one of them). Let me first preface this with our equipment – we have 3 datacenters with ESXi clusters at each (a production datacenter, a DR / backup datacenter, and a "datacenter" here at our office). At DR, we have 2x HP BL460 G1 (G5) blades (32 gigs of RAM each) running clustered ESXi 5.1, production has 5x BL460 G7s (each with 98 gigs of RAM), and here we run older DL380 G7s / G5s with either 32 or 64 gigs of RAM.

In any event, you can see the VMware costs hitting a pretty high point here – each blade has 2 CPUs. Do the math – it's a ton of money. So we've started messing with Hyper-V. I attended a conference last summer in which they highlighted some of its features, and on top of that, it comes with Windows Server 2012.

I personally have had some time to play with it quite a bit over the last couple of days, and though I am pleasantly surprised at how far Windows has come regarding virtualization, I don't think it's quite ready for prime time yet.

I have a few reasons for saying this, and I'll go through them in depth. My first gripe so far is usability. There is very little documentation on troubleshooting the issues you might run into (and believe me, you'll run into them). Getting a single Hyper-V server up and running was the easy part; remote administration was not so easy to set up (especially if you span domains – say, the server on a domain and the remote console in a workgroup).

Second – System Center 2012 Virtual Machine Manager is impressive, but hard to use. Every time I try something new there is some sort of error; it's never smooth sailing. I'm not saying it isn't that way with VMware, but Microsoft sort of has a reputation for this kind of thing, and it delivers.

The cost is the main thing though (see here for a pretty in-depth comparison: http://www.milesconsultingcorp.com/Hyper-V-versus-VMware-Comparison.aspx). It's MUCH cheaper than VMware – enough that it might be worth the trouble of getting it set up.

Anyway, expect some postings to some issues I’ve had that you might run in to.