Is it just me, or is there still a lot of taboo and misinformation in our industry about VMware, what it brings to the table, and what it's capable of? As big as the keywords 'virtualization' and 'consolidation' have become, I still run into an eerie amount of doubt from peers about exactly that. The minute you say, "Hey, why don't we just VM it!" a hush falls over the room, and then everyone starts throwing out random reasons why they think it wouldn't work. It really comes down to a lack of education throughout the industry (adoption might be a better word; it just isn't "the norm" yet, so to speak). You sales folks have GOT to stop feeding us high-level PowerPoint slides with monotone readings over the top of them. Put the slides away and just TALK SHOP WITH US!
As much as VMware and the other big players have done for the industry, especially in the last few years, the sales force is still the same as it was 10-20 years ago. You're still just using keywords to get our attention, without really telling us how to use your product to our advantage. Even when vendors and resellers come INTO our office to give 'demonstrations,' it's still nothing more than them plugging a laptop into our big TV and showing us the same PowerPoint they've shown to 100 other prospects. Eventually the techies get to take over, the lonely PSE they let out of the closet for the day finally gets to converse with the company's IT team, and we get down to business, assuming we haven't already nodded off watching the slides click by while the sales rep drones through them verbatim like a lullaby.
But that's not what I'm writing about here. This isn't a rant. It's the lead-in to a battle I didn't know I was getting myself into.
A little backstory...
Our current environment involves a couple of Sun 18U refrigerators with a Fibre Channel Hitachi brick. Pretty cookie-cutter for a high-end, mission-critical Oracle database. Layer on a bunch of Oracle database, application, and DR software, plus all the manual processes that go along with them, and you've got a rough idea of what we're starting with.
Before this upgrade process started, we had finally gone down the consolidated storage path and joined the NetApp country club (as my boss likes to call it). We were using it mainly for shared network drives, user data, and VMware. The NetApp came into play for this project because we also wanted to consolidate the Oracle storage onto it and off of the standalone brick.
A few scenarios were thrown out in the beginning. First there were the new Niagara-based Sun T2000s, which were actually RECOMMENDED to us, though I'm still convinced to this day that it was 100% our fault for not doing the homework we should have done pre-purchase. In case you're not familiar, the Niagara processor was never really designed to run databases. The T2000s turned out to be a huge flop in our performance testing, and we didn't understand why until the right engineer/Sun guru came along and told us the whole Niagara story.
It was also during this phase that we started parting the seas of IT between the DBAs and the Operations crowd. I had been doing some serious reading on NFS, and I'd had a lot of success using NFS + VMware for my datastores. We actually started out on iSCSI + VMware, but when I added a new large datastore via NFS, the performance was actually better. The allure of resizing entire datastores on the fly was enough to make me jump in with both feet. The DBAs, on the other hand, would never conceive of running anything but Fibre Channel to high-end storage. So our initial failed round of testing with the T2000s was largely blamed on NFS, because it was the obvious scapegoat. It wasn't until we hooked the T2000s up over Fibre Channel to the same NetApp configuration that we noticed the performance was still terribad, and only marginally better than our current 5+ year old Sun refrigerators.
After some serious digging, NetApp and Sun engineer involvement, and theorycrafting (more like "ok now what the hell are we supposed to do?"), I started planting the VMware bug.
"No way."
"There's no way VMware can handle the workload."
"Are you kidding me?"
etc.
So I sat quietly as we went on to the next Sun solution. This time we did some serious homework and looked at the Sun M4000: different architecture, built to run high-end databases, etc.
At the same time, in another corner of the datacenter, I took it upon myself to build a Linux VM. Working with one of the DBAs, we installed a copy of Oracle 11gR1 and migrated a copy of our production database to it. We divvied up some volumes on the NetApp to host the data and mounted them via NFS directly in the VM. We also configured Oracle's new Direct NFS (dNFS) client, which lets the database talk to the storage directly, bypassing the kernel's NFS client.
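For anyone who hasn't seen Direct NFS before: it picks up its mount information from an oranfstab file, falling back to the OS mount table if there isn't one. Here's a rough Python sketch that prints a stanza in that layout. The filer name, IP address, and volume paths are made up for illustration and are not our actual configuration; only the server/path/export/mount keywords are taken from Oracle's documented oranfstab format, so check the docs for your release before relying on it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class NfsServer:
    """One filer as Direct NFS would see it (all values hypothetical)."""
    name: str                       # label for the filer
    paths: List[str]                # filer addresses reachable from the DB host
    exports: List[Tuple[str, str]]  # (export on the filer, mount point on the DB host)


def render_oranfstab(server: NfsServer) -> str:
    """Render a single oranfstab-style stanza as text."""
    lines = [f"server: {server.name}"]
    lines += [f"path: {p}" for p in server.paths]
    lines += [f"export: {exp} mount: {mnt}" for exp, mnt in server.exports]
    return "\n".join(lines) + "\n"


# Hypothetical single-path setup: one filer address, two volumes.
filer = NfsServer(
    name="filer01",
    paths=["192.0.2.10"],
    exports=[("/vol/oradata", "/u02/oradata"), ("/vol/oralog", "/u03/oralog")],
)
stanza = render_oranfstab(filer)
print(stanza)
```

A single `path:` line matches the single-path test setup described here; listing additional paths is how a stanza would describe more than one route to the same filer.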
The results... well, they were nothing short of shocking. The Linux VM running 11g completely outperformed the M4000 with a 1:1 hardware configuration. Mind you, this was over a single-path 1GbE connection to a NetApp filer that was already heavily taxed hosting tons of CIFS shares and serving NFS ops to VMware. (Testing was conducted using Swingbench and Real Application Testing.)
So, where do we go from here? We had a meeting. Everyone was excited/shocked/appalled by the results ("How the hell did a little VM run circles around the boxes that launch the space shuttle?!"). We all put our hands in the middle of the table, said "GO VMWARE!" and off we went, with one purpose in mind: virtualize Oracle.
We are currently building out a production-level environment, and once I'm cleared to discuss it further, look for a follow-up post on the final architecture.
We are definitely excited. Not so much for the consolidation VMware brings to the table, but for the agility and easy resilience that things like vMotion, HA, and eventually FT provide, especially for big tier 1 apps such as Oracle OLTP databases.
Stay tuned.
-Nick