IBM’s XIV

22 08 2008

I finally got a chance to learn about XIV. I was dragged into an IBM product presentation recently, so I figured I would summarize here the one thing not covered by the NDA :)

What is XIV?

Essentially, it’s a disk storage device that uses only SATA drives but gets a high number of IO/s out of them by spreading the reads and writes across all disks. Every LUN you create is stretched across every disk in the array. Instead of using standard RAID to do this, XIV uses a non-standard algorithm of its own that accomplishes the same thing on a larger scale.

They build every system exactly the same way: each system contains a number of nodes of 12 drives each, with their own processors and memory. It’s all off-the-shelf hardware in a node: Pentium processors, regular RAM, and SATA drives. None of it is enterprise class on its own, but because of the distribution scheme they’ve worked out, you get the performance of all the drives for all your reads and writes.
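XIV hasn’t published its distribution algorithm, but the general idea of striping every LUN across every disk can be sketched with a simple hash-based placement function. This is my own illustration (the names, chunk size, and disk count are assumptions, not XIV internals):

```python
import hashlib

NUM_DISKS = 120  # illustrative array size, not an XIV spec

def disk_for_chunk(lun_id: int, chunk_index: int) -> int:
    """Map a (LUN, chunk) pair to a disk via a hash, so every LUN's
    chunks land roughly evenly across all disks in the array."""
    key = f"{lun_id}:{chunk_index}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % NUM_DISKS

# A single LUN's chunks end up spread over essentially the whole array:
placement = [disk_for_chunk(lun_id=7, chunk_index=i) for i in range(1000)]
```

Because placement is deterministic, any node can compute where a chunk lives without consulting a central map, which is one plausible way to make the rebalancing after adding a system automatic.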

Scalability comes from hooking new systems to old ones through the 10Gb switch interlink ports. They say that as newer interconnect technology becomes available, this will follow along (so eventually they will support InfiniBand). Also, when you add a system to the cluster, the rebalancing of data is automatic.

How is this different?

The big change here is in the way they put data on their disks. They’ve re-invented the wheel a bit, but for a reason: the performance you can get out of low-cost, low-end drives in parallel is very good. Normally I would never tell people that SATA is appropriate for databases or email, but XIV claims to be fast enough. I imagine we’ll see some benchmarks soon.

The first thing I asked about was parity space. XIV spreads parity info over the whole array, so with 120 1TB drives you get 80TB of addressable space. Also, because rebuilding a 1TB drive from parity is normally a very intensive operation that generates many reads across the RAID group, I asked how they handle rebuilds. They claim they can rebuild a 1TB drive from parity in about half an hour, because the parity data is read from all the other drives simultaneously.
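A quick back-of-the-envelope check makes the half-hour claim at least plausible. These are my numbers, not IBM’s:

```python
# Sanity-check the claimed 30-minute rebuild of a 1TB drive
# when the reads are spread over the rest of the array.
DRIVE_BYTES = 1_000_000_000_000   # 1 TB drive
REBUILD_SECONDS = 30 * 60         # claimed rebuild time
SURVIVING_DRIVES = 119            # drives sharing the rebuild reads

aggregate_mb_s = DRIVE_BYTES / REBUILD_SECONDS / 1e6
per_drive_mb_s = aggregate_mb_s / SURVIVING_DRIVES
print(f"{aggregate_mb_s:.0f} MB/s aggregate, ~{per_drive_mb_s:.1f} MB/s per drive")
# ~556 MB/s aggregate, ~4.7 MB/s per drive
```

A few MB/s of extra sequential load per drive is modest, though it says nothing about what the rebuild does to production latency.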

This sounds good, but I wonder whether a failure and rebuild will slow down your entire production environment instead of only the RAID group where the drive failed. Also, in the event of an entire 12-drive node failing, would that mean a six-hour rebuild that affects the whole production array? If they have some way of prioritizing production IO, then I am satisfied. I don’t know whether they do, though.

Snapshots

Normal “copy on write” snapshots create extra write traffic: every change to a snapshotted block requires an additional write that must be committed to disk before the acknowledgment is sent to the host. XIV uses a snapshot algorithm called “redirect on write” to avoid this penalty and allow larger numbers of readable/writable snapshots.

They create a snapshot LUN that initially points at the production data. When a change is made to the source, they write the new data to unused space and point the production LUN there, leaving the snapshot pointed at the old data. NetApp used a different algorithm to solve the same problems inherent in traditional “copy on write” snapshots, and that is part of what launched them into success in the enterprise storage market years ago.
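The redirect-on-write idea described above boils down to pointer tables into a shared block pool. Here’s a minimal sketch of the concept; this is my own toy model, not XIV’s implementation:

```python
# Toy model of "redirect on write" snapshots: the production LUN and each
# snapshot are just tables of pointers into a shared pool of blocks.

class Volume:
    def __init__(self, blocks):
        self.pool = list(blocks)                 # physical block storage
        self.map = list(range(len(blocks)))      # logical -> physical pointers
        self.snapshots = []

    def snapshot(self):
        # A snapshot is a copy of the pointer table; no data moves.
        snap = list(self.map)
        self.snapshots.append(snap)
        return snap

    def write(self, logical, data):
        # Redirect on write: put new data in fresh space and repoint the
        # production map, leaving snapshot pointers at the old block.
        # Exactly one write hits disk; nothing is copied first.
        self.pool.append(data)
        self.map[logical] = len(self.pool) - 1

    def read(self, logical, snap=None):
        table = snap if snap is not None else self.map
        return self.pool[table[logical]]

vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")
assert vol.read(1) == "B"        # production sees the new data
assert vol.read(1, snap) == "b"  # snapshot still sees the old data
```

Contrast with copy-on-write, where that same `write` would first have to copy the old block `"b"` somewhere for the snapshot before overwriting it in place: two writes instead of one.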

Other advanced features

The box is delivered with all functionality enabled, which is an interesting move considering every other vendor I’ve dealt with makes most of their money on software. They include mirroring, thin provisioning, and an unusual one-time migration style of virtualization that sits between the hosts and the old storage, reading all the data off it while continuing to pass the IO through transparently.

Questions

If someone from XIV (or more likely IBM) is reading this, I want to know more details about your mirroring and your workload prioritization:

  • Do you support synchronous, asynchronous, and asynchronous-with-consistency-group mirroring? What about one-to-one, one-to-many, and many-to-one configurations?
  • Do you have a way to prevent disk rebuilds from taking disk resources that are needed by production apps?




Follow up: Gizmodo reporter banned for prank.

14 01 2008

It sounds like the guy I wrote about last week, who was pranking salespeople at CES, has been banned for life.

Good.





Off topic: Gizmodo acts stupidly at CES

11 01 2008

OK, I know you are young. I know you are irreverent. I know you think you’re cooler than a bunch of old, stodgy salespeople at CES. That’s no excuse for being an ass-hat and pulling pranks on people who are just trying to do their jobs:

Click here to see Gizmodo’s account of how they “pwned” CES.

Sometimes I’m glad our industry is not as old and unexciting as, say, commodities brokerage. I like the fact that some of the most influential people in the technology industry are irreverent bloggers with a sense of humor. People like Arrington or Malik, however, have a basic sense of boundaries and decency, and would never stoop to annoying (admittedly uncool) salespeople to get a laugh from their audience. For shame!

There’s a reason I removed you insufferable tools from my RSS reader a few months ago. This is just another extension of it.





Off topic: VMWare IPO

13 08 2007

I’m going to digress from storage for a moment to discuss current events. In case you didn’t know, VMWare is going to become a public company tomorrow. I have been following this for several months as both an amateur investor and someone who deals with VMWare professionally, and I’ve been seeing lots of questions online about the IPO, so I figured I’d put together a quick post about some of the basics.

First, VMWare sells a software suite that allows multiple workloads to co-exist on the same Intel hardware. This is significant because Intel servers are normally dedicated to a single application each, and Intel hardware is getting more powerful faster than applications can grow their basic requirements. Other platforms (like Unix and mainframe) were built from the ground up to do more than one thing at a time, but Intel cut its teeth in the desktop market, so it did not inherit this quality. Now that Intel servers are powerful and reliable enough to be trusted with many important company applications, VMWare helps companies bring their average resource utilization from 10% up to 80% or higher by consolidating many light workloads onto the same physical machine.
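The consolidation arithmetic implied by those utilization numbers is simple. These figures are illustrative, not VMWare’s:

```python
# Rough consolidation math: going from ~10% utilization on dedicated
# one-app-per-box servers to ~80% on consolidated virtualization hosts.
util_before_pct = 10    # typical utilization of a dedicated server
util_after_pct = 80     # target utilization of a consolidated host
physical_servers = 100  # hypothetical fleet size

consolidation_ratio = util_after_pct // util_before_pct   # workloads per host
hosts_needed = -(-physical_servers // consolidation_ratio)  # ceiling division

print(consolidation_ratio, hosts_needed)  # 8 workloads per host, 13 hosts
```

Roughly an 8:1 reduction in physical boxes, which is the core of the pitch to customers.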

VMWare is owned by EMC, a prominent storage solutions company. VMWare has been growing by leaps and bounds, and EMC wants to ensure that their investors can clearly see this jewel in their crown. Thus, they have decided to spin off about 10% of the VMWare stock publicly. Recently, Intel and Cisco both stepped up to the plate to buy a piece of VMWare before it went public.

EMC bought VMWare in 2004 for a steal, but had to agree to keep their noses out of VMWare’s business. This is relevant to VMWare’s bottom line because the biggest competitive differentiator they have over the other virtualization solutions (like Xen and Microsoft) is that, thanks to their two-year head start, they have a massive list of solutions they’ve worked hard to ensure compatibility with (including some serious competitors to EMC). If your company uses a mainstream application and wants to run it under VMWare, chances are they’ve tested it and invested time and money into making sure it will work. Of course, they also have a head start on some of the niftier features, like the ability to move running applications from one server to another, but those features will eventually be standard across all virtualization platforms, while their partner ecosystem will still be years ahead of their competitors’.

I think this addresses some common questions I’ve seen about VMWare and this IPO, but if anyone needs clarification, this is a Q&A blog, so ask away.







