Storage Short Take #3

With a collection of various storage-related links from the past few weeks, here’s Storage Short Take #3!

  • There’s a “blog war” brewing between EMC, HP, and NetApp regarding storage virtualization. EMC’s Chuck Hollis seems to have started it with this blog entry in which he claims that storage arrays that provide storage virtualization—like the HP EVA and the NetApp FAS series—have inherent performance problems due to data placement. NetApp fired back in the comments to Chuck’s article as well as with a blog entry here; HP also responded, available here. Naturally, NetApp and HP both indicated that Chuck was wrong in his assumptions, and that it was EMC who would be facing problems soon. It’s an interesting discussion, but one that is way above the likes of mere mortals such as myself. Anyone else care to weigh in?
  • And while we are discussing storage blog wars, HP is stepping up the attack against NetApp’s deduplication functionality. Good luck with that one, HP. NetApp is eating other storage vendors’ lunches in VDI deployments because of deduplication. Regardless of whether NetApp’s implementation is right or wrong, the door has been opened for deduplication on primary storage. Customers want it. It’s up to the rest of the storage community to figure out how to compete with NetApp. If their approach truly is wrong, then prove it. Build the right implementation.
  • I had an interesting talk with some guys from Xiotech a few days ago regarding their Emprise storage arrays, built on their Intelligent Storage Element (ISE) technology. I won’t go into any great detail, because honestly I don’t know how much of what they shared with me is confidential (probably none of it, but I’d rather play it safe). I will say this: if half of what they say is true, this is a really compelling solution. As I get more information, I’ll share it here.
  • I don’t really know if this is more virtualization or storage, but a few weeks ago EMC’s Chad Sakac blogged about EMC’s ability to quickly provision VMs based on writable pointer-based snapshots. This is pretty cool stuff; I’ve blogged about NetApp’s similar technologies in the past as well. Unfortunately, both of these technologies suffer from the same fate: a lack of direct integration with VMware Infrastructure. EMC is in the potentially beneficial position of…well, owning VMware. We’ll see if that turns into an advantage or not.
  • Mario Apicella of InfoWorld thinks that VMotion and FCoE are a match made in admin heaven. What do you think?
  • I also saw some information on the Atrato Velocity1000 storage array, featuring their “Self-maintaining Array of Independent Disks” (SAID) technology. This looks remarkably similar to Xiotech’s technology. Both of them are claiming numerous U.S. patents that are unique to their products. Can you say “patent lawsuit”?

That wraps up this installment of Storage Short Takes. Feel free to add any interesting news, tips, tricks, or other information in the comments.


Quick two (really four) cents Scott:

1) thanks for pointing out the post I did on our VDI solution - there are also some more recent ones on my blog that readers might enjoy. Re: integration with VI - we integrate with VDM pool object creation/deletion/assignment (and VI registration/deletion along with AD object handling) right now, but the APIs haven’t been there for EMC (or NetApp - 3PAR also recently joined this game, other players aren’t really looking at the VDI use case specifically) to get tighter - we’ve both been lobbying pretty hard to get more in future incarnations of VDM and VI. This is an emerging area, with lots of fun stuff. Come by VMworld or the Virtual Congress in the UK to see the latest, of course.

2) On Chuck’s blog, I’m not endorsing it in any way, but do want to make a quick correction - it’s not about performance, but rather utilization. I **think** Chuck’s point (although like some in the blogosphere, he relishes kicking the hornet’s nest) was that each vendor is intrinsically so different that you can’t ask for “give me a quote for an HA system with 120 15K RPM drives” and then compare them. Performance, featureset and utilization (as well as how these all vary as a function of one another) vary from vendor to vendor. You’re better off asking the Systems Engineer from each vendor to design a configuration for you, and describe how the features can benefit you. Apples to apples is near impossible (if you read the comments on the thread - Steve Foskett seems to have grokked this).

3) Personally, I try not to attack NetApp’s Dedupe (formerly known as A-SIS). Being negative about others’ strengths is a silly game. Each vendor does some things better than the others, and those advantages last for a period until the features become standard. I sent an email to Dave Hitz (NetApp co-founder and CTO) the day they released A-SIS: “buy the engineer who came up with that one a steak and a bottle of wine, we’re going to be competing on that topic alone for a year”. It’s a great feature, and count on it (like RAID-6, or thin provisioning) to become ubiquitous in a couple of years. Unfortunately, it’s easy to pull a “these are not the droids you’re looking for…” move when customers look for feature comparisons (i.e. compare checklists); the real question is what configuration will give you what you NEED with the lowest TCO, and more importantly fits your in-house skillset (or hopefully the vendor includes training).

What I DO think is that what customers fundamentally want (i.e. this is the ideal design) is pre-process, real-time dedupe (i.e. you put stuff in and it’s deduped), not post-process batch dedupe (i.e. you put stuff in, and periodically you run a dedupe process against the “container”). Of course, nothing is free, and dedupe needs CPU, limits throughput, and adds latency, so pre-process (I think of this like a “shredder”) is currently not technologically possible for production storage. All dedupe methods share the idea of hash functions against data - and in spite of crazy dedupe claims, they all have roughly the same dedupe ratio - which is a function not so much of technology, but rather commonality, size/variability of dedupe “unit”, and domain of comparison (i.e. how “wide” you’re comparing the data for deduplication).
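To make the three factors above concrete, here is a minimal Python sketch of hash-based dedupe. It is not any vendor's implementation: the fixed-size unit, the SHA-256 hash, the `dedupe_ratio` function, and the toy volumes are all assumptions chosen for illustration. It only shows how commonality, unit size, and the domain of comparison move the ratio.

```python
import hashlib

def dedupe_ratio(volumes, unit_size, shared_domain=True):
    """Logical bytes divided by unique bytes kept, using fixed-size units.

    volumes       -- list of bytes objects (hypothetical stand-ins for volumes)
    unit_size     -- size of the dedupe "unit" in bytes (an assumption)
    shared_domain -- True: compare units across all volumes;
                     False: each volume is its own comparison domain
    """
    logical = 0
    physical = 0
    seen = set()
    for volume in volumes:
        if not shared_domain:
            seen = set()  # narrow domain: forget hashes from other volumes
        for offset in range(0, len(volume), unit_size):
            unit = volume[offset:offset + unit_size]
            logical += len(unit)
            digest = hashlib.sha256(unit).digest()
            if digest not in seen:  # first time this content is seen
                seen.add(digest)
                physical += len(unit)
    return logical / physical if physical else 1.0

# Two toy volumes that share one 4 KB region of identical content.
vol_a = b"A" * 4096 + b"B" * 4096
vol_b = b"A" * 4096 + b"C" * 4096

wide_small_units = dedupe_ratio([vol_a, vol_b], 4096)
narrow_small_units = dedupe_ratio([vol_a, vol_b], 4096, shared_domain=False)
wide_large_units = dedupe_ratio([vol_a, vol_b], 8192)
```

With the same data, only the wide domain with 4 KB units finds the shared region; narrowing the domain to one volume, or doubling the unit size, hides it.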

NetApp’s way is an interesting alternative that avoids the performance impact on production you’d get by running it in real time. Instead, defer the dedupe, and run it periodically as a batch against a FlexVol. EMC also does dedupe today (both source based, and target based, and pre-process and post-process) with Avamar, Avamar Virtual Edition, and DL3D, but we haven’t cracked production storage dedupe yet. The downsides of the real-time shredder mechanisms are acceptable with backup targets.
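The trade-off between the two approaches can be sketched in a few lines of Python. This is a toy model under loud assumptions: whole writes stand in for dedupe units, the `simulate` function and its return values are hypothetical, and it says nothing about how A-SIS or Avamar actually work internally. It only counts where the hashing happens and how much has to be stored before the dedupe runs.

```python
import hashlib

def simulate(writes, inline):
    """Return (peak_units_stored, final_units_stored, hashes_on_write_path)."""
    stored = []                # units physically kept
    index = set()              # hashes of units already kept (inline only)
    peak = 0
    hashes_on_write_path = 0
    for unit in writes:
        if inline:
            # Pre-process: every write pays for a hash before it lands.
            hashes_on_write_path += 1
            digest = hashlib.sha256(unit).digest()
            if digest in index:
                continue       # duplicate, nothing new is stored
            index.add(digest)
        stored.append(unit)
        peak = max(peak, len(stored))
    if not inline:
        # Post-process: a later batch pass collapses the duplicates.
        unique = {}
        for unit in stored:
            unique.setdefault(hashlib.sha256(unit).digest(), unit)
        stored = list(unique.values())
    return peak, len(stored), hashes_on_write_path

writes = [b"a", b"b", b"a", b"a"]
inline_result = simulate(writes, inline=True)
post_process_result = simulate(writes, inline=False)
```

Both runs end with the same two unique units. The inline run hashes on every write; the post-process run keeps hashing off the write path but has to hold all four units until the batch pass runs.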

4) Hope to see you at VMworld!

Hi Chad! One quick clarification:

The blog post from Chuck to which I was referring wasn’t the newer post that caught the attention of The Register:

http://www.theregister.co.uk/2008/08/29/emc-blogger-attacks-netapp-and-hp/

Rather, it was an older post talking about data placement (I think the title was “Data Placement Debate” or something similar). True, there is another “blog war” about the utilization post, but that’s another story for another day. ;-)

Oh, and I’ll definitely look you up in Vegas. Let me know if you want to get together for dinner or something…my wife and I have a few nights still open.

Thanks!

Ah - gotcha - didn’t follow the link.

Here’s my two cents on the “data placement debate”.

1) A very specialized human can always out-engineer “automatic” and “pool based” layout schemes.

2) Some relatively rare cases require very specific envelopes (performance, scale, availability, recoverability) and these are well served by very specialized humans. While these situations are rare, often they demand a premium because of the use cases.

3) Most customers don’t have extreme specialist storage folks.

4) The general cases (i.e. what most customers need) are well served by big “easy” buttons.

What I would do if I were a customer is ask to see easy first, then ask to see how I could get the platform to do something specific - but only if you REALLY think you’re going to need it.

For example, when you configure a NetApp filer, you can leave the RAID config to defaults when you create aggregates (the easy button), or you can specify the RAID groups if you want. Likewise on a CLARiiON, you can use the wizards and let them auto-select the config, or specify what you want. Ditto with a Celerra - let it use Automated Volume Manager and storage pools, or create your own dvols/meta/stripes. What IS interesting is that the NetApp and EMC designs are roughly from the same era. Newer models (think EqualLogic, Pillar, Compellent, LeftHand, or XIV for example) have an extreme pool philosophy. There is no clear “put this here, and don’t let stuff contend for the resources” mechanism. They fall very squarely into “let us figure out where to place stuff”. An iSCSI startup I used to work for built an automated layout algorithm, and I gotta tell you - it’s very complex stuff to automate what an intelligent human would do… you end up having to make some assumptions.

Interesting to see what customers think about this model. EMC and NetApp could further hide the “advanced tab”, but I feel warm and fuzzy knowing the choices are there.

I’ll quote Curtis Preston, writing about dedupe on backupcentral.com. He has done a tremendous amount of work on backup over the years, has written several books, and struck me as a very sharp guy in an encounter I had with him a few years back.

“If your backups aren’t slowed down, and you don’t run out of hours in the day, does it matter which method you chose? I don’t think so. Which is why I think this argument is kind of pointless. What matters is whether or not it works for you. Test anything with your biggest workload and see how it does.

If the device you buy meets your requirements, who cares what’s under the hood? I don’t think you should have to think about in-line vs post-process. I think you should care about how big it is, how fast it is, and how much it costs. (Remember that cost comes from many hours well beyond acquisition/depreciation cost. You need to factor in how easy the product is to install and configure with your backup software, as well as how easy it is to manage things like lost drives, management growth, etc.)”

Well said, Nick, well said. BTW - inline vs. post-process is a totally debatable topic, and a great fit for the NetApp ASIS design - I expect it was the original design target.

I was referring to “production dedupe”, which was the topic of Scott’s original post.

In the same way that there are bad EMC folks who trivialize understanding how our stuff works in the hopes of quickly eliminating competition and winning, in the last couple of months I’ve run into too many bad NetApp folks who do the same (I would expect the ratio of bad/good is even across all the major vendors, and more slanted to good than bad).

Neither has the customer’s interest in mind.

Specifically, what I’ve heard said is “I can buy 10% as much infrastructure for production if I do post-process”. This could potentially be true immediately, at time of acquisition, with in-line for production (which no one can do right now), and may perhaps be possible over the long term (over several acquisition cycles) with post-process for production. Sadly, in the spirit of winning a deal, they neglect to mention that inline and post-process are different. Post-process requires that you can store the pre-processed data, and that you have the operational process to do the dedupe.

Note that post-process and in-line CAN defer an initial acquisition cost for the backup use case; I’m calling out the difference on the production side.

This is the 2009 equivalent of “how much capacity do you need” vs. “how many IOPS do you need”.

Hope to see you at VMworld!

I tend to disagree on the similarities between Atrato’s V1000 and the Xiotech Emprise. They use differing technology to solve the same growing issues in the storage industry. The solutions look similar at face value, while the approach behind the technology is very different. People want quick access, no bottlenecks, scalability, and a low TCO. I fully expect other storage companies to follow suit in the near future. Demand for constantly streaming data is only going to grow. If you consider the fascination with YouTube, MySpace, Facebook, IPTV, and a growing Web 2.0 sector, you can see that uninterrupted access will begin to rival the need for storage capacity. Both solutions address these issues by creating high density designs that reduce friction and cooling requirements, while providing security and cost savings. Atrato does this with a patented herringbone alignment of small form factor disks, while Xiotech does not. Xiotech is pretty hush about their architecture, but if you take a look at their website they provide a diagram on their specs page. You can clearly see that the arrangement is not in a V alignment. Atrato uses a completely different fail-over system than Xiotech. Not to mention that Atrato’s FDIR, Self-Healing, and mirroring are the brainchild of aeronautic science, whereas Xiotech’s are not. This is not a new argument by any means. Atrato has had analysts looking at the similarity, and a UK publication actually did a piece on it. I will see if I can pull that up for you. In any case, the consensus has been that the architecture of both machines is unique. If any patent infringement has occurred it would be on Xiotech’s end, as the V1000 was released before ISE was even announced. If anyone is curious, Xiotech has a video on their webpage that walks you through the capabilities of the Emprise. If you compare that to Atrato’s white papers, you can begin to see the difference in technology.
It may also be important to note that the V1000 and the Emprise are not really an apples to apples comparison. Xiotech is a caching technology, while Atrato is strictly random read-writes. So we are talking traditional IOPS vs. random IOPS. This difference dictates that the companies play in slightly different arenas. Overall both are very compelling technologies, and worth a second look.

k.annmarie,

I’m sure that anyone looking at the products in more depth, as you appear to have done, would find numerous differences between the two. Thanks for your insight!