I’ve been promising a post on VPLEX for quite some time now, and now it’s finally here. Having spent some hands-on time with VPLEX this past week, I think I’m finally ready to discuss VPLEX in some detail.
The Basics
First, I need to cover the basics of what VPLEX is as well as what it isn’t. VPLEX is another step in delivering EMC’s vision of Virtual Storage and storage federation. There has been quite a bit of discussion over the difference between storage federation and storage virtualization (see here and here for two examples). Personally, I like the phrase that Joe Kelly used in a VPLEX post (emphasis mine):
This is the ability to create a consistent view of a volume, independent of its location. This is the core behind Storage Federation.
With this definition in mind, you can see that EMC has already delivered and will be delivering a number of technologies that support storage federation. Take sub-LUN FAST, for example; sub-LUN FAST presents a consistent view of a LUN regardless of the specific storage tier hosting the underlying blocks for that LUN. Blocks can (and will) be migrated automatically between tiers, yet the consistent view of the volume remains unchanged.
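To make the "consistent view" idea a little more concrete, here is a toy Python sketch of a volume whose blocks move between tiers while the host-facing view stays the same. The class, method, and tier names are all invented for illustration; this is a mental model only and says nothing about how FAST is actually implemented.

```python
class TieredVolume:
    """Toy volume: logical block address -> (tier, data). All names are hypothetical."""

    def __init__(self, blocks, tier="sata"):
        # Every block starts out on the same tier.
        self._blocks = {lba: (tier, data) for lba, data in enumerate(blocks)}

    def read(self, lba):
        # The host only ever addresses data by logical block, never by tier.
        return self._blocks[lba][1]

    def tier_of(self, lba):
        # Visible to the storage side only; the host has no need for this.
        return self._blocks[lba][0]

    def migrate(self, lba, new_tier):
        # Move one block to another tier; its address and contents do not change.
        _, data = self._blocks[lba]
        self._blocks[lba] = (new_tier, data)


vol = TieredVolume([b"boot", b"logs", b"db"])
before = [vol.read(lba) for lba in range(3)]
vol.migrate(2, "flash")  # pretend block 2 became hot and was promoted
after = [vol.read(lba) for lba in range(3)]
print(before == after, vol.tier_of(2))
```

The point of the sketch is simply that `read` never takes a tier argument: placement can change underneath without the consumer of the volume noticing.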
VPLEX accomplishes this definition of storage federation through in-band storage virtualization, which I personally think is why so many people are comparing it directly to IBM SVC, HDS USP-V, and NetApp V-Series. Yes, VPLEX does perform storage virtualization—but it’s storage virtualization as part of delivering storage federation.
So, what is VPLEX exactly, then? Using in-band storage virtualization, VPLEX acts as a scale-out cluster delivering both local (within a data center) storage federation and metro (between data centers at synchronous distances) storage federation. A single VPLEX cluster can scale up to four engines; each engine contains two directors. Each director is equipped with loads of RAM (64GB of RAM, if I recall correctly), eight front-end 8Gb Fibre Channel ports, and eight back-end 8Gb Fibre Channel ports. This means a four-engine cluster offers 512GB of cache, 64 front-end 8Gb Fibre Channel ports, and 64 back-end 8Gb Fibre Channel ports. A single VPLEX cluster can support up to 8,000 virtualized LUNs.
VPLEX clusters can be combined into a metro-plex to provide storage federation between two data centers at synchronous data replication distances (less than 100km today). At full scale, a metro-plex would consist of eight engines (sixteen directors), 1TB of cache, 128 front-end 8Gb Fibre Channel ports, and 128 back-end 8Gb Fibre Channel ports.
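For anyone who wants to check the math, the totals above fall straight out of the per-director figures. Here is a quick Python sketch; the constant and function names are mine, and the 64GB-per-director number carries the same "if I recall correctly" caveat as in the text.

```python
# Per-director figures as quoted in this post (the RAM figure is from memory).
DIRECTORS_PER_ENGINE = 2
RAM_GB_PER_DIRECTOR = 64
FE_PORTS_PER_DIRECTOR = 8
BE_PORTS_PER_DIRECTOR = 8


def totals(engines):
    """Return (directors, cache_gb, front_end_ports, back_end_ports)."""
    directors = engines * DIRECTORS_PER_ENGINE
    return (
        directors,
        directors * RAM_GB_PER_DIRECTOR,
        directors * FE_PORTS_PER_DIRECTOR,
        directors * BE_PORTS_PER_DIRECTOR,
    )


cluster = totals(4)  # one four-engine cluster
metro = totals(8)    # two four-engine clusters joined as a metro-plex
print(cluster)
print(metro)
```

Calling `totals` with a smaller engine count gives the figures for partially populated configurations under the same assumptions.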
In addition to understanding what VPLEX is, it’s also important to understand what VPLEX isn’t. It’s not a replacement for Invista, EMC’s out-of-band storage virtualization solution. It’s not a solution meant only for EMC arrays; VPLEX is also supported for non-EMC arrays, with support for more arrays in the works. And finally, it’s not a VMware-only solution; VPLEX fully supports physical instances of Windows Server, Linux, Solaris, AIX, and other operating systems.
Making it Real: Specifics in the Real World
If I were reading this post, I’d be asking myself right now, “OK, that’s all great and wonderful, but what does it really mean?” I’m glad you asked.
Storage federation as provided by VPLEX means that the storage managed by VPLEX is active, read-writable storage across the entire VPLEX cluster or metro-plex (remember that a metro-plex is a pair of VPLEX clusters separated by synchronous replication distances). This means that if you have a VPLEX Local configuration with 2 engines, all the storage managed by this VPLEX Local cluster is read-writable across the entire cluster. Similarly, if you have a VPLEX Metro configuration with 4 engines (2 in each site), you can have storage that is read-writable in both locations simultaneously.
Consider a traditional storage replication solution: data exists in Site A and the array replicates the data to Site B. While the data is present at both sites, it’s only writable at Site A. Site B is read-only. This is true of every replication solution of which I am aware on the market today. EMC’s own replication products—like SRDF or RecoverPoint—behave this way. Sure, there are workarounds to that limitation, like image access with RecoverPoint. In the end, though, these are workarounds to the underlying replication model. VPLEX breaks that model by allowing you to have writable storage in Site A and Site B at the same time. The same LUN visible in two sites at the same time, writable in both locations.
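The difference between the two models can be shown with a deliberately simplified Python sketch. Everything here (the class, the site labels, the single shared dictionary standing in for replicated data) is a made-up illustration; it ignores replication lag, cache coherence, and failure handling entirely, which is of course where the real engineering lives.

```python
class TwoSiteLun:
    """Toy LUN visible at two sites. All names here are hypothetical."""

    def __init__(self, writable_sites):
        self.writable_sites = set(writable_sites)
        # One dictionary stands in for data that is kept identical at both sites.
        self.blocks = {}

    def write(self, site, lba, data):
        if site not in self.writable_sites:
            raise PermissionError(f"LUN is read-only at site {site}")
        self.blocks[lba] = data

    def read(self, site, lba):
        # Either site can read; the site argument is kept only for symmetry.
        return self.blocks.get(lba)


# Traditional replication: writable at Site A only.
traditional = TwoSiteLun({"A"})
traditional.write("A", 0, "payroll")
try:
    traditional.write("B", 1, "orders")
    site_b_write_ok = True
except PermissionError:
    site_b_write_ok = False

# Federated model: the same LUN is writable at both sites.
federated = TwoSiteLun({"A", "B"})
federated.write("B", 1, "orders")
print(site_b_write_ok, federated.read("A", 1))
```

In the traditional case the write at Site B is refused; in the federated case a write made at Site B is readable from Site A, which is the behavior described above.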
Just think about that for a moment. You’ll need a clustered file system to take advantage of this underlying storage functionality, but imagine something like Windows Server with Sanbolic’s Melio FS to provide writable Windows LUNs in multiple sites at the same time. Of course, there’s also the VMware use case where VPLEX provides writable access to a VMFS datastore between multiple data centers. Talk about making the hybrid cloud a reality—consider the use of VPLEX Metro between your on-site data center and a vCloud provider’s data center. It would be the ultimate in workload mobility.
And those are just the VPLEX Metro examples. What about VPLEX Local? Ever had to migrate from one storage array to another storage array? Yes, you could use Storage vMotion. Or you could use VPLEX Local and not even have to get the VMware administrators involved—it would all happen under the covers. Think about being able to transparently migrate storage volumes among various arrays within your data center to meet the SLAs of the workload. Need Tier 1 storage? No problem, we’ll use VPLEX Local to transparently migrate you to a VMAX. Don’t need that level of performance or availability any more? No problem, we’ll use VPLEX Local again to transparently migrate to a midrange storage platform.
Want to really freak your brain out? Think about VPLEX with sub-LUN FAST integrated into it…
I have so much more about VPLEX to share, but in the interest of keeping this already long blog post from getting even longer, I’ll wrap it up here. Feel free to share your thoughts or questions about VPLEX in the comments below.
Tags: EMC, Storage, Virtualization, VPLEX
-
Scott, I’m surprised at the sloppiness in this post. You linked to previous posts including one of mine and then ignored their content – which is OK – before loosely paraphrasing something from Joe Kelly’s post in order to emphasize that you believe cache coherence is the core behind storage federation. That certainly puts you into the camp of EMC blogger – just in case anybody was wondering about your objectivity.
Cache coherence may be the core behind what EMC is trying to sell as storage federation, but that would be a company-specific engineering solution and not any sort of definition for the concept of federation. In fact, I’d say we already have a term for what you and Joe are talking about and that’s distributed write caching.
The definition you chose for federation: “the ability to create a consistent view of a volume, independent of its location” is far too broad to be useful. You even start out by talking about sub-lun tiering – which definitely should not be included in any definition of what storage federation is. Sub-lun tiering is a matter of virtualization within an array and may be done across arrays at some point if those arrays are federated, but it is important to make distinctions about these things or we’ll have people saying things like “virtualized federated pools of disks”, when they could just say RAID instead.
Federation is less about the presentation of volumes than about the group functionality provided by multiple arrays. There are many functions besides presenting LUNs that federation can aggregate or consolidate such as snapshot management and retention, remote copy (many will not want this to be done by distributed cache) and consolidated resource management. Tying federation to virtualization is seriously dumbing it down.
-
Scott, you’ve apparently never had comments from other EMC bloggers other than to cheer you on. I didn’t mean to be offensive, but I did intend to push your buttons. That’s life in the storage blogosphere as practiced by EMC bloggers. It’s part of the territory and if you are uncomfortable with it check with your comrades.
The thing I need to apologize for is not IDing myself. Bad me. Thanks for covering this for me. I’ll get back to you on the other points you raised about different perspectives and all that but I have to run out for a bit.
-
Continuing with my last comment…
The biggest proponent of excluding in-band virtualization was Barry Burke, aka The Storage Anarchist (EMC blogger). The argument about in-line virtualization is that you don’t want to call an address aggregator that maps volume space across “downstream targets” federation. THAT is virtualization, not federation.
Considering that Vplex’s main function appears to be distributed caching – as opposed to virtualizing downstream targets – I’d say that it probably does qualify for consideration as a storage federation appliance. As a marketing guy I really appreciate not wanting to sell this as distributed caching.
But there is the question of whether or not the federation function is intrinsic or extrinsic to the arrays. There are a lot of people that would say storage federation needs to be intrinsic to the array itself and not provided by external products that require a separate layer of management, as Vplex does. The argument I made in a blog post on Vplex was that minimal integration required to support 3rd party arrays demonstrates the lack of integration with arrays.
The definition you attributed to me was arrived at through previous online discussions, although I did summarize it as: “the transparent, dynamic and non-disruptive distribution of storage resources across self-governing, discrete, peer storage systems”
Yours and Joe’s differs considerably in significant ways: “The ability to create a consistent view of a volume, independent of its location.”
Storage resources are more than LUNs; they include things like snapshots, policies, system metadata and all the things a team of storage systems might use to be self-governing, peer storage systems. So, no, I don’t agree that our definitions are very close.
I would argue that if Vplex achieves storage federation it does it through distributed caching much more than through any storage virtualization functionality it provides. Distributed caching may provide a first level of storage federation much the same way that FAST1 provides the first level of tiering (full-volume, not sub-volume). There are still problems with Vplex being external entities requiring additional management, though.
I’m confused by your saying that you didn’t say distributed cache coherency was the core of storage federation. That’s the way it still reads to me after reading several times. I’m pretty sure that’s what Joe meant when he wrote: “…so something I forgot to mention in the previous post is the idea of Cache Coherence, which provides active-active data sharing. This is the ability to create a consistent view of a volume, independent of its location. This is the core behind Storage Federation.”
Definitions in this business are very important. There was a lot of discussion about this a couple months ago and much cynicism over how watered down the term storage federation would become. I know how confounded things can become having spent a few years of my life writing books about storage and trying to make sense out of the multiple, overlapping vendor terms and finding generic terms to describe concepts. I don’t believe that any definition someone comes up with is as good as any other.
So, you are now part of the discussion, whether or not you wanted to be – and that’s good. You are a smart guy and the fact that you work for EMC is not a problem. I agree and disagree with all of you from time to time and there are always customers to help keep us in line. These discussions can be direct and aren’t always going to be “polite first”, which is something that I’ve become accustomed to and you can too by growing a thicker skin.
-
Thanks Scott, I understand that you are an individual and not part of a tag team of bloggers at EMC. Nonetheless, the aggregate effect of EMC bloggers on competitors is like the good cop, bad cop treatment. You might be a good cop, but you are also now part of that bigger team and that is a change to your blogging context.
We will disagree. The result of distributing storage resources can be much more than creating a consistent view of a volume. There are many important storage management functions that are not involved with how systems access storage.
Thanks for the dialogue here.
-
Great discussion, much learned. Hope the following post can describe the extra functionalities, if there are any. For example, snapshotting? Or is it at the moment ‘just disks’?
-
Does VPLEX and storage federation work across vendor platforms?




Trackback link: http://blog.scottlowe.org/2010/06/07/a-deeper-look-at-vplex/trackback/