Thinking Out Loud: The Future of VLANs

It’s interesting to me how much the idea of a VLAN has invaded the consciousness of data center IT professionals. Data center folks primarily tasked with managing compute workloads are nearly as familiar with VLANs as their colleagues primarily tasked with managing network connectivity. However, as networking undergoes a transformation at the hands of SDN, NFV, and network virtualization, what will happen to the VLAN?

The ubiquity of the VLAN is due, I think, to the fact that it serves as a reasonable “common ground” for both compute-focused and networking-focused professionals. Need a logical container for new workloads? We can use a VLAN for that. VMware is partially to blame for this—vSphere (and its predecessors) made it incredibly easy to use VLANs as a way of logically “partitioning” compute workloads on the same host. (To be fair, it was really the only tool available to accomplish the task at the time.)

Normally, finding a “common ground” is a good thing…until that common ground starts to get pushed beyond where it was intended to be used. I think this is where VLANs are now—getting pushed beyond where they were intended to be used, and that strain is the source of some discord between the compute-centric teams and the networking-centric teams:

  • The compute-centric teams need a logical container by which they can group workloads that might potentially run anywhere in the data center.
  • The networking-centric teams, though, recognize the challenges inherent in taking a VLAN (i.e., a single broadcast domain) and stretching it across a bunch of different top-of-rack (ToR) switches so that it’s available to any compute host. (The irony here is that we’re using a tool designed to break down broadcast domains—VLANs—and building large broadcast domains with them.)

What’s needed here is a separation, or layering, of functions. The compute-centric world needs some sort of logical identifier that can be used to group or identify traffic. However, this logical identifier needs to be separate and distinct from the identifier/tag/mark that the networking-centric folks need to use to build scalable networks with reasonably-sized broadcast domains. This is, in my view, one of the core functions of a network encapsulation protocol like STT, VXLAN, or NVGRE when used in the context of a network virtualization solution. Note that qualification—a network encapsulation protocol alone is like the suspension on a car: useful, but only in the context of the complete package. When a network encapsulation protocol is used in the context of the complete package (a network virtualization solution), the network encapsulation protocol can supply the logical identifier the compute-centric teams need (this would be VXLAN’s 24-bit VNI, or STT’s 64-bit Context ID) while simultaneously allowing the network-centric teams to use VLANs as they see fit for the best performance and resiliency of the network.
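To make that layering concrete, here is a minimal sketch in Python (illustrative only, not a complete RFC 7348 implementation) of how a VXLAN header carries the 24-bit VNI, the compute side’s logical identifier, entirely independently of whatever VLANs the network team runs in the underlay:

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header defined in RFC 7348.

    The 24-bit VNI is the logical identifier the compute side cares
    about; it rides inside a UDP packet, so the physical network only
    ever sees ordinary IP/UDP traffic plus whatever VLANs the network
    team chose for the underlay.
    """
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08000000            # "I" flag set: the VNI field is valid
    return struct.pack("!II", flags, vni << 8)  # VNI sits above 8 reserved bits

# A 12-bit VLAN ID yields 4,094 usable segments (0 and 4095 are
# reserved); a 24-bit VNI yields roughly 16 million.
print(2**12 - 2)                 # 4094
print(2**24)                     # 16777216
print(vxlan_header(5000).hex())  # '0800000000138800'
```

The point of the sketch is the decoupling: nothing in that header constrains, or is constrained by, the 802.1Q tags the network team may use underneath.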

<aside>By the way, while using a network encapsulation protocol in the context of a network virtualization solution provides the decoupling that is so important to innovation, it’s important to note that decoupling does not equal loss of visibility. But that’s a topic for another post…</aside>

The end result is that compute-centric teams can create logical groupings that are not dependent on the configuration of the underlying network. Need a new logical grouping for some new line-of-business application? No problem, create it and start turning up workloads. Meanwhile, the networking-centric teams are free to design the network for optimal performance, resiliency, and cost-effectiveness without having to take the compute-centric team’s logical groups into consideration. Need to use a routed L3 architecture between pods/racks/ToR switches using VLANs? No problem—build it the way it needs to be built, and the network virtualization solution will handle creating the compute-centric logical grouping.

At least, that’s my thought. All my Thinking Out Loud posts are just that—me thinking out loud, providing a springboard to more conversation. What do you think? Am I mistaken? Feel free to speak up in the comments below. Courteous comments (with vendor disclosures where applicable, please) are always welcome.

  1. TJ

    I like your “thinking out loud” posts – very thought-provoking. :)

    I’d vote to add “MAC Routing” options like PBB-EVPN to the discussion … and maybe muddy the waters with L3 routing being used to build L2 paths (TRILL, OTV) …


  2. Mike Laverick

    Right on the money. I was speaking to a client earlier this year who bemoaned the number of VLANs they had and the time it took to provision new networks. It’s almost as if each new system demands its own VLAN, just like when every new application demanded a new set of physical servers.

    So I suggested to the customer that there was a way of having VLAN functionality without the penalty of VLAN configuration. Suddenly they bristled. “But VLANs are the bedrock of our security,” they exclaimed.

    I suspect folks will hang on to the VLAN comfort blanket much like they held on to physical servers. I like the mention of irony – here’s another one: in a world of technological change there’s nothing more conservative than infrastructure – perhaps because we recognise that everything stands (or fails) on that infrastructure…

  3. Travis Kensil

    I definitely agree that many of the new networking technologies coming out will change the landscape, especially in IT shops with distinct compute/network silos. I think the challenge will be adoption. It took years before most IT shops had the equipment and knowledge to successfully deploy VLANs; now the VLAN trend has caught on and you see much more widespread use. These new technologies will probably be embraced immediately by large enterprises and service providers, but won’t see much other use until the hardware and knowledge in the shops catch up, and by then something new will be out. I also think VLANs will be around for a long time yet, as some of the challenges these new solutions address are not necessarily challenges some enterprises are facing. Just my $0.02 – good post, though!

  4. Larry Orloff

    While I agree with you conceptually, in reality it’ll be a slow move. Additionally, many orgs will never do it, for many reasons, the first being cost. Beyond that, many orgs won’t have a need for such things. It’s similar to VPLEX in the sense that it’s a great technology, but its use cases are limited by actual customer need versus perceived need. Large orgs and hosting providers will take full advantage of these technologies, but mid-sized to smaller ones won’t even pay them any mind.
    The larger piece is in regard to the infrastructure and who manages it. Unless you have progressive people in charge at the top and controlling the infrastructure, the status quo will always stay. IT admins fear change, especially in larger orgs. Think how hard it is to get “startup” technologies into them.
    It’s a great idea with many advantages, but the true adoption won’t match the buzz.

  5. Mimmus

    I don’t see the need, because compute-centric folks have to ask for a new VLAN anyway.
    Probably because in my company the network- and compute-focused folks are the same. :-)

  6. slowe

    TJ, I’m curious—how would you see technologies like TRILL and OTV changing the way VLANs are used? PBB (and SPB, I suppose), with the idea of “double tagging” could have a similar impact, but TRILL and OTV aren’t as clear to me. How do these technologies decouple the logical container for compute from the use of VLANs on the underlying physical network?

    Mike, Travis, Larry: I’m grouping all of you together here because you all seemed to touch on the same topic—adoption rate. I agree that the adoption rate of technologies like network virtualization will probably be slow at first, but how is that different from the adoption rate of compute virtualization (which was also slow at first)?

    Mimmus, as long as you don’t start arguing with yourself then you should be in pretty good shape. :-)

  7. Mike Laverick

    Yep, I agree with you on the adoption point. But that said, for a technology as disruptive as compute virtualization, its run-rate has been exceptional: from almost nothing in 2003/4 to being the de facto way of doing things by the end of the decade. Maybe that’s because server virtualization was such a no-brainer, and the TCO/ROI was so obvious to people – the barriers were things like “will it perform?” or “will my application owner let me?”. It’s interesting to note that other spurs from server virtualization – desktop virtualization and cloud – haven’t experienced anything like the same adoption rate.

    Perhaps it’s a bit like the storage market – unless you can demonstrate 10x cost savings or 10x performance improvements, adoption rates are going to run up against fear of the unknown and hostility to change…

    So why hasn’t desktop virtualization or cloud taken off as quickly? Perhaps it’s about usage cases. Folks have to see a really compelling, no-way-Sherlock reason to change. If they can’t see the use case or the benefits, they will stay with what they have…

    I’m not saying this is a good thing, or how it should be – but perhaps it’s a reality that needs to be acknowledged?

  8. Travis Kensil

    Good conversation going here, for sure! Personally, I think beyond adoption the issue is really about problem solving: do these new networking technologies solve issues you are experiencing today in your business, and are the costs justified?

    I think compute virtualization is a bit different, because whether you are a small or big company you could understand the benefits, and it clearly addressed many issues with server sprawl, configuration management, power/cooling, deployment times, etc. It basically provided a common solution to common issues across all business types and sizes.

    Now with network virtualization, though, some of the issues being addressed are really only issues a large enterprise or service provider might experience: stretching networks across multiple data centers, for example. The average small/medium or perhaps even large business doesn’t really have those types of challenges or concerns, and the VLAN world satisfies their requirements and budget.

    I think network virtualization will have to mature, develop some standards, and become more affordable before widespread use occurs, and even then it’s not guaranteed. I think for most businesses the entry cost and knowledge required to use these kinds of technologies will be a deal breaker for a while, especially for those that have only recently gotten into VLANs. VLANs are so popular now because they’re cheap, relatively easy to set up and deploy, and provide reasonable levels of logical separation; I don’t think network virtualization can claim the same, and as long as it can’t, I think adoption outside of large enterprises and service providers will be minimal.

  9. TJ

    (First – Dang, I mixed OTV and FabricPath – meant to have TRILL and FabricPath grouped together. OTV goes more with PBB-EVPN.)

    As I said, TRILL/FabricPath was mentioned solely to muddy the waters in terms of our networks undergoing transformations at L2 – killing STP is almost a religious topic in some environments. And it may be an enabler for the more flexible/capable networks we need moving forward.

    PBB-EVPN (and SPB) are directly related to VLAN stretching and may change the way we view VLANs; whether that is a good idea or not will vary wildly with the implementation & specifics.

    Oh, and I think Mike L is correct – fear and trepidation rules until motivation rises to a breaking point. (Good analogy for IPv6 there …)

  10. Josh

    Also thinking out loud… It seems that the VLAN can, and currently does, do the job of both compute and network isolation. The current implementation just doesn’t have the address space for some larger organizations, which is why they’re feeling the strain. It seems a lot of the mentality for addressing this limitation is to create another layer with another protocol/standard via tags or encapsulation. While this is very powerful and flexible, it also adds to the complexity of the data center.

    Speaking strictly in theory, what about extending the VLAN tag address space to something like that of VXLAN or STT (VLAN 2.0?)? Or even using VXLAN/STT in place of the current tag?

    I would think this would provide both the compute and network containerization (is that a word?) in a single layer of abstraction. This would also be easier for both compute and network IT folks to understand and implement, and may even lead to speedier adoption. A single implementation would also simplify security, as VLAN-based separation is already the foundation of many, if not most, companies’ security strategies.

    Obviously, these would be limited by market adoption, value prop, ROI, etc., as mentioned before. Like I said, just thinking out loud. I’d love to hear any thoughts on this.

  11. Dave Walker

    Good article. VLANs were originally intended to segment networks so as to cut down on wide broadcast chatter, as you say, so it’s fair to say they’ve been used for purposes other than their intended one for some considerable time ;-). Also, I remember there was a flurry of activity around 2002 involving research into crossing VLANs by using BPDUs as (effectively) a covert channel, which is where PVSTP came from. (I agree with other commenters that STP is a horror – in a world where SDN meant something else entirely (so, pre-2010), give me EAPS or SMLT any day.)

    It’s also interesting that you pick up particularly on VMware and their use of VLANs, as it’s a little-known fact that – when I did the research, anyway – the only network switch of any kind, from any vendor, that I could find which has any degree of VLAN separation assurance requirement within the scope of its Common Criteria Target of Evaluation is… the vSwitch in ESX.

    VLANs have a double-edged sword to deal with: they’re intrinsic to Ethernet, embedded in the frame header itself. While they’re easy to handle when you’re on Ethernet, translating them to other transport protocols gets to be fun. As the train of thought goes here, when it comes to decoupling networking layers harder with things like SDN, it’s non-obvious where VLANs will best fit. There’s definitely more to discuss here.

  12. Jim

    What strikes me about all the dialogue associated with this topic is the skimming over of the following…

    “By the way, while using a network encapsulation protocol in the context of a network virtualization solution provides the decoupling that is so important to innovation, it’s important to note that decoupling does not equal loss of visibility. But that’s a topic for another post…”

    Overlay networks (decoupling) do not necessarily mean a loss of visibility, but absolutely lead to more complexity.

    You stated “What’s needed here is a separation, or layering, of functions. The compute-centric world needs some sort of logical identifier that can be used to group or identify traffic. However, this logical identifier needs to be separate and distinct from the identifier/tag/mark that the networking-centric folks need to use to build scalable networks with reasonably-sized broadcast domains. This is, in my view, one of the core functions of a network encapsulation protocol like STT, VXLAN, or NVGRE when used in the context of a network virtualization solution.”

    The requirement for broadcast does not go away; it instead needs to be managed in another way, i.e., emulated via multicast groups. Having separate environments that meet the needs of the two different groups (compute-centric and network-centric) adds layers of complexity to understanding where an issue lies within the network environment. Ultimately, someone will get the call at 2am when the DC environment is melting down. At that point, when attempting to peel back multiple layers of encapsulation and protocol manipulation, the tools need to be there… the expectations are too high for today’s DCs, and the more we layer on the abstraction, the less intuitive identifying a problem becomes. All the hype around overlay networks sounds great on paper (and fun in a lab), but when you talk about scaling them out to support 100s (maybe 1000s) of logical networks layered upon each other… it is going to be a handful to manage.

  13. Brian

    I am endlessly amused when I see the same problems appear again and again at scale boundaries. (Okay, so I amuse easily.) They’re the very same problems we saw as corporate networks appeared: the use of hosts files, then single-label domains, then fully qualified domain names, and now the “cloud,” where we bring all the attendant problems of the Internet as a whole inside our (theoretical) control. Wow!

    So crack open those old RFCs and engineering manuals. You’re going to need to understand the whys and wherefores again.

  14. slowe

    Josh, expanding the available address space for VLANs beyond the existing 12-bit space would be great, but it’s highly improbable. There are so many ASICs, software applications, middleboxes, etc., designed around that 12-bit VLAN tag that changing it would be a monumental effort, likely to create numerous interoperability and adoption challenges. It would be nice, though, wouldn’t it? :-)
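    To put numbers on that point, here’s a quick illustrative sketch (just putting the argument into code, not vendor guidance): the 802.1Q tag is a fixed 4-byte field with the VLAN ID squeezed into 12 bits, so widening the ID means redefining the Ethernet frame format all that silicon was built around.

```python
import struct

TPID = 0x8100  # 802.1Q tag protocol identifier (EtherType)

def dot1q_tag(vlan_id: int, pcp: int = 0) -> bytes:
    """Build the 4-byte 802.1Q tag: 16-bit TPID + 16-bit TCI.

    The TCI packs 3 priority bits (PCP), 1 DEI bit, and only 12 bits
    of VLAN ID; there is simply no room to grow the ID without
    changing the frame format itself.
    """
    if not 0 <= vlan_id < 2**12:
        raise ValueError("VLAN ID must fit in 12 bits")
    tci = (pcp << 13) | vlan_id   # DEI bit left at 0
    return struct.pack("!HH", TPID, tci)

print(dot1q_tag(100).hex())  # '81000064'
print(2**12)                 # 4096 possible IDs (two values reserved)
print(2**24)                 # the space a 24-bit VNI provides: 16777216
```

    That fixed layout is why the encapsulation approach adds a new identifier alongside the VLAN tag rather than trying to stretch the tag itself.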

    Jim, your key argument seems to be that the potential loss of visibility with an overlay network outweighs the advantages that an overlay network provides. That’s a common argument I hear. I’d encourage you to read some other pieces that have been written on this topic—you might try EtherealMind’s recent series (starts here) or have a look at Network Heresy’s latest post (found here; as a disclaimer, please note that I work for VMware and provided editorial feedback on that article). I’d love to hear your feedback or thoughts after reading those articles. Thanks!

    Brian, care to elaborate? :-)

