Thinking Out Loud: Why Not MPLS-in-IP?

As I was reviewing my list of actions in OmniFocus this morning, I saw an action I’d added a while back to review RFC 4023. RFC 4023 is the RFC that defines MPLS-in-IP and MPLS-in-GRE encapsulations, and was written in 2005. I thought, “Let me just read this real quick.” So I did.

It’s not a terribly long RFC (only about 13 pages or so), but I’ll attempt to summarize it here. (Networking gurus, feel free to correct my summary if I get it wrong.) The basic idea behind RFC 4023 is to allow two MPLS-capable LSRs (an LSR is a label switching router) that are adjacent to each other with regard to a Label Switched Path (LSP) to communicate over an intermediate IP network that does not support MPLS. Essentially, it tunnels MPLS inside IP (or GRE).
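To make the encapsulation a bit more concrete, here is a minimal Python sketch of what an MPLS-in-IP packet could look like on the wire: a plain IPv4 header whose protocol field is 137 (the IP protocol number assigned to MPLS-in-IP), followed by the MPLS label stack and then the original payload. The function names (`mpls_entry`, `ipv4_checksum`, `mpls_in_ip`), the addresses, and the label values are my own illustrative assumptions, not anything taken from the RFC, and the sketch ignores fragmentation, TTL handling, and everything else a real tunnel head would have to worry about.

```python
import socket
import struct

IPPROTO_MPLS_IN_IP = 137  # IP protocol number assigned to MPLS-in-IP


def mpls_entry(label, tc=0, bottom=True, ttl=64):
    """Pack one 32-bit label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    word = (label << 12) | ((tc & 0x7) << 9) | (int(bottom) << 8) | (ttl & 0xFF)
    return struct.pack("!I", word)


def ipv4_checksum(header):
    """Standard one's-complement sum over the 16-bit words of the header."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def mpls_in_ip(src, dst, labels, payload, ttl=64):
    """Wrap payload in a label stack, then in an IPv4 header (hypothetical helper)."""
    last = len(labels) - 1
    # Only the last entry in the stack carries the bottom-of-stack (S) bit.
    stack = b"".join(
        mpls_entry(label, bottom=(i == last)) for i, label in enumerate(labels)
    )
    body = stack + payload
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                    # version 4, header length 5 words
        0,                       # DSCP/ECN
        20 + len(body),          # total length
        0, 0,                    # identification, flags/fragment offset
        ttl,
        IPPROTO_MPLS_IN_IP,
        0,                       # checksum placeholder
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )
    csum = ipv4_checksum(header)
    header = header[:10] + struct.pack("!H", csum) + header[12:]
    return header + body


# Example: two labels (outer 100, inner 200) in front of a dummy payload.
packet = mpls_in_ip("10.0.0.1", "10.0.0.2", [100, 200], b"x")
```

The intermediate IP network only ever looks at the outer IPv4 header; the label stack is opaque payload until the packet reaches the far-end LSR.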

“OK,” you say. “So what?”

Well, here’s my line of thinking:

  1. Most networking experts seem to agree that we need to move away from the massive layer 2 networks we’re building to support virtualized applications in enterprise data centers. (Case in point: see this post from Ivan on a layer 2 network being a single failure domain.)

  2. Networking experts also seem to agree that the ideal solution is IP-based at layer 3. It’s ubiquitous and well understood.

  3. However, layer 3 alone doesn’t provide the necessary isolation and multi-tenancy features that we all believe are necessary for cloud environments.

  4. Therefore, we need to provide some sort of additional isolation but also maintain layer 3 connectivity. Hence the rise of protocols such as VXLAN and NVGRE, which isolate traffic using some sort of virtual network identifier (VNI) and wrap traffic inside IP (or GRE, as in the case of NVGRE).

It seems to me—and I freely admit that I could be mistaken, based on my limited knowledge of MPLS thus far—that MPLS-in-IP could accomplish the same thing: provide isolation between tenants and maintain layer 3 connectivity. Am I wrong? Why not build MPLS-in-IP endpoints (referred to in RFC 4023 as “tunnel head” and “tunnel tail”) directly into our virtualization hosts and build RFC 4023-style tunnels between them? Wouldn’t this solve the same problem that newer protocols such as VXLAN, NVGRE, and STT are attempting to solve, but with protocols that are already well-defined and understood?
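Here is a rough way to picture the idea, and one wrinkle that comes with it. In this sketch each hypervisor acts as a tunnel tail and hands out its own label per tenant, which is how MPLS labels normally work: they are chosen locally by the receiver, so a tunnel head has to learn a (remote host, tenant) to label binding rather than stamping on one global ID the way a VNI is used. The `TunnelTail` class, the tenant names, and the addresses are all hypothetical; the only real-world detail is that labels 0 through 15 are reserved, so allocation starts at 16.

```python
import itertools


class TunnelTail:
    """Hypothetical hypervisor endpoint that allocates its own labels."""

    def __init__(self, ip, first_label=16):  # labels 0-15 are reserved
        self.ip = ip
        self._next = itertools.count(first_label)
        self.label_for_tenant = {}
        self.tenant_for_label = {}

    def bind(self, tenant):
        """Allocate (or reuse) the local label this host uses for a tenant."""
        if tenant not in self.label_for_tenant:
            label = next(self._next)
            self.label_for_tenant[tenant] = label
            self.tenant_for_label[label] = tenant
        return self.label_for_tenant[tenant]

    def receive(self, label):
        """Map an incoming label back to the tenant it belongs to."""
        return self.tenant_for_label[label]


host_a = TunnelTail("192.0.2.1")
host_b = TunnelTail("192.0.2.2")

host_a.bind("blue")
host_b.bind("red")
host_b.bind("blue")

# What a tunnel head would need to learn: one binding per (remote host, tenant).
bindings = {
    (host.ip, tenant): label
    for host in (host_a, host_b)
    for tenant, label in host.label_for_tenant.items()
}
```

In this toy run the "blue" tenant ends up with a different label on each host, and the same label value means different tenants on different hosts. That is fine for forwarding, but it means some control plane has to distribute those bindings between hosts.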

Perhaps my understanding is incorrect. Help me understand—speak up in the comments!


  1. JC’s avatar

    It’d be the holy grail if the hypervisor switch supported MPLS natively. However, the response from vendors is the usual “MPLS is too complicated”.

  2. Richard’s avatar

    What is the benefit of MPLS compared to VXLAN, NVGRE, or STT?
    Are you proposing L2 over MPLS over IP?

  3. Jon Langemak’s avatar

    From my point of view… Encapsulating MPLS inside IP is rather redundant. If you are talking about using MPLS for separation, I assume you are talking about MPLS VPNs. And if you are talking about MPLS VPNs, you are talking about a tunneling protocol in itself. That is, you have two labels: an MPLS label and a VPN label. (Two instances of the same layer are considered a tunnel in my mind, and the Packet Pushers lads seem to agree.) So really, you are tunneling inside another tunnel. Seems to me that you could just tunnel once (layer 3 overlay for separation) and call it a day… Or am I misunderstanding what you are trying to accomplish?

  4. slowe’s avatar

    JC, the thing about “MPLS is too complicated” is that in this implementation, I think some of the complexities of MPLS could be hidden. Further, the MPLS would only run on the hypervisor softswitches, and would not require any MPLS support on the physical hardware in the data center. (If such support were present though, it would be great to be able to leverage it.)

    Richard, I guess my viewpoint is that MPLS is a proven, understood technology. Rather than re-inventing something new, why not re-use something that already exists?

    Jon, the presence of a label stack does not, in my humble opinion, render the solution redundant. Based on my understanding, we could use MPLS VPNs to provide either Layer 2 *or* Layer 3 service inside the VPNs, and by encapsulating MPLS in IP we eliminate the need for data center operators to rip/replace/upgrade all their equipment to support MPLS. Of course, as I mentioned, I’m not an MPLS/networking expert, so perhaps there’s a key consideration that I’m overlooking.

  5. Jon Langemak’s avatar

    Ah… Ok, so I was missing part of this. This…

    “Further, the MPLS would only run on the hypervisor softswitches, and would not require any MPLS support on the physical hardware in the data center.”

    Clarified your point for me. So I’m not a VMware guy (yet), but isn’t it true that most hypervisor switches are strictly layer 2? If that’s the case, I’m not sure you can implement what you are referring to. It seems that you are suggesting that the hypervisor switch become the ‘PE’ in a standard MPLS environment. Now, I will say at this point that most of my MPLS experience is carrier related, so I could be off, but bear with me. At some point in the path, you need to tell the device(s) how to get into a certain VRF, right? This is done in most cases by handing off an interface (or sub-interface) to a customer (a server in your case, I believe) and defining something like this on the interface: ‘ip vrf forwarding ’. The interfaces you define that on are, in my experience, layer 3 interfaces. In most cases (as far as I know) you don’t talk layer 3 until you get to a physical switch (northbound from the hypervisor). And when you do, that layer 3 interface is generally shared by many hosts, which would sort of defeat the purpose, wouldn’t it? You could technically trunk with sub-interfaces, but I’m not sure what that would buy you.

    Layer 2 VPNs in MPLS make sense for the virtual data center, but I’m not sure how they could be applied ‘gracefully’ without fully extending MPLS into the DC.

    Hope we are on the same page now. Again, I’m no VMware expert, so I could be wrong. I suppose it depends entirely on what you want to do and what features you want to support. The end point, though, is that I think the DC switches will need to support MPLS for this to work, since VRFs are a layer 3 concept.

    What do you think?

  6. Stefan’s avatar

    I would be interested to know if this is possible too.

    Maybe a cheeky email to the authors of the RFC could stir an answer?

    I particularly like the idea of being able to have a stretched network without the need for full MPLS. Although I thought MPLS was purely L3, so I’m not sure how it would benefit L2?

    Subscribed for follow up comments :)

  7. slowe’s avatar

    Ivan, thanks for your post—very informative, as usual! I keep forgetting that MPLS labels only have local significance. Would the use of an MPLS label stack help address that concern? In other words, the outer MPLS label has local significance but the inner MPLS label is more of a VNI? Or does that still go against the local significance of MPLS labels? Clearly I need to continue to expand my knowledge of MPLS…

  8. Jon Langemak’s avatar

    It’s sort of confusing to think about. The MPLS labels are locally significant because each router assigns its own labels to each prefix. These labels are then advertised through the MPLS network to all of the other MPLS routers. That being said, different routers can use the same label for very different prefixes. In the case of MPLS-VPN there’s a label stack that has a ‘top label’ (for moving the packet across the MPLS cloud) and a ‘VPN label’ that’s used on the PE router to determine which VRF the traffic should end up in. The end point here is that different PE routers can have different VPN labels for the same VRF. I assume you are looking for some method to have the VPN label be globally unique?

  9. Ivan Pepelnjak’s avatar

    A second label wouldn’t change a thing (just add a layer of indirection ;).

    MPLS labels are supposed to be locally unique because that makes them simple to assign (no need to coordinate the label values between MPLS switches). Globally unique MPLS labels would just destroy the whole concept.

  10. isabela’s avatar

    Hi everyone,

    Is it possible to run MPLS over a provider network that is itself using MPLS?
    How would LDP packets then be sent over the provider’s network?