Weirdness with Virtual Ethernet Interfaces on Ubuntu

I’ve been doing some experimenting with virtual Ethernet (veth) interfaces in Ubuntu as part of the ongoing work with network namespaces, LXC, and related technologies. A few times I’ve run into a very weird situation, and I have yet to figure out exactly what’s happening. I thought I might share it here in the hopes that someone else has seen this behavior and knows a) what causes it, and b) how to fix it.

I’ll start with a pretty vanilla installation of Ubuntu 12.04 LTS and Open vSwitch (OVS). When I run ip link list, I get output that looks something like this:


Before adding the veth pair

OK, nothing unusual or unexpected there.

Next, I’ll add a pair of veth interfaces:

ip link add vmveth0 type veth peer vmveth1

Then the output of ip link list looks like this (I’ve circled some of the output to draw your attention):


After adding the veth pair

See? The name of the veth peer interface gets garbled up and somehow corrupted. Because of this, nothing works—I can’t use the veth pair to connect network namespaces, or to connect a Linux bridge to OVS, or anything else. Rebooting the system does not fix the problem; only a rebuild seems to get rid of it.

Anyone have any ideas?

  1. Roman Hochuli’s avatar

    Hey Scott

    Did you type in the command manually, or was it some kind of copy and paste?
    Does the same thing happen if you run the commands as root itself instead of through sudo?

    Greetings from Switzerland
    Roman

  2. phocean’s avatar

    I don’t know how this happens, but it is an arbitrary memory access that would be worth reporting to the developers.
    This might be interesting from a security perspective too. I wonder if there is a chance that this memory access is somehow influenced by user-controlled data.
    Does the garbage value displayed change between reboots? Or networking service restarts? Or loading/unloading the veth-related kernel modules?
    Knowing that will help pinpoint the issue.

    And what do you mean by a rebuild?

  3. slowe’s avatar

    Roman, I always enter the commands manually, not via copy-and-paste. I haven’t tried as root, but I suspect it won’t change anything. Nevertheless, I’ll give that a try.

  4. slowe’s avatar

    Phocean, as far as I can tell the garbage value doesn’t change in any of those circumstances. However, it’s hard to tell for certain, since the garbage characters could look the same but represent various values behind the scenes. And by a rebuild I mean a complete reinstall of Ubuntu.
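
One way to check whether two garbled names really are the same bytes is to dump each interface name in hex rather than trusting how the terminal renders it. This is only a sketch: the function name dump_iface_bytes is made up for illustration, and it assumes the names can be read as directory entries under /sys/class/net.

```shell
#!/bin/sh
# dump_iface_bytes [DIR]
# Print every entry name in DIR followed by its raw bytes in hex.
# DIR defaults to /sys/class/net, where Linux lists network interfaces.
# (The function name is hypothetical, not part of any tool.)
dump_iface_bytes() {
    dir="${1:-/sys/class/net}"
    for path in "$dir"/*; do
        # Skip the literal pattern when the directory is empty
        [ -e "$path" ] || continue
        name="${path##*/}"
        # od -An -tx1 prints the bytes as two-digit hex without offsets;
        # tr removes the spaces and newlines that od adds
        hex=$(printf '%s' "$name" | od -An -tx1 | tr -d ' \n')
        printf '%s %s\n' "$name" "$hex"
    done
}
```

Saving the output of dump_iface_bytes before and after a reboot and comparing the two with diff would show whether the bytes change even when the characters on screen look identical.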

  5. Lennie’s avatar

    Which console is this? Is this the console of your hypervisor?

    Looks a bit like a Unicode encoding problem.

    Have you tried using SSH to log in and check the output that way?

    You’ll have to check whether the terminal client you are using for SSH supports Unicode, and maybe change its settings if it doesn’t work.

  6. phocean’s avatar

    So you mean that after the reinstall the issue does not occur anymore?
    If the settings are exactly the same and it is not reproducible, it will be hard to analyze.

    It could be that it was just some file system corruption and had nothing to do with the underlying code…

    I have already experienced very weird things like that with corrupted VMs…

  7. slowe’s avatar

    Yes, this is the console of the VM in which the issue is occurring. I get the same results via an SSH session to the VM.

  8. slowe’s avatar

    Phocean, I should have been clearer in my response. Reinstalling Ubuntu fixes the issue but does not prevent it from reoccurring.

  9. Lennie’s avatar

    I was surprised you said a reinstall is needed to get rid of it.

    Then I remembered you are also using OVS; it is probably OVS that re-creates the device on reboot.

    I just tried it, something like this would do it:

    ip link delete `ls -lA /sys/devices/virtual/net/ | egrep -v 'lo|vmveth0' | awk '{print $9}'`

    You’d need to run it without the delete first:
    ls -lA /sys/devices/virtual/net/ | egrep -v 'lo|vmveth0' | awk '{print $9}'

    The output of the command should just be the garbled one; then you can add the delete.

    If there are others, you should add them to the list of 'lo|vmveth0'
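
The same filtering can be sketched without parsing ls output. The helper below is hypothetical (the name list_other_ifaces is invented here) and compares names exactly, so a pattern such as 'lo' cannot accidentally hide another interface whose name merely contains those letters.

```shell
#!/bin/sh
# list_other_ifaces DIR KEEP...
# Print every entry name in DIR that is not exactly equal to one of
# the KEEP names. (Hypothetical helper, shown for illustration only.)
list_other_ifaces() {
    dir="$1"
    shift
    for path in "$dir"/*; do
        # Skip the literal pattern when the directory is empty
        [ -e "$path" ] || continue
        name="${path##*/}"
        skip=0
        for keep in "$@"; do
            if [ "$name" = "$keep" ]; then
                skip=1
            fi
        done
        if [ "$skip" -eq 0 ]; then
            printf '%s\n' "$name"
        fi
    done
    return 0
}

# Example (prints names only, deletes nothing):
#   list_other_ifaces /sys/devices/virtual/net lo vmveth0
```

As with the one-liner above, it makes sense to look at the printed names first and only then pass one of them to ip link delete.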

  10. slowe’s avatar

    Lennie, I can delete the “junk” interface pretty easily by deleting the corresponding veth peer interface. The issue is that this corruption—if that indeed is what it is—returns when I create the next pair of veth interfaces. I don’t need any special tricks to delete interfaces; what I need is to know how to prevent this from happening.

  11. Lennie’s avatar

    Just don’t do it. ;-)

    No, really, it’s the wrong command; newer versions of iproute say:

    Error: either "dev" is duplicate, or "vmeth1" is a garbage.

    Which isn’t a lot better, but at least it won’t create garbage.

    I’ll see if I can figure out how to use it properly.

  12. Lennie’s avatar

    Actually, found it pretty quickly:
    # create a new namespace:
    ip netns add newns123

    # set up a link between both namespaces:
    ip link add name veth0 type veth peer name veth1 netns newns123

    # see the host side:
    ip link

    # see the other side:
    ip netns exec newns123 ip link

    # When you delete the namespace newns123 the link on the host side is also gone:
    ip netns delete newns123

    ip link

  13. phocean’s avatar

    OK, then it is very probably memory corruption: some parts of the veth structure in memory may be overwritten through a wrong pointer when you create the next pair.
    I wish I had time to investigate in the next few days. Again, this may be exploitable from a security perspective.

    Anyway, it is worth reporting the bug upstream: in no way is this supposed to happen.

  14. Lennie’s avatar

    Most likely, I think, the memory that gets copied comes from the ip command itself, which passes it to the kernel as the name.

  15. slowe’s avatar

    Lennie, I’m afraid you’ve lost me, sorry! I guess I’m a bit slow here. Are you saying that I should not use “ip link delete vmveth0” to delete both members of the veth pair? If not, what should I use then?

    Phocean, I’ll see about reporting this upstream. First, I want to see if I can isolate a set of reproducible steps, as that will greatly aid in troubleshooting and tracking down the bug (if indeed there is a bug).

  16. Lennie’s avatar

    @slowe this ‘corruption’ you talk about doesn’t really happen.

    What is happening, I think, is this:

    # you run
    ip link add vmveth0 type veth peer vmveth1

    # then see this strange name when you run
    ip link

    # you are not able to add another; you get the error: RTNETLINK answers: File exists
    ip link add vmveth3 type veth peer vmveth2

    # but if you run:
    ip link delete `ls -lA /sys/devices/virtual/net/ | egrep -v 'lo|vmveth0' | awk '{print $9}'`

    # and check ip link, you’ll see the veth with the strange name is gone
    ip link

    # but now you can run the same command again with another name; you won’t get the error:
    ip link add vmveth3 type veth peer vmveth2

    # when you run ip link again you’ll see the same strange name again:
    ip link

    So the error you got, ‘RTNETLINK answers: File exists’, is actually correct: ip link is trying to create an interface with the same name again, because it copied the same part of memory from the ip command.

    Anyway, the command you are running isn’t allowed anymore in newer versions, because it shouldn’t be allowed: it doesn’t have all the needed arguments, or the arguments are wrong. It’s just that the version in 12.04 calls the kernel anyway and creates something silly.

    Well, that is my theory :-)

    Is that a better explanation ?

  17. slowe’s avatar

    Lennie, I get what you’re saying now, but what I don’t get is what is “wrong” about the use of the “ip link add … type veth” command? You indicate that the command shouldn’t be allowed, but why? Because of the issue we’re seeing here, or some other reason?

  18. Lennie’s avatar

    As I mentioned above.

    Clearly iproute with veth isn’t meant to be used in that way. The newer versions say the following if you run that command:
    Error: either "dev" is duplicate, or "vmeth1" is a garbage.

    I also listed a valid use of veth with namespaces above.

  19. Apsu’s avatar

    You’re actually using the command incorrectly, which isn’t very obvious at all from the help or manpage. The correct syntax is:

    ip link add name veth-one type veth peer name veth-two

    For some reason, if you don’t provide “name” after “peer”, it will still accept the command but garble the name of the other end.

    Hope that helps!
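
Since the older ip accepts the malformed command silently, a small guard in a script can catch the mistake before ip link add ever runs. This is a rough sketch, not anything iproute2 provides: the function name veth_args_ok is invented, and it assumes the peer name is always written immediately after "peer" as "peer name NAME", which is stricter than what ip itself may accept.

```shell
#!/bin/sh
# veth_args_ok ARGS...
# Succeed only if every "peer" argument is immediately followed by "name".
# (Hypothetical, deliberately strict check; ip may allow other orderings.)
veth_args_ok() {
    prev=""
    for arg in "$@"; do
        if [ "$prev" = "peer" ] && [ "$arg" != "name" ]; then
            return 1
        fi
        prev="$arg"
    done
    # A trailing "peer" with nothing after it is also incomplete
    [ "$prev" != "peer" ]
}

# Example: only call ip when the arguments pass the check
#   set -- link add name veth-one type veth peer name veth-two
#   if veth_args_ok "$@"; then ip "$@"; else echo "missing 'name' after 'peer'" >&2; fi
```

With the command from the post (peer vmveth1, no "name" keyword) the check fails, so the script would stop instead of creating an interface with a garbled name.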

  20. slowe’s avatar

    Apsu, that actually does help a lot—thanks for catching that!

  21. Arpita Biswas’s avatar

    Apsu’s trick worked!! These are cases when a pointer can eat you up :D Thanks a ton :)