More fun with veth, network namespaces, VLANs – VI – simple 802.1q VLAN, routes and ARP

My present posts on veth-devices and Linux network namespaces continue a series of posts I wrote in 2017 about virtual networking and veth-devices. The last two posts of the present series

More fun with veth, network namespaces, VLANs – IV – L2-segments, same IP-subnet, ARP and routes

More fun with veth, network namespaces, VLANs – V – link two L2-segments of the same IP-subnet by a routing network namespace

were a kind of excursion:

We studied the effects of routes on ARP-requests in network namespaces where two or more LAN-segments terminated – with all IPs in both segments belonging to one and the same IP-subnet. So far, the considered segments in this series were standard L2-segments, not VLAN-segments.

Despite being instructive, scenarios with standard L2-segments are a bit academic. However, they get a more practical meaning when you replace the virtual L2-segments by virtual VLAN-segments. The configuration of virtual VLAN-scenarios based on Linux veth- and bridge-devices is our eventual objective. In forthcoming posts we will consider virtual VLANs realized with the help of veth-subdevices. Such VLANs follow the IEEE 802.1q standard for the Link layer.

In the present post I first want to remind you of basic VLAN settings for veth devices. Afterward we will connect two network namespaces by a veth-based VLAN-segment. Each of the namespaces will get exactly one unique IP-address. Both IP-addresses will belong to the same IP-subnet. Even such a very simple configuration comes with a potential ambiguity for packet emission. The Linux kernel therefore must make some decisions.

We will investigate the impact of route definitions in the namespaces on the emission of tagged and untagged ARP-requests and ARP-replies. All in all we will cover 36 different route configurations.

The result for our simple VLAN-configuration will be that routes determine which of the interfaces of a veth peer device is used to send an ARP-request – if the interface is not specified otherwise. Thereby the route decides whether the ARP-request gets a VLAN-tag or not. Route definitions in our setup also decide whether a reply to an arriving ARP-request is created or not. Furthermore, we will see that we may have to deactivate ARP on the veth trunk interface to prevent the exchange of untagged packets. These results must, however, be verified in other more complex configurations.

VLAN related sub-devices of a veth peer device

A veth connection has two endpoints, also called peer devices. I will use the terms “veth endpoint”, “veth peer device” or simply “veth-peer” as synonyms when I refer to one of the two virtual network devices making up a veth connection. When I speak of the “veth device” I refer to the whole connection arrangement, i.e. the connection line and both peer devices.

In an old post I have already described what we can do to create a VLAN-interface of a veth-endpoint. We take a veth peer device, create a sub-device of type “vlan” and add a VID to it.

Let us name the two endpoints of an example veth-device veth1 and veth2. Then supplementing veth1 with a virtual VLAN-interface requires a command sequence like

netns1:~ # ip link add link veth1 name veth1.10 type vlan id 10
netns1:~ # ip link set veth1 up
netns1:~ # ip link set veth1.10 up

By repeating these commands for further VLAN IDs [VIDs] veth1 will get multiple distinct interfaces for different VLANs (see the graphics below for two sub-devices). Packets moving from VLAN-related veth sub-devices in the direction of the veth connection line will get a tag corresponding to the VID. The peer device is VLAN-aware in the sense that a VLAN-related sub-device only receives and handles packets which arrived at the peer device with a tag fitting its defining VID. A packet with a tag that fits none of the VLAN sub-devices is dropped.
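The repetition for several VIDs can be sketched as a small loop. This is only an illustration: the helper names run and add_vlan_ifs and the second VID 20 are my assumptions, not part of the original setup. By default the script just prints the commands; with APPLY=1 (and root rights inside the namespace) it executes them.

```shell
#!/bin/sh
# Dry-run helper (an assumption of this sketch): print each command unless APPLY=1.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# add_vlan_ifs PEER VID... : add one sub-device of type vlan per VID to a veth peer.
add_vlan_ifs() {
    peer=$1; shift
    run ip link set "$peer" up
    for vid in "$@"; do
        run ip link add link "$peer" name "$peer.$vid" type vlan id "$vid"
        run ip link set "$peer.$vid" up
    done
}

# VID 10 is taken from the commands above, VID 20 is a hypothetical second VLAN.
add_vlan_ifs veth1 10 20
```

Each VID results in one "ip link add" and one "ip link set ... up" command, so the peer device ends up with one VLAN-interface per VID in addition to its original interface.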

Below I will often use the term “VLAN-interface” of a veth-endpoint when I refer to a peer’s sub-device of type “vlan”. The idea behind it is that I abstract from the complexity of the interaction between a sub-device and the main veth peer device. I regard the whole thing as a unit having multiple interfaces. Some interfaces (namely those related to sub-devices of type vlan) trigger a tagging process such that we get tagged packets of Ethertype 802.1Q (0x8100) passing through the veth-endpoint device into the connection line connecting the two veth peer devices.

Addendum, 22.03.2024:
Actually, one should not assume that packets entering a veth-peer’s VLAN-interface from a namespace or packets leaving via a VLAN-interface into a namespace have a tag. The interface just defines a certain “access port” to the veth peer device which leads to tagging at packet ingress. Tagging occurs somewhere during the transfer between sub- and main-device. We need not care about the details. Neither should you assume that a packet leaving a veth-peer’s VLAN-interface to a namespace or a bridge is tagged. We will actually see (with the help of tcpdump) that originally tagged packets which entered a peer device through the veth connection line have their tags stripped off when they pass the VLAN-interface into the surrounding namespace.
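A possible way to watch the stripping of tags with tcpdump is sketched below. The interface names follow the veth1 / veth1.10 example from above; the helper names run and sniff_arp are assumptions of this sketch. The option -e prints the link-level header, which is where an 802.1Q tag would show up. The filter "arp or (vlan and arp)" is meant to match both untagged and tagged ARP packets.

```shell
#!/bin/sh
# Dry-run helper (an assumption of this sketch): print each command unless APPLY=1.
# Note: the printed form drops the quoting of the filter expression.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# sniff_arp IF: capture ARP packets on IF and print their Ethernet headers (-e).
sniff_arp() {
    run tcpdump -n -e -i "$1" 'arp or (vlan and arp)'
}

sniff_arp veth1      # trunk interface: tagged packets should show "vlan 10" here
sniff_arp veth1.10   # VLAN-interface: the same packets should appear without a tag
```

With APPLY=1 each capture blocks until interrupted, so the two calls are best run in separate terminals of the same namespace.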

The following drawing shows schematically what's going on:

Why is this important? Because it will later make a substantial difference whether we add a veth VLAN-interface to a Linux bridge or whether we add a veth-peer device without any VLAN-interfaces to a Linux bridge.

“Trunk interface” of a veth-endpoint

Even if we have defined VLAN-interfaces of a veth-peer device we can always send a packet directly through the original interface of the peer device. veth1 remains a fully capable (virtual) Ethernet device independent of the fact that we have added some sub-devices. I will call the original interface of the peer device, e.g. veth1, the trunk interface of a veth-endpoint.

The trunk interface will transport tagged packets, which arrived via subdevices, to the other peer device of the veth-connection. But, as we will see, the trunk interface itself is also able to emit and receive untagged packets. This happens when either an arping option or a route specifies this particular interface (see below).
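The arping variant can be sketched as follows. The target address 192.168.5.2 is a hypothetical IP for the namespace at the other end of the veth connection, and the helper names run and arp_probe are assumptions as well. The option -I of arping (iputils) selects the interface used for sending the ARP-request. The last command shows one way to switch off ARP on the trunk interface, in case untagged ARP-traffic is to be suppressed.

```shell
#!/bin/sh
# Dry-run helper (an assumption of this sketch): print each command unless APPLY=1.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# arp_probe IF IP: send two ARP-requests for IP through the interface IF.
arp_probe() {
    run arping -c 2 -I "$1" "$2"
}

PEER_IP=${PEER_IP:-192.168.5.2}   # hypothetical address in the other namespace

arp_probe veth1.10 "$PEER_IP"     # via the VLAN-interface: tagged on the connection line
arp_probe veth1 "$PEER_IP"        # via the trunk interface: untagged

# Optional: deactivate ARP on the trunk interface.
run ip link set dev veth1 arp off
```

Combined with tcpdump on the other peer device, this should make it visible which of the two requests arrives with a VLAN-tag.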

We refer to a general interface of a veth-endpoint (including the trunk) below by the abbreviation IF.

IP-assignment to veth-subdevices – the standard option and resulting ambiguities

When we have created two sub-devices (with different VIDs) of a veth-endpoint, we actually have two choices regarding the assignment of IP addresses. I first discuss the standard option, which assigns just one IP-address to all interfaces of a veth peer device:

Option 1: Standard option

This option should in my opinion be the preferred way to handle VLANs with veth. By using it we assign an IP-address to the veth peer device. A proper command would be

netns1:~ # ip addr add 192.168.5.1/24 dev veth1

Note that this leads to an automatically created general route to 192.168.5.0/24 via veth1. This route may have to be changed to enforce packet creation via a tagging sub-device.

The important point is: This IP-address will also be used when Ethernet packets with VLAN-tags are created, i.e. when a VLAN-specific sub-interface is selected to address certain destinations.

The veth endpoint supplies all required information to the headers of the Ethernet-packets, to ARP-sections or enclosed IP-packets – according to which of its interfaces was chosen.
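Changing the automatically created route such that packets for the subnet leave via a tagging sub-device could look like the sketch below. The names run and prefer_vlan_route are assumptions, and so are the VLAN-interface veth1.10, the source address 192.168.5.1 and the test destination 192.168.5.2. The idea is simply to delete the route via the trunk interface and to add an equivalent route via the VLAN-interface.

```shell
#!/bin/sh
# Dry-run helper (an assumption of this sketch): print each command unless APPLY=1.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# prefer_vlan_route SUBNET TRUNK_IF VLAN_IF SRC_IP
# Replace the route via the trunk interface by a route via a VLAN-interface.
prefer_vlan_route() {
    run ip route del "$1" dev "$2"
    run ip route add "$1" dev "$3" src "$4"
}

prefer_vlan_route 192.168.5.0/24 veth1 veth1.10 192.168.5.1

# Check which interface the kernel would now choose for a (hypothetical) destination.
run ip route get 192.168.5.2
```

The final "ip route get" is just a control: its output names the interface the kernel selects for the destination.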

Ambiguities
Obviously, this option comes with an ambiguity regarding the choice of an interface for packets to remote destinations. The key question is:

Which interface of the veth peer device should be used to create and send packets to a destination characterized by an IP and/or a MAC? The options are: the trunk interface for untagged packets or a VLAN-interface.

This question has to be answered both for ARP-packets and higher level protocols. It has to be answered even in situations with deactivated forwarding in the namespace.

To fully understand the requirement to resolve the ambiguity by the Linux kernel the following information may be helpful:

The MAC addresses of all interfaces of a veth peer device are identical.

The result of the “ip a” command for a veth peer device with three sub-interfaces typically would look like this:

netnsx:~ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: vethx@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 8e:79:71:5e:92:a3 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.5.1/24 brd 192.168.5.255 scope global vethx
       valid_lft forever preferred_lft forever
3: vethx.30@vethx: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 8e:79:71:5e:92:a3 brd ff:ff:ff:ff:ff:ff
4: vethx.40@vethx: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 8e:79:71:5e:92:a3 brd ff:ff:ff:ff:ff:ff
5: vethx.50@vethx: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 8e:79:71:5e:92:a3 brd ff:ff:ff:ff:ff:ff
netnsx:~ # 

A veth-endpoint should therefore really be regarded as one unit with multiple interfaces [IFs] (including the trunk interface). The result of the ip-command above shows that the Linux kernel numbers the interfaces. While an IP/MAC/IF-combination is unique, an IP/MAC-combination in our configuration is not!

This underlines the necessity of giving the Linux kernel clear information about which interface to use in a communication with some network destination – including routers.
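The shared MAC address can also be checked with a few lines of shell. The sketch below groups interface names by MAC address. It expects input in the brief format of "ip -br link" (name, state, MAC, flags); the helper name group_by_mac is an assumption, and the sample lines are only modelled on the listing above.

```shell
#!/bin/sh
# group_by_mac: read "ip -br link" style lines on stdin and print, per MAC
# address, the interfaces which carry it. The loopback device is skipped.
group_by_mac() {
    awk '$1 != "lo" && NF >= 3 {
             sub(/@.*/, "", $1)          # strip the "@parent" suffix from the name
             ifs[$3] = ifs[$3] " " $1    # collect interface names per MAC
         }
         END { for (mac in ifs) print mac ":" ifs[mac] }'
}

# Sample input modelled on the listing above.
# On a live system one would use: ip -br link | group_by_mac
group_by_mac <<'EOF'
lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
vethx@if2        UP             8e:79:71:5e:92:a3 <BROADCAST,MULTICAST,UP,LOWER_UP>
vethx.30@vethx   UP             8e:79:71:5e:92:a3 <BROADCAST,MULTICAST,UP,LOWER_UP>
vethx.40@vethx   UP             8e:79:71:5e:92:a3 <BROADCAST,MULTICAST,UP,LOWER_UP>
vethx.50@vethx   DOWN           8e:79:71:5e:92:a3 <BROADCAST,MULTICAST>
EOF
```

For the sample input this prints a single line, "8e:79:71:5e:92:a3: vethx vethx.30 vethx.40 vethx.50": four interfaces, one MAC.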


More fun with veth, network namespaces, VLANs – V – link two L2-segments of the same IP-subnet by a routing network namespace

During the last two posts of this series

More fun with veth, network namespaces, VLANs – IV – L2-segments, same IP-subnet, ARP and routes

More fun with veth, network namespaces, VLANs – III – L2-segments of the same IP-subnet and routes in coupling network namespaces

we have studied a Linux network namespace with two attached L2-segments. All IPs were members of one and the same IP-subnet. Forwarding and Proxy ARP had been deactivated in this namespace.

So far, we have understood that routes have a decisive impact on the choice of the destination segment when ICMP- and ARP-requests are sent from a network namespace with multiple NICs – independent of forwarding being enabled or not. Insufficiently detailed routes can lead to problems and asymmetric arrival of replies from the segments – already on the ARP-level!

The obvious impact of routes on ARP-requests in our special scenario has surprised at least some readers, but I think remaining open questions have been answered in detail by the experiments discussed in the preceding post. We can now move on, on sufficiently solid ground.

We have also seen that even with detailed routes ARP- and ICMP-traffic paths to and from the L2-segments remain separated in our scenario (see the graphics below). The reason, of course, was that we had deactivated forwarding in the coupling namespace.

In this post we will study what happens when we activate forwarding. We will watch results of experiments both on the ICMP- and the ARP-level. Our objective is to link our otherwise separate L2-segments (with all their IPs in the same IP-subnet) seamlessly by a forwarding network namespace – and thus form some kind of larger segment. And we will test in what way Proxy ARP will help us to achieve this objective.

Not just fun …

Now, you could argue that no reasonable admin would link two virtual segments with IPs in the same IP-subnet by a routing namespace. One would use a virtual bridge. First answer: We perform virtual network experiments here for fun … Second answer: It's not just fun …

Our eventual objective is the configuration of virtual VLAN configurations and related security measures. Of particular interest are routing namespaces where two tagging VLANs terminate and communicate with a third LAN-segment, the latter leading to an Internet connection. The present experiments with standard segments are only a first step in this direction.

When we imagine a replacement of the standard segments by tagged VLAN segments we already get the impression that we could use a common namespace for the administration of VLANs without accidentally mixing or transferring ICMP- and ARP-traffic between the VLANs. But the results of the last two posts also gave us a clear warning to distinguish carefully between routing and forwarding in namespaces.

The modified scenario – linking two L2-segments by a forwarding namespace

Let us have a look at a sketch of our scenario first:

We see our segments S1 and S2 again. All IPs are members of 192.168.5.0/24. The segments are attached to a common network namespace netnsR. The difference to previous scenarios in this post series lies in the activated forwarding and the definition of detailed routes in netnsR for the NICs with IPs of the same class C IP-subnet.

Our experiments below will look at the effect of default gateway definitions and at the requirement of detailed routes in the L2-segments’ namespaces. In addition we will also test in what way enabling Proxy ARP in netnsR can help to achieve seamless segment coupling in an efficient centralized way.
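The two switches mentioned here, forwarding and Proxy ARP, are kernel parameters which can be set per network namespace. A minimal sketch, assuming that the commands are executed inside netnsR, is given below. The helper names run, enable_forwarding and enable_proxy_arp are assumptions of this sketch. Proxy ARP is switched on for all interfaces unless an interface name is passed.

```shell
#!/bin/sh
# Dry-run helper (an assumption of this sketch): print each command unless APPLY=1.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Activate IPv4 forwarding in the current network namespace.
enable_forwarding() {
    run sysctl -w net.ipv4.ip_forward=1
}

# enable_proxy_arp [IF]: activate Proxy ARP for IF (default: all interfaces).
# Interface names containing a dot would need sysctl's slash notation instead.
enable_proxy_arp() {
    run sysctl -w "net.ipv4.conf.${1:-all}.proxy_arp=1"
}

enable_forwarding
enable_proxy_arp
```

Keeping the two settings in separate functions makes it easy to test forwarding alone first and to add Proxy ARP only in a second step.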


More fun with veth, network namespaces, VLANs – IV – L2-segments, same IP-subnet, ARP and routes

In the course of my present post series on veths, network namespaces and virtual VLANs

More fun with veth, network namespaces, VLANs – III – L2-segments of the same IP-subnet and routes in coupling network namespaces

More fun with veth, network namespaces, VLANs – II – two L2-segments attached to a common network namespace

More fun with veth, network namespaces, VLANs – I – open questions

we have started to study a somewhat academic network configuration:

We have attached two L2-segments to a common network namespace, with all IPs of both segments belonging to one and the same IP-subnet.

The IP-setup makes the scenario a bit peculiar. It is not a situation an admin would normally create. Instead the IPs of each segment would be members of one of two distinct and different IP-subnets.

However, our scenario already taught us some important lessons to keep in mind when we approach our eventual objective:

Sooner or later we want to answer the question how to configure virtual VLAN configurations, in which veth-subdevices for tagged VLAN lines terminate in a common and routing namespace.

Through our academic scenario we found out that we need to set up much more specific routes than the standard ones which are automatically created in the wake of “ip addr add” commands. Despite the fact that forwarding was disabled in the coupling network namespace! Only with clear and fitting routes did we get a reasonable behavior of our artificial network with respect to ICMP- and ARP-requests in the last post (III).

Well, this is, in a way, a platitude. Whenever you have a special network you must adapt the routes to your network layout. This, of course, includes the resolution of ambiguities which our scenario introduced. The automatically defined routes were insufficient and caused an asymmetry in the ICMP-replies in the otherwise very symmetric configuration (see the drawing below).

Nevertheless, the last post and also older posts on veths and virtual VLANs triggered an intense discussion with two of my readers concerning ARP. In this post I, therefore, want to look at the impact of routes on ARP-requests and ARP-replies between the namespaces in more detail.

The key question is whether and to what extent the creation and emission of ARP-packets via a particular network interface may become route-dependent in namespaces (or on hosts) with multiple network interfaces.

While we saw a clear and route dependent asymmetry in the reaction of the network to ICMP-requests, we did not fully analyze how this asymmetry showed up on the ARP-level.

In my opinion we have already seen that ARP-requests triggered within ICMP-requests showed some asymmetry regarding the NIC used and the segment addressed – depending on the defined routes. But the experimental data of the last post did not show the flow of ARP-replies in detail. And we only regarded ARP-requests triggered by ICMP-requests. I.e. we watched ARP-requests created by the Linux-kernel to support the execution of ICMP-requests, but not pure and standalone ARP-requests.

Therefore, it is still unclear

  • whether routes have an impact on pure standalone ARP-requests, i.e. ARP-requests not caused by ICMP-requests (or by other requests of higher layer protocols),
  • whether routes do have an impact on ARP-replies.

The experiments below will deliver detailed information to clarify these points.

When I speak of ARP-tables below, I am referring to the ARP-caches of the various network namespaces in our scenario.
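A standalone ARP test together with a look at the ARP-cache of a namespace might be sketched as follows. The NIC name veth11 and the target address 192.168.5.4 are purely hypothetical, and the helper names run and standalone_arp_test are assumptions, too. The cache entries for one NIC are flushed first, then ARP-requests are sent via this NIC with arping, and finally the entries known for the NIC are listed.

```shell
#!/bin/sh
# Dry-run helper (an assumption of this sketch): print each command unless APPLY=1.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# standalone_arp_test IF IP: flush the ARP-cache entries of IF, send two
# ARP-requests for IP via IF, then list the cache entries of IF.
standalone_arp_test() {
    run ip neigh flush dev "$1"
    run arping -c 2 -I "$1" "$2"
    run ip neigh show dev "$1"
}

standalone_arp_test veth11 192.168.5.4   # hypothetical NIC and target address
```

Without the option -I the choice of the NIC for the request is not fixed by the command itself, which is the case in which routes can come into play.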
