Linux bridges – can iptables be used against MiM attacks based on ARP spoofing ? – I

This post and two following ones are about some simple iptables exercises concerning Linux virtual bridges. Linux bridges are typically used in virtualization environments. However, guest systems or even the host attached to a Linux bridge may become targets of “man in the middle” attacks. During such attacks the guests and the bridge may be manipulated to send packets to the “man in the middle” system and not directly to the intended communication partners. My objective is to get a clearer picture of iptables’ contributions to defense measures against such attacks.

Some “Howtos” on the Internet warn explicitly against using iptables on Linux bridges at all – especially with active connection tracking. An example is the “libvirt wiki”: http://wiki.libvirt.org/page/Networking. Some of the warnings refer to an original discussion published here: http://patchwork.ozlabs.org/patch/29319/
See also: https://bugzilla.redhat.com/show_bug.cgi?id=512206

I think these concerns justify a closer look at iptables rules with respect to bridge ports. Comments are welcome.

Scenarios, limitations and objectives

In our test case we work with a KVM host with one bridge and later on also with two linked Linux bridges. In this first article we use one of the Linux guests on one of the bridges to initiate a “man in the middle” attack [MiM] against other guests of the very same bridge. The attacks are based on ARP spoofing and packet redirection. We then define some reasonable iptables rules intended to block the redirected traffic (to the MiM) and analyze the impact of these rules.

In a 2nd and 3rd article we extend the game to 2 bridges and the host attached to a port of one of the bridges.

Limitations and restrictions
It is obvious that we cannot prevent ARP spoofing itself with iptables: iptables works on network layers 3/4, not on layer 2 (Ethernet), and therefore offers no direct means of restricting the ARP protocol. So, the prevention of ARP packets with false MAC addresses, which typically initiate a MiM attack, is not the objective of this article. Blocking ARP spoofing at its roots requires ebtables and/or arptables. So, do not misunderstand me:

I do not and would not recommend basing any packet filter security across a Linux bridge on iptables alone. If you must use netfilter on bridges, always combine iptables with basic ebtables/arptables rules – and test thoroughly against different kinds of attacks which try to break guest isolation. Always be aware that a bridge creates a global context in which packets must be inspected and followed precisely in their changing role as outgoing or incoming with respect to the bridge itself and its virtual interface ports. Global connection tracking on the TCP/IP level may be dangerous. If you give the bridge itself an IP – a situation which I do not like at all from a security perspective – take extra care. Things get even more complicated with multiple bridges on one and the same host.

Objectives
Nevertheless I think that one can learn something even from academic and unusual test configurations with iptables alone in place:

If we cannot prevent ARP spoofing itself by iptables – can we at least use iptables rules to deal with some consequences of ARP spoofing? More precisely:
Can we block the redirection of packets between ARP-poisoned guests over the MiM system by means of iptables alone? What relations between IP addresses and port devices have to be defined? And would a tool like FWbuilder support us reasonably well with this task?

If so: How would we extend iptables rules to situations

  • where two Linux bridges are linked (by veth devices)
  • or when segregated network parts with all guests belonging to the same logical IP segment are coupled via STP and border Ethernet interfaces of a central Linux bridge?

In both cases the spoofed communication may pass border NICs of a Linux bridge.

In this first article on the topic we look at one bridge alone with three guests. In the following posts we shall consider linked bridges.

One bridge – 3 guests

Let us assume that we have 3 guests “kali3, kali4 and kali5” on a Linux bridge “virbr6”. The bridge device itself has no IP. The guest systems are attached to the bridge via standard tap device ports (vk63, vk64 and vk65, respectively). The virtual network can be created e.g. with the help of libvirt’s virt-manager. See the article “KVM/qemu, libvirt, virt-manager – persistent names for virtual network bridge ports of guest systems” about how to set persistent names for the bridge-side end of “tap” devices.

The corresponding Ethernet interfaces (eth0) of the guest operating systems – i.e. the guest side of the tap devices – are given the following IP addresses: 192.168.50.13 (eth0), 192.168.50.14 (eth0) and 192.168.50.15 (eth0), respectively.

[Image: bridge virbr6 with the attached guests kali3, kali4 and kali5]

How does the host see the bridge-ports?

mytux:~ # brctl showmacs virbr6
port no mac addr                is local?       ageing timer
  1     52:54:00:8e:f2:d7       yes                0.00
  2     5e:f4:32:30:f1:3a       yes                0.00
  2     aa:bf:ba:dc:52:31       no                 1.35
  3     fe:54:00:9f:5d:c1       yes                0.00
  4     fe:54:00:74:60:4a       yes                0.00
  5     fe:54:00:0f:34:4f       yes                0.00

5 ports instead of 3? Yeah, actually my virbr6 bridge is connected to another bridge (virbr4) by a veth pair. But we will ignore this connection most of the time in this post. If you are interested in Linux bridge linking via “veth” devices see
Fun with veth devices, Linux virtual bridges, KVM, VMware – attach the host and connect bridges via veth

The veth pair explains the 2 MACs on port Nr. 2 of the bridge. A parallel look at the outcome of “ifconfig” or “ip link show” would show that port 3 actually corresponds to tap device “vk63”, port 4 corresponds to “vk64” and port 5 to “vk65”. And what about port 1? The Linux bridge itself could also work as an Ethernet device which could get an IP address on the host. We do not use this property here – nevertheless, there is an Ethernet port associated with the bridge itself.

How does the host see the (regular) IP-MAC relations so far? After pinging our 3 guests from the host we get:

mytux:~ # brctl showmacs virbr6
port no mac addr                is local?       ageing timer
  1     52:54:00:8e:f2:d7       yes                0.00
  2     5e:f4:32:30:f1:3a       yes                0.00
  2     aa:bf:ba:dc:52:31       no                 1.35
  3     fe:54:00:9f:5d:c1       yes                0.00
  4     fe:54:00:74:60:4a       yes                0.00
  5     fe:54:00:0f:34:4f       yes                0.00

mytux:~ # arp -n
Address                  HWtype  HWaddress           Flags Mask            Iface
192.168.50.15            ether   52:54:00:0f:34:4f   C                     vmh2
....
192.168.50.13            ether   52:54:00:9f:5d:c1   C                     vmh2
192.168.50.14            ether   52:54:00:74:60:4a   C                     vmh2

We recognize our tap devices attached to the bridge. [By the way: vmh2 is a device that connects the host to one of the bridges (virbr4).]

Addendum 24.02.2016: Note a small, but decisive difference in the HW/MAC addresses
The first digit pair in the Ethernet address of the port device (i.e. the bridge-side end of the tap device) is “fe”, whereas the Ethernet device of the Linux guest (i.e. the guest-side end of the tap device) has a “52”; the rest of the digits are the same. Logically, and also from the perspective of the bridge, these are 2 different (!) devices (though incorporated in one virtual tap). From the point of view of the bridge multiple MACs or even a new bridge may be located in the Ethernet segment behind a port.

Be aware of the fact that the so called “forward database” of a bridge [FDB], which relates MACs to ports, keeps track of the relation of our Linux guest MACs to their specific ports. Whereas the port MAC (with the leading “fe”) is permanently associated with the bridge, the MAC of the guest may disappear from the FDB after a timeout period, if no packets are received at the bridge from this guest MAC address.

In addition we make the following settings:

mytux:~ # brctl setageing virbr6 30
mytux:~ # brctl setageing virbr4 30

to be sure that the bridge works in a switch-like mode and not as a hub.

Note that this defines a timeout period for the bridge’s FDB – i.e. after this period “stale” entries in the FDB may be deleted. The bridge then no longer knows at which port the deleted MAC is located – and may temporarily flood all ports with packets. Bridge flooding is therefore a situation we may need to cover with iptables rules later on.

If you want to monitor changes of the bridges’ FDB or monitor general changes over all bridge links use the following commands:

bridge monitor all

and

bridge -statistics fdb show

The continuous output of the first command will show you directly when a stale MAC entry in the FDB is deleted. If you issue the second command twice with a reasonable time period in between you may search the output for missing or new MAC entries of guests.
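A quick way to inspect the present ageing timeout and the current FDB entries looks like this – a small sketch; if I read the sysfs attribute correctly, its value is given in units of 1/100 s:

mytux:~ # cat /sys/class/net/virbr6/bridge/ageing_time     # 3000 would correspond to 30 s
mytux:~ # bridge fdb show br virbr6                        # list the current FDB entries of virbr6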

Note further that we did not give the bridge itself any IP address! The bridge may therefore be called “transparent”. As “virbr6” has no IP address the guests (kali3 to 5) cannot communicate directly with the host through the bridge itself acting as an Ethernet device. Just for information: In our scenario the host can only be reached indirectly over a transparently linked second bridge (virbr4) and a further veth pair there which leads to an external Ethernet device with address 192.168.50.1.

ICMP packets and regular pinging – what do we allow?

First we shall have a look at ICMP packets only. Our basic policy with iptables is that we deny everything that is not explicitly allowed. Regarding further rules we should be aware of the following:

When setting up iptables rules on bridges we must be precise and specific with respect to the packet direction across the involved bridge port interfaces.
Note: It is the perspective of the bridge and NOT the perspective of the guest that counts.
I always use a 3D picture to be sure: Assume the bridge and its ports to be located above the guests. Then a packet going up is incoming, a packet moving downwards is outgoing.

Let us assume you want to ping from kali3 to kali5. From the point of view of the bridge there are 2 packet directions involved: We first get an incoming ICMP (type 8) packet via bridge interface “vk63”, which then is directed (or “forwarded”) outwards through “vk65”. To allow for the pinging we would need rules of the logical form

bridge virbr6 rule : src 192.168.50.13, dest 192.168.50.15 – ICMP in via vk63, out via vk65 => ALLOW

and analogously for the other guests and interfaces. Actually this rule may be split up into 2 subsequent rules:

bridge virbr6 rule :  src 192.168.50.13, dest 192.168.50.15 - ICMP in via vk63 => ALLOW  
bridge virbr6 rule :  src 192.168.50.13, dest 192.168.50.15 - ICMP out via vk65 => ALLOW  

which are to be considered in the basic chains of iptables. This leads us to the next question: Which of the iptables chains is relevant here?

In our example it is the FORWARD chain. For the interaction of netfilter components (ebtables/iptables) in kernels with activated netfilter see the following link: http://ebtables.netfilter.org/br_fw_ia/br_fw_ia.html

That we need to set up FORWARD rules is also logical as the bridge does nothing else than forwarding packets between its ports and thus transfers the packets to attached destination guests or into segregated network parts behind some of the ports (with the “spanning tree protocol” STP set to ON).
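Translated into iptables syntax such a pair of FORWARD rules might look like the following sketch – assuming the “physdev” match is available and that the port names are the ones defined above:

# allow ICMP echo requests from kali3 towards kali5, as seen from the bridge (incoming via vk63, outgoing via vk65)
iptables -A FORWARD -m physdev --physdev-is-bridged --physdev-in vk63 \
         -s 192.168.50.13 -d 192.168.50.15 -p icmp --icmp-type echo-request -j ACCEPT
iptables -A FORWARD -m physdev --physdev-is-bridged --physdev-out vk65 \
         -s 192.168.50.13 -d 192.168.50.15 -p icmp --icmp-type echo-request -j ACCEPT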

ARP spoofing and the bridge

Consider a situation in which guest “kali4” acts as a “man in the middle” who wants to sniff or even manipulate the traffic (e.g. for “secrets”) between kali3 and kali5. A user with root rights on kali4 would use an ARP spoofing tool like “arpspoof” from the dsniff suite to (ARP-)poison its neighbouring guests via the following command sequence:

root@kali4: ~# echo 1 > /proc/sys/net/ipv4/ip_forward
root@kali4: ~# iptables -A OUTPUT -p icmp --icmp-type redirect -j REJECT
root@kali4: ~# arpspoof -i eth0 -t 192.168.50.13 192.168.50.15 2> /dev/null &    
root@kali4: ~# arpspoof -i eth0 -t 192.168.50.15 192.168.50.13 2> /dev/null &   

The first command guarantees that redirected and sniffed packets are forwarded (routed) via the MiM system (kali4) to their original targets. The second command on the MiM system suppresses sporadic ICMP “redirect” answers to the poisoned and pinging guests – such answers could indicate to these guests that something is wrong. The 3rd and the 4th commands eventually poison the internal ARP caches of the guests, i.e. they spoil the cached information on IP-MAC relations after some time.

Let us look at kali3 – before the attack:

[Image: ARP table of kali3 before the attack]

And during the attack:

[Image: ARP table of kali3 during the attack]
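The same information can of course be checked directly on the guest – a trivial check, nothing more:

root@kali3:~# arp -n
root@kali3:~# ip neigh show
# during the attack 192.168.50.15 no longer shows its own MAC, but the MAC of kali4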

In a previous post of this blog we saw that a Linux bridge learns about the relation of MAC addresses and bridge ports – and thus pins a specific communication down to just the 2 involved ports (basic guest isolation). The bridge normally does not spread communication packets over all ports (at least with a setageing parameter > 0).

Note that this does not help to prevent MiM attacks. As the bridge itself works on layer 2, it ignores IP-MAC relations during packet forwarding. (It may learn about IP-MAC relations only through the ARP protocol.) The bridge furthermore does not know whether routing may occur somewhere. And the guests themselves cannot exclude situations in which several IP addresses are associated with one and the same MAC. For all these reasons Ethernet packets are inevitably sent and forwarded across the bridge to wherever the guests think they should go – according to their own internal ARP tables, which are poisoned during the attack.

Therefore after ARP spoofing the bridge would receive 2 subsequent ping request packets from kali3 and from kali4 with the logical route

src 192.168.50.13, dest 192.168.50.15 - ICMP ping request in via vk63, out via vk64   
src 192.168.50.13, dest 192.168.50.15 - ICMP ping request in via vk64, out via vk65   

And the ping answers back via

src 192.168.50.15, dest 192.168.50.13 - ICMP ping answer in via vk65, out via vk64  
src 192.168.50.15, dest 192.168.50.13 - ICMP ping answer in via vk64, out via vk63  

A small side aspect: I should mention that despite the switch-like operational mode of the Linux bridge I sometimes – very rarely – saw that even the KVM host reacted to the ARP poisoning and showed some wrong entries in its internal ARP cache – some time after the attack started. I have not yet clarified what the reason for this change of the host’s ARP table actually is. If some reader knows the reason please write me a mail. I suspect gratuitous ARP packets, or (more likely) some rare hub-like flooding situation on the bridge, but …

E.g. after a restart of all virtual machines, the beginning of the ARP poisoning and continuous pinging of the host from the MiM system for a while, you may eventually find the following change of the ARP table on the KVM host:

mytux:~ # arp
Address                  HWtype  HWaddress           Flags Mask            Iface   
192.168.50.15            ether   52:54:00:0f:34:4f   C                     vmh2
192.168.50.13            ether   52:54:00:9f:5d:c1   C                     vmh2
192.168.50.14            ether   52:54:00:74:60:4a   C                     vmh2
mytux:~ # arp
Address                  HWtype  HWaddress           Flags Mask            Iface  
192.168.50.15            ether   52:54:00:74:60:4a   C                     vmh2
192.168.50.13            ether   52:54:00:74:60:4a   C                     vmh2
192.168.50.14            ether   52:54:00:74:60:4a   C                     vmh2

And the port-MAC-association? It remains as it was:

mytux:~ # brctl showmacs virbr6
port no mac addr                is local?       ageing timer   
  ...
  5     fe:54:00:0f:34:4f       yes                0.00
  4     fe:54:00:74:60:4a       yes                0.00
  3     fe:54:00:9f:5d:c1       yes                0.00

Be aware, however, that this listing of the local port MACs tells us nothing about the present state of the guest entries in the bridge’s FDB! Actually, due to our “setageing” parameter certain MAC addresses of guests may drop out of the bridge’s forward database if the guests are inactive with respect to network communication, and this in turn may result in a subsequent (temporary) flooding of the bridge ports.

So, if you stop the ARP poisoning, reset the ARP tables and start the spoofing again, an ARP poisoning of the host itself may not happen directly. It may, however, happen after some time. (By the way: Any direct pinging from the host to the guests will correct the ARP table to the real values again – at least for some time.)

Anyway and whatever the precise reason – it is interesting that there obviously are circumstances under which the local poisoning of bridge guests may impact even the ARP table on the bridge’s host itself. On the defense side this may give us a secondary chance (besides monitoring the violation of iptables and ebtables rules) to detect ARP spoofing attacks: by monitoring the host’s internal ARP table and analyzing its contents for implausible changes.
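A very simple (and admittedly naive) check along these lines – just a sketch, based on the assumption that in our static setup several IPs resolving to one and the same MAC are suspicious:

mytux:~ # arp -n | awk 'NR>1 {print $3}' | sort | uniq -d
# any output line means: the same MAC is currently registered for more than one IP address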

iptables rules to prevent misguided packets

To avoid part of the redirected packet transport across the Linux bridge we would require a rule of the logical form

bridge virbr6 rule :  src any, dest 192.168.50.15 - in any, out via vk64 => DENY  

We can reformulate the rule with a negation (!) in a more general way:

bridge virbr6 rule :  src any, !dest 192.168.50.14 - in any, out via vk64 => DENY  

In addition it is reasonable to forbid packets which (seem to) come from kali3 and are “outbound” to kali3:

bridge virbr6 rule :  src 192.168.50.13, dest any - in any, out via vk63 => DENY  

Also incoming packets via vk63 from sources not being kali3 make no sense:

bridge virbr6 rule :  ! src 192.168.50.13, dest any - in vk63, out any => DENY  

Actually, on our bridge we would have to cover analogous variants of all of the above DENY rules for all other guest ports and protocols.

Note that all these rules define fixed relations between each of the defined bridge ports, an associated IP and certain packet directions across the port: with iptables alone we are restricted to such types of relations.
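Before we turn to a graphical tool, here is a minimal shell sketch of how such per-guest DENY rules could be generated by a loop – the port/IP pairs are the ones defined above; everything else (chain layout, logging) is deliberately left out:

#!/bin/bash
# Sketch: one "wrong destination", one "reflected source" and one "wrong source" DROP rule per guest port
declare -A GUESTS=( [vk63]=192.168.50.13 [vk64]=192.168.50.14 [vk65]=192.168.50.15 )
for PORT in "${!GUESTS[@]}"; do
    IP=${GUESTS[$PORT]}
    # packets leaving the bridge through $PORT must be destined to the guest behind this port
    iptables -A FORWARD -m physdev --physdev-is-bridged --physdev-out "$PORT" ! -d "$IP" -j DROP
    # packets leaving the bridge through $PORT must not claim the guest's own IP as source
    iptables -A FORWARD -m physdev --physdev-is-bridged --physdev-out "$PORT" -s "$IP" -j DROP
    # packets entering the bridge through $PORT must carry the guest's IP as source
    iptables -A FORWARD -m physdev --physdev-is-bridged --physdev-in "$PORT" ! -s "$IP" -j DROP
done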

Graphical help – FWbuilder

A problem with the relations above is that they are potentially numerous – their number grows at least linearly with the number of guests on a bridge, and quadratically as soon as pairwise acceptance rules are required. An efficient administration requires either a tool or good scripting experience or both. A tool like FWbuilder at least supports us graphically:

[Image: FWbuilder – policy rules for the bridge ports]

The rules created for the shown conditions look like:

    # 
    # Rule 2 (vk63)
    # 
    echo "Rule 2 (vk63)"
    # 
    $IPTABLES -N Out_RULE_2
    $IPTABLES -A FORWARD -m physdev --physdev-is-bridged --physdev-out vk63 !  -d 192.168.50.13   -j Out_RULE_2    
    $IPTABLES -A Out_RULE_2  -j LOG  --log-level info --log-prefix "RULE 2 -- DENY "
    $IPTABLES -A Out_RULE_2  -j DROP
    # 
    # Rule 3 (vk64)
    # 
    echo "Rule 3 (vk64)"
    # 
    $IPTABLES -N Out_RULE_3
    $IPTABLES -A FORWARD -m physdev --physdev-is-bridged --physdev-out vk64 !  -d 192.168.50.14   -j Out_RULE_3    
    $IPTABLES -A Out_RULE_3  -j LOG  --log-level info --log-prefix "RULE 3 -- DENY "
    $IPTABLES -A Out_RULE_3  -j DROP
    # 
    # Rule 4 (vk65)
    # 
    echo "Rule 4 (vk65)"
    # 
    $IPTABLES -N Out_RULE_4
    $IPTABLES -A FORWARD -m physdev --physdev-is-bridged --physdev-out vk65 !  -d 192.168.50.15   -j Out_RULE_4
    $IPTABLES -A Out_RULE_4  -j LOG  --log-level info --log-prefix "RULE 4 -- DENY "
    $IPTABLES -A Out_RULE_4  -j DROP
    # 
    .....
    .....
    # Rule 6 (vk63)
    # 
    echo "Rule 6 (vk63)"
    # 
    $IPTABLES -N Out_RULE_6
    $IPTABLES -A FORWARD -m physdev --physdev-is-bridged --physdev-out vk63  -s 192.168.50.13   -j Out_RULE_6
    $IPTABLES -A Out_RULE_6  -j LOG  --log-level info --log-prefix "RULE 6 -- DENY "
    $IPTABLES -A Out_RULE_6  -j DROP
    # 
    # Rule 7 (vk64)
    # 
    echo "Rule 7 (vk64)"
    # 
    $IPTABLES -N Out_RULE_7
    $IPTABLES -A FORWARD -m physdev --physdev-is-bridged --physdev-out vk64  -s 192.168.50.14   -j Out_RULE_7
    $IPTABLES -A Out_RULE_7  -j LOG  --log-level info --log-prefix "RULE 7 -- DENY "
    $IPTABLES -A Out_RULE_7  -j DROP
    # 
    # Rule 8 (vk65)
    # 
    echo "Rule 8 (vk65)"
    # 
    $IPTABLES -N Out_RULE_8
    $IPTABLES -A FORWARD -m physdev --physdev-is-bridged --physdev-out vk65  -s 192.168.50.15   -j Out_RULE_8
    $IPTABLES -A Out_RULE_8  -j LOG  --log-level info --log-prefix "RULE 8 -- DENY "
    $IPTABLES -A Out_RULE_8  -j DROP
    # 
    ....
    ....
    # 
    # Rule 11 (vk63)
    # 
    echo "Rule 11 (vk63)"
    # 
    $IPTABLES -N In_RULE_11
    $IPTABLES -A INPUT -m physdev --physdev-in vk63 !  -s 192.168.50.13   -j In_RULE_11
    $IPTABLES -A FORWARD -m physdev --physdev-in vk63 !  -s 192.168.50.13   -j In_RULE_11
    $IPTABLES -A In_RULE_11  -j LOG  --log-level info --log-prefix "RULE 11 -- DENY "
    $IPTABLES -A In_RULE_11  -j DROP
    # 
    # Rule 12 (vk64)
    # 
    echo "Rule 12 (vk64)"
    # 
    $IPTABLES -N In_RULE_12
    $IPTABLES -A INPUT -m physdev --physdev-in vk64 !  -s 192.168.50.14   -j In_RULE_12
    $IPTABLES -A FORWARD -m physdev --physdev-in vk64 !  -s 192.168.50.14   -j In_RULE_12
    $IPTABLES -A In_RULE_12  -j LOG  --log-level info --log-prefix "RULE 12 -- DENY "
    $IPTABLES -A In_RULE_12  -j DROP
    # 
    # Rule 13 (vk65)
    # 
    echo "Rule 13 (vk65)"
    # 
    $IPTABLES -N In_RULE_13
    $IPTABLES -A INPUT -m physdev --physdev-in vk65 !  -s 192.168.50.15   -j In_RULE_13
    $IPTABLES -A FORWARD -m physdev --physdev-in vk65 !  -s 192.168.50.15   -j In_RULE_13
    $IPTABLES -A In_RULE_13  -j LOG  --log-level info --log-prefix "RULE 13 -- DENY "
    $IPTABLES -A In_RULE_13  -j DROP

Ignoring some optimization potential, this is actually what we need. The key point is:
FWbuilder knows about the bridge situation (see below) and creates rules with the options

-m physdev --physdev-in/--physdev-out <device>

The documentation from http://www.fwbuilder.org/4.0/docs/users_guide5/host-interface.shtml says accordingly:

Bridge port: This option is used for a port of a bridged firewall. The compilers skip bridge ports when they pick interfaces to attach policy and NAT rules to. For target firewall platforms that support bridging and require special configuration parameters to match bridged packets, compilers use this attribute to generate a proper configuration. For example, in case of iptables, the compiler uses -m physdev --physdev-in or -m physdev --physdev-out for bridge port interfaces. (This object applies to firewall objects only.)

It requires, however, a special configuration of FWbuilder with respect to the defined interfaces and the bridges on the firewall system – i.e. the virtualization host in our test situation:

[Image: FWbuilder – interface configuration marking vk63, vk64, vk65 as bridge ports]

The same of course for bridge “virbr6”.

Note that our rules (produced by FWbuilder above) for the bridge ports vk63, vk64, vk65 would also work in a port flooding situation – as long as they are not circumvented by other, leading rules. The latter is a point we shall come back to.

What packets do we allow?

On a firewall with a basic DROP policy we of course need to define acceptance conditions for packets, too. Without going into details we need logical rules like:

bridge virbr6 rule :  src 192.168.50.13, dest 192.168.50.15, 192.168.50.14, any ICMP - in via vk63   => ALLOW  

An example is shown here:

[Image: FWbuilder – ICMP acceptance rule for kali3]

    # 
    # Rule 21 (vk63)
    # 
    echo "Rule 21 (vk63)"
    # 
    $IPTABLES -N In_RULE_21
    $IPTABLES -A FORWARD -m physdev --physdev-in vk63 -p icmp  -m icmp  -s 192.168.50.13   -d 192.168.50.1   --icmp-type any  -m state --state NEW  -j In_RULE_21   
    $IPTABLES -A INPUT -m physdev --physdev-in vk63 -p icmp  -m icmp  -s 192.168.50.13   -d 192.168.50.1   --icmp-type any  -m state --state NEW  -j In_RULE_21   
    $IPTABLES -N Cid8093X19506.0
    $IPTABLES -A FORWARD -m physdev --physdev-in vk63 -p icmp  -m icmp  -s 192.168.50.13   --icmp-type any  -m state --state NEW  -j Cid8093X19506.0
    $IPTABLES -A Cid8093X19506.0  -d 192.168.50.12   -j In_RULE_21
    $IPTABLES -A Cid8093X19506.0  -d 192.168.50.14   -j In_RULE_21
    $IPTABLES -A Cid8093X19506.0  -d 192.168.50.15   -j In_RULE_21
    $IPTABLES -A In_RULE_21  -j LOG  --log-level info --log-prefix "RULE 21 -- ACCEPT "
    $IPTABLES -A In_RULE_21  -j ACCEPT

We need of course all variants for all the other bridge interfaces. To make life simpler you could define groups of recipients in a tool like FWbuilder.

Order of our rules

We eventually come to a trivial but important point: In which order must we arrange the discussed iptables DENY and ACCEPT commands? A little thinking shows:

We need the “DENY”-rules first before we allow anything else – i.e. we need the basic DENY rules discussed above as the leading rules in all affected chains!

If a packet is accepted by an earlier rule – e.g. due to some reasonable IN rule – then it is accepted for good; later DENY rules can no longer stop it. To be on the safe side we must, therefore, probe the critical FORWARD rules for unacceptable outgoing and incoming packets over certain bridge ports first.

A really critical aspect in this context is a potentially applied overall acceptance of packets for established connections (connection tracking). For most stateful inspection packet filters the general acceptance of incoming packets for established connections is a default.

E.g., in FWbuilder you have to turn this policy off explicitly, if you do not want to have it. Otherwise, FWbuilder will create general acceptance rules for all 3 basic chains ahead of all other rules:

    # ================ Table 'filter', automatic rules
    # accept established sessions
    $IPTABLES -A INPUT   -m state --state ESTABLISHED,RELATED -j ACCEPT   
    $IPTABLES -A OUTPUT  -m state --state ESTABLISHED,RELATED -j ACCEPT 
    $IPTABLES -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT   

Note that these rules would cover ALL bridges and ALL related interfaces/ports on a virtualization host (global context of acceptance)! This makes such leading rules potentially dangerous on hosts with bridges – both during ARP spoofing attacks and in port flooding situations, as the ports work in promiscuous mode. Be aware of the fact that the attack pattern discussed above could in principle be extended to guests on other bridges on the host, if the attacker knew the relevant IP addresses.

On the other hand, acceptance rules for established connections can really be convenient. My conclusion is: Either you use a set of very general iptables rules that require no connection tracking on the bridge at all – and then your guest systems must establish their own firewalls. Or:

Whatever your FW tool generates: Edit the resulting script and move the acceptance rule for established connections after/below the set of critical “DENY” rules on the bridge interfaces discussed above. In addition, check that the DENY rules themselves really are defined as stateless rules.
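Expressed as a fragment of such an edited script – only a sketch, the rule bodies are merely representative, the point is the ordering:

    # critical stateless DENY rules for the bridge ports come first ...
    $IPTABLES -A FORWARD -m physdev --physdev-is-bridged --physdev-out vk64 ! -d 192.168.50.14 -j DROP
    $IPTABLES -A FORWARD -m physdev --physdev-is-bridged --physdev-in  vk64 ! -s 192.168.50.14 -j DROP
    # ... analogous rules for vk63, vk65 and any border ports ...
    # ... and only afterwards the general acceptance of established connections
    $IPTABLES -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT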

Testing

Let us say kali3 pings kali5 after the ARP poisoning. What can the MiM on “kali4” really see then – if no firewall rules are implemented on the host? As expected: everything.

[Image: Wireshark on kali4 – poisoning packets and redirected ICMP traffic between kali3 and kali5]

You see the poisoning packets and the redirected (duplicated) messages between kali3 and kali5. The same would of course be true for any kind of real TCP/IP communication. So without any measures the MiM can follow all communication after spoofing.

Now let us implement the iptables rules discussed above. In our test case we expect our “rule 3” to block the redirected (misguided) traffic to kali4. And really:

[Image: ping from kali3 to kali5 with the iptables rules active]

And at the same time on the host:

mytux:~/bin # tail -f /var/log/firewall
...
...
2016-02-23T14:41:17.783163+01:00 mytux kernel: [33572.296587] RULE 3 -- DENY IN=virbr6 OUT=virbr6 PHYSIN=vk63 PHYSOUT=vk64 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=837 DF PROTO=ICMP TYPE=8 CODE=0 ID=2401 SEQ=1    
2016-02-23T14:41:18.790152+01:00 mytux kernel: [33573.304717] RULE 3 -- DENY IN=virbr6 OUT=virbr6 PHYSIN=vk63 PHYSOUT=vk64 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=869 DF PROTO=ICMP TYPE=8 CODE=0 ID=2401 SEQ=2   
2016-02-23T14:41:19.798127+01:00 mytux kernel: [33574.313685] RULE 3 -- DENY IN=virbr6 OUT=virbr6 PHYSIN=vk63 PHYSOUT=vk64 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=977 DF PROTO=ICMP TYPE=8 CODE=0 ID=2401 SEQ=3    
 

Good! Defense is obviously possible – even on the IP-level – as soon as we relate bridge ports to IP information!

Stopping ARP spoofing – with potential port flooding on the bridge as an aftermath

At some point in time the MiM attacker may stop his spoofing by

root@kali4:~# killall arpspoof
root@kali4:~# echo 0 > /proc/sys/net/ipv4/ip_forward

Before the poisoning processes terminate they send some packets which try to correct the corrupted ARP information on the attacked guests. However, depending on the load of the guests and the host this correction may fail – on one or both poisoned guests – and the old spoofed information may remain in the guests’ ARP tables:

[Image: kali3 right after the spoofing was stopped – ARP information still poisoned]

And even some seconds later:

[Image: kali3 some seconds later – ARP information still poisoned]

kali3 still thinks that 192.168.50.15 is located at the MAC address of kali4! How long this wrong information is kept depends on the relevant timeout parameter for local ARP table cache entries – see the output of:

$ cd /proc/sys/net/ipv4/neigh/
$ cat default/gc_stale_time

For our Debian guests this parameter typically has a value of 60 secs.

The picture above shows that kali3 first sent 9 ICMP request packets which got no answer. Later on a second series of ping requests works normally again. In this specific test case – with the wrong ARP information remaining on kali3 – actually 2 interesting things happened in parallel:

mytux:~/bin # tail -f /var/log/firewall
2016-02-23T16:18:32.972806+01:00 mytux kernel: [ 1909.777744] RULE 21 -- ACCEPT IN=virbr6 OUT=virbr6 PHYSIN=vk63 PHYSOUT=vk65 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=19648 DF PROTO=ICMP TYPE=8 CODE=0 ID=1373 SEQ=1 
2016-02-23T16:18:32.972820+01:00 mytux kernel: [ 1909.777774] RULE 3 -- DENY IN=virbr6 OUT=virbr6 PHYSIN=vk63 PHYSOUT=vk64 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=19648 DF PROTO=ICMP TYPE=8 CODE=0 ID=1373 SEQ=1 
2016-02-23T16:18:32.972822+01:00 mytux kernel: [ 1909.777785] RULE 21 -- ACCEPT IN=virbr6 OUT=virbr6 PHYSIN=vk63 PHYSOUT=vethb2 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=19648 DF PROTO=ICMP TYPE=8 CODE=0 ID=1373 SEQ=1     
2016-02-23T16:18:32.972823+01:00 mytux kernel: [ 1909.777806] RULE 5 -- DENY IN=virbr4 OUT=virbr4 PHYSIN=vethb1 PHYSOUT=vnet0 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=19648 DF PROTO=ICMP TYPE=8 CODE=0 ID=1373 SEQ=1 
2016-02-23T16:18:32.972824+01:00 mytux kernel: [ 1909.777818] RULE 35 -- DENY IN=virbr4 OUT=virbr4 PHYSIN=vethb1 PHYSOUT=vmw1 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=19648 DF PROTO=ICMP TYPE=8 CODE=0 ID=1373 SEQ=1 
2016-02-23T16:18:32.972825+01:00 mytux kernel: [ 1909.777827] RULE 35 -- DENY IN=virbr4 OUT=virbr4 PHYSIN=vethb1 PHYSOUT=vmh1 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=19648 DF PROTO=ICMP TYPE=8 CODE=0 ID=1373 SEQ=1 
....
.....
2016-02-23T16:18:40.972821+01:00 mytux kernel: [ 1917.786358] RULE 21 -- ACCEPT IN=virbr6 OUT=virbr6 PHYSIN=vk63 PHYSOUT=vk65 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=20666 DF PROTO=ICMP TYPE=8 CODE=0 ID=1373 SEQ=9 
2016-02-23T16:18:40.972847+01:00 mytux kernel: [ 1917.786378] RULE 3 -- DENY IN=virbr6 OUT=virbr6 PHYSIN=vk63 PHYSOUT=vk64 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=20666 DF PROTO=ICMP TYPE=8 CODE=0 ID=1373 SEQ=9 
2016-02-23T16:18:40.972850+01:00 mytux kernel: [ 1917.786385] RULE 21 -- ACCEPT IN=virbr6 OUT=virbr6 PHYSIN=vk63 PHYSOUT=vethb2 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=20666 DF PROTO=ICMP TYPE=8 CODE=0 ID=1373 SEQ=9      
2016-02-23T16:18:40.972852+01:00 mytux kernel: [ 1917.786397] RULE 5 -- DENY IN=virbr4 OUT=virbr4 PHYSIN=vethb1 PHYSOUT=vnet0 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=20666 DF PROTO=ICMP TYPE=8 CODE=0 ID=1373 SEQ=9 
2016-02-23T16:18:40.972854+01:00 mytux kernel: [ 1917.786404] RULE 35 -- DENY IN=virbr4 OUT=virbr4 PHYSIN=vethb1 PHYSOUT=vmw1 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=20666 DF PROTO=ICMP TYPE=8 CODE=0 ID=1373 SEQ=9 
2016-02-23T16:18:40.972856+01:00 mytux kernel: [ 1917.786410] RULE 35 -- DENY IN=virbr4 OUT=virbr4 PHYSIN=vethb1 PHYSOUT=vmh1 MAC=52:54:00:74:60:4a:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=20666 DF PROTO=ICMP TYPE=8 CODE=0 ID=1373 SEQ=9 
.....
2016-02-23T16:19:07.924844+01:00 mytux kernel: [ 1944.768264] RULE 21 -- ACCEPT IN=virbr6 OUT=virbr6 PHYSIN=vk63 PHYSOUT=vk65 MAC=52:54:00:0f:34:4f:52:54:00:9f:5d:c1:08:00 SRC=192.168.50.13 DST=192.168.50.15 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=25179 DF PROTO=ICMP TYPE=8 CODE=0 ID=1376 SEQ=1 
  

Where do the reactions at ports other than vk64 come from? The first part of the explanation is that the bridge temporarily flooded all its ports (vk64, vk65, vethb2) with the ping requests of kali3! This in turn led to local denial reactions on virbr6 and also on our second bridge (virbr4). For the reason of the flooding see below.

The second strange thing is that during each of the nine ping trials a successful packet submission occurs through port vk65 – but there is no log entry for an answer packet. Why is this?

Port flooding means that packets are copied and submitted over all bridge ports other than the one through which the packet came in – in our case including their wrong destination MAC addresses. The bridge “hopes” for an answer of the addressed MAC at one of the ports. But is this going to happen in our test situation – in which kali3 still sends its requests to the wrong MAC of kali4?

No – because despite the flooding and the acceptance for transport over port vk65, kali5 rightly ignores the copied packets due to their wrong destination MAC. On the other side kali4 will not receive anything due to the iptables rules and cannot react either. So, we end up in a situation where ICMP request packets are sent by kali3 – but no answer returns from any bridge port. This in turn means that the bridge does not learn what it needs to learn to stop the flooding. This situation will remain at least until the ARP cache on kali3 is corrected/updated.

So only with a subsequent new ping series – after the ARP table of kali3 has been updated – will everything work again as expected.

Addendum, 24.02.2016:

Some reader asked me via mail to explain why flooding occurred at all. This is a good question – and I have therefore added relevant remarks to the text above. Due to our limited “setageing” parameter some “guest MAC – port” relation may be deleted from the bridge’s forward database (FDB) after some time. (In addition we may have impacts of the STP protocol.) With our setageing parameter and the active iptables rules kali5 will drop out of the FDB pretty soon (after 30 secs): the original pinging during the attack situation does not reach kali5, and kali5 otherwise remains passive. However, kali4 also drops out of the FDB 30 secs after the spoofing attack is stopped. So, we may reach a situation where kali3 still has the wrong ARP information, but kali4’s MAC is no longer in the FDB. We end up in a kind of race condition between the timeouts of the bridge’s FDB and the ARP cache renewal on the guests.

Due to the fact that either of the spoofed guests may still have wrong ARP information after the spoofing was stopped by the attacker, various strange situations may occur. kali3 may have the right ARP information, but kali5 not yet. Then answering packets may be created which try to reach kali4 instead of kali3. Such packets must not be allowed by any acceptance rules (including rules for established connections) – hence again: we need the DENY rules first.

What have we learned from all this?

  1. The stop of the ARP spoofing can leave the bridge and some of the guests in an unconsolidated state for some time – despite a few final packets from the attacker system which try to restore the ARP information on the attacked systems. One or two of the attacked guests and the host may still keep wrong entries in their ARP table(s). The duration of such a situation depends on the local timeout parameter for the ARP caching table entries on the guest systems.
  2. With a limited “setageing” parameter of the bridge, port flooding is not improbable during a period after the end of an ARP spoofing attack. As a consequence, the firewall rules must prevent the consequences of port flooding, too. Therefore, we need to take care not only of guest ports, but also of border ports which lead to segregated parts of the net or to other bridges.
  3. In the course of port flooding, response packets may be directed to wrong recipients. This status will remain until the ARP tables of the guests are updated. During such a phase the defined DENY rules must be probed before any kind of acceptance rules.
  4. Regarding the competition of the different timeouts on ARP caching tables and bridge FDBs: A conclusion in case of relatively stable guest-port relations might be to set the FDB timeout (setageing parameter of the bridge) to reasonably large values (in the range of a few minutes) to avoid flooding situations. On the other side the timeout for local ARP caching could be reduced as long as this does not create unreasonable ARP traffic. A small sketch of both adjustments follows below.
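A minimal sketch of such a tuning – the concrete values are illustrative assumptions, not recommendations:

mytux:~ # brctl setageing virbr6 300                           # FDB timeout on the bridge: 5 minutes
root@kali3:~# sysctl -w net.ipv4.neigh.eth0.gc_stale_time=30   # shorter ARP cache staleness on a guest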

What about TCP/IP packets?

If we think a bit about the general rules discussed above, we may understand that they would also work for standard TCP packets of arbitrary higher-level protocols. Actually, we have defined the leading denial rules for wrong “IP/port/direction” associations without any reference to a specific protocol. So, our rules should hold in the general case, too. The reader may test this by configuring one of the guests as a web server or by using “netcat” to set up a simple server on one of the bridge guests – as sketched below.
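A quick test along these lines – only a sketch; the port number is arbitrary and the nc options may differ slightly between netcat variants:

root@kali5:~# nc -l -p 8080                                # simple TCP listener on kali5 (netcat-traditional syntax)
root@kali3:~# echo "hello kali5" | nc 192.168.50.15 8080
# with the DENY rules active, traffic redirected towards kali4 (port vk64) should again show up in
# /var/log/firewall as dropped, while the direct connection via vk65 is accepted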

We shall investigate a related full TCP scenario in one of the next posts – where we shall follow packets across 2 bridges and to the host. So, be patient, if you do not want to perform experiments, yourself.

Summary

Obviously netfilter iptables rules cannot prevent ARP spoofing and the resulting “man in the middle” attack trials on virtual guest systems attached to Linux bridges of a virtualization host. However, properly designed iptables rules can intercept and interrupt the redirected traffic which a MiM system attached to the bridge wants to provoke.

Appropriate iptables rules testing predefined IP-port relations on bridges may therefore supplement and accompany additional measures on the ebtables/arptables level of netfilter. However, such rules should not be undermined by leading acceptance rules related to connection tracking.

Even an already stopped ARP spoofing attack may leave the bridge and its guests in an unconsolidated status for a while. In addition, flooding of packets to all bridge ports may occur. Appropriate denial rules for guest ports and Ethernet border ports in STP situations must block the resulting improper traffic. The reduction of flooding situations may require an adaptation of the “setageing” parameter to reasonably large values for predictably stable configurations of guests on a bridge.

Most important: General acceptance rules for established connections should only be applied in the sequence of firewall rules AFTER all critical (denial) rules regarding unacceptable traffic across certain ports have been tested for incoming/outgoing packets. This may require explicit changes of the scripts created by Firewall tools like FWbuilder.

A significant problem is the requirement that the association of IP addresses and ports must be known or determined at the time of the definition and/or application of the filter rules. This requires persistent port naming techniques and under certain circumstances also persistent MAC distribution techniques plus DHCP restrictions for the guests within the used virtualization environment.

In the next post of this series
Linux bridges – can iptables be used against MiM attacks based on ARP spoofing ? – II
we discuss how we can extend our rules to scenarios with multiple bridges on one host – and discover that we need a special treatment of packets crossing bridge borders.

 

Fun with veth devices, Linux virtual bridges, KVM, VMware – attach the host and connect bridges via veth

Typically, virtual “veth” Ethernet devices are used for connecting virtual containers (such as LXC) to virtual bridges like an OpenVswitch. But, due to their pair nature, “veth” devices promise flexibility also in other, much simpler contexts of virtual network construction. Therefore, the objective of this article is to experiment a bit with “veth” devices as tools to attach the virtualization host itself or other (virtual) devices like a secondary Linux bridge or a VMware bridge to a standard Linux bridge – and thus enable communication with and between virtualized guest systems.

Motivation

I got interested in “veth”-devices when trying to gain flexibility for quickly rebuilding and rearranging different virtual network configurations in a pen-testing lab on Linux laptops. For example:

  • Sometimes you strongly wish to avoid giving a Linux bridge itself an IP. Assigning an IP to a Linux bridge normally enables host communication with KVM guests attached to the bridge. However, during attack simulations across the bridge the host gets very exposed. In my opinion the host can better and more efficiently be protected by packet filters if it communicates with the bridge guests over a special “veth” interface pair which is attached to the bridge. In other test or simulation scenarios one may rather wish to connect the host like an external physical system to the bridge – i.e. via a kind of uplink port.
  • There are scenarios for which you would like to couple two bridges, each with virtual guests, to each other – and make all guests communicate with each other and the host. Or establish communication from a guest of one Linux bridge to VMware guests of a VMware bridge attached to yet another Linux bridge. In all these situations all guests and the host itself may reside in the same logical IP network segment, but in segregated parts. In the physical reality admins may have used such a segregation for improving performance and avoiding an overload of switches.
  • In addition one can solve some problems with “veth” pairs which otherwise would get complicated. One example is avoiding the assignment of an IP address to a special enslaved ethernet device representing the bridge for the Linux system. Both libvirt’s virt-manager and VMware WS’s “network editor” automatically perform such an IP assignment when creating virtual host-only-networks. We shall come back to this point below.

As a preparation let us first briefly compare “veth” with “tap” devices and summarize some basic aspects of Linux bridges – all according to my still limited understanding. Afterwards, we shall realize a simple network scenario for training purposes.

vtap vs veth

A virtual “tap” device is a single point-to-point device which can be used by a program in user space or a virtual machine to send Ethernet packets on layer 2 directly to the kernel or to receive packets from it. A file descriptor (fd) is read/written during such a transmission. KVM/qemu virtualization uses “tap” devices to equip virtualized guest systems with a virtual and configurable Ethernet interface – which then interacts with the fd. On the other side, a tap device can be attached to a virtual Linux bridge; the kernel handles the packet transfer as if it occurred over a virtual bridge port.

“veth” devices are instead created as pairs of connected virtual Ethernet interfaces. These 2 devices can be imagined as being connected by a network cable; each veth device of a pair can be attached to a different virtual entity such as an OpenVswitch bridge, an LXC container or a standard Linux bridge. veth pairs are ideal to connect virtual devices to each other.

While KVM does not support veth devices directly as guest interfaces, a KVM guest can be attached to a veth device via macvtap/macvlan (see https://seravo.fi/2012/virtualized-bridged-networking-with-macvtap).
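A purely hypothetical sketch of what such an attachment might look like on the host – the device names are illustrative and this variant is not used in the scenario below:

mytux:~ # ip link add dev vethk1 type veth peer name vethk2                  # illustrative veth pair
mytux:~ # ip link add link vethk2 name macvtap0 type macvtap mode bridge     # macvtap on top of one veth end
mytux:~ # ip link set vethk1 up; ip link set vethk2 up; ip link set macvtap0 up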

In addition, VMware’s virtual networks can be bridged to a veth device – as we shall show below.

Aspects and properties of Linux bridges

Several basic aspects and limitations of standard Linux bridges are noteworthy:

  • A “tap” device attached to one Linux bridge cannot be attached to another Linux bridge.
  • All attached devices are switched into the promiscuous mode.
  • The bridge itself (not a tap device at a port!) can get an IP address and may work as a standard Ethernet device. The host can communicate via this address with other guests attached to the bridge.
  • You may attach several physical Ethernet devices (without IP !) of the host to a bridge – each as a kind of “uplink” to other physical switches/hubs and connected systems. With the spanning tree protocol activated all physical systems attached to the network behind each physical interface may communicate with physical or virtual guests linked to the bridge by other physical interfaces or virtual ports.
  • Properly configured the bridge transfers packets directly between two specific bridge ports related to the communication stream of 2 attached guests – without exposing the communication to other ports and other guests. The bridge may learn and update the relevant association of MAC addresses to bridge ports.
  • The virtual bridge device itself – in its role as an Ethernet device – does not work in promiscuous mode. However, packets arriving through one of its ports for (yet) unknown addresses may be flooded to all ports.
  • You cannot bridge a Linux bridge directly by or with another Linux bridge (no Linux bridge cascading). Nor can you connect a Linux bridge to another Linux bridge via a “tap” device.

In combination with VMware (on a Linux host) some additional aspects are interesting:

  • A virtual Linux bridge in its role as an Ethernet device can be bridged by non-native Linux bridges – e.g. by VMware bridges – and thereby be switched into promiscuous mode. The VMware (master) bridge then uses a Linux bridge as an attached (slave) device. This type of bridge cascading may have security impacts: packets arriving via a physical port at the Linux bridge and being destined to VMware guests connected to their VMware master bridge may become visible at the Linux bridge ports. See:
    VMware WS – bridging of Linux bridges and security implications
  • The “vmnet”-Ethernet device related to a VMware bridge on a Linux host can be attached (without an IP-address) to a Linux bridge thus enabling communication between VMware guests attached to a VMware bridge and KVM guests connected to the Linux bridge. However, as this is an uplink like situation we must get rid of any IP address assigned to the “vmnet”-Ethernet device.

A test scenario

    I want to realize the following test scenario with the help of veth-pairs:

    Our virtual network shall contain two coupled Linux bridges, each with a KVM guest. The host “mytux” shall be attached via a regular bridge port to only one of the bridges. In addition we want to connect a VMware bridge to one of the Linux bridges. All KVM/VMware guests shall belong to the same logical layer 3 network segment and be able to communicate with each other and the host (plus external systems via routing).

    [Image: test scenario – two linked Linux bridges with KVM guests, host attachment and a VMware bridge connected via veth pairs]

    The RJ45 like connectors in the picture above represent veth-devices – which occur in pairs. The blue small rectangles on the Linux bridges instead represent ports associated with virtual tap-devices. I admit: This scenario of a virtual network inside a host is a bit academic. But it allows us to test what is possible with “veth”-pairs.

    Building the bridges

    On our Linux host we use virt-manager’s “connection details >> virtual networks” to define 2 virtual host only networks with bridges “virbr4” and “virbr6”.

    [Image: virt-manager – definition of the virtual host-only networks for virbr4 and virbr6]

    Note: We do not allow for bridge specific “dhcp-services” and do not assign network addresses. We shall later configure addresses of the guests manually; you will find some remarks on a specific, network wide DHCP service at the end of the article.

    Then we implement and configure 2 KVM Linux guests (here Kali systems) – one with an Ethernet interface attached to “virbr4”; the other guest will be connected to “virbr6”. The next picture shows the network settings for guest “kali3” which gets attached to “virbr6”.

    [Image: virt-manager – network settings of guest kali3 (attached to virbr6)]

    We activate the networks and boot our guests. Then on the guests (activate the right interface and deactivate other interfaces, if necessary) we need to set IP addresses: The interfaces on kali2 and kali3 must be configured manually – as we had not activated DHCP. kali2 gets the address “192.168.50.12”, kali3 the address “192.168.50.13” – for instance as sketched below.
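    A minimal sketch of the manual configuration on kali3 – the interface name eth0 is an assumption taken from this setup:

    root@kali3:~# ip addr add 192.168.50.13/24 brd 192.168.50.255 dev eth0
    root@kali3:~# ip link set eth0 up
    root@kali3:~# ip addr show eth0        # verify the address assignment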

    [Image: manual IP configuration of the guest interface]

    If we had defined several tap interfaces on our guest system kali3 we might have had a problem identifying the right interface associated with the bridge. It can, however, be identified by its MAC and a comparison with the MACs of the “vnet” devices in the output of the commands “ip link show” and “brctl show virbr6”.

    Now let us look what information we get about the bridges on the host :

    mytux:~ # brctl show virbr4
    bridge name     bridge id               STP enabled     interfaces
    virbr4          8000.5254007e553d       yes             virbr4-nic
                                                            vnet6
    mytux:~ # ifconfig virbr4 
    virbr4    Link encap:Ethernet  HWaddr 52:54:00:7E:55:3D  
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
    mytux:~ # ifconfig vnet6
    vnet6     Link encap:Ethernet  HWaddr FE:54:00:F2:A4:8D  
              inet6 addr: fe80::fc54:ff:fef2:a48d/64 Scope:Link
    ....
    mytux:~ # brctl show virbr6
    bridge name     bridge id               STP enabled     interfaces
    virbr6          8000.525400c0b06f       yes             virbr6-nic
                                                            vnet2
    mytux:~ # ip addr show virbr6 
    22: virbr6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
        link/ether 52:54:00:c0:b0:6f brd ff:ff:ff:ff:ff:ff
    mytux:~ # ifconfig vnet2
    vnet2     Link encap:Ethernet  HWaddr FE:54:00:B1:5D:1F  
              inet6 addr: fe80::fc54:ff:feb1:5d1f/64 Scope:Link
    .....
    mytux:~ # 
    

     
    Note that we do not see any IPv4 information on the “tap” devices vnet6 and vnet2 here. But note, too, that no IP address has been assigned by the host to the bridges themselves.

    Ok, we have bridges virbr4 with guest “kali2” and a separate bridge virbr6 with KVM guest “kali3”. The host has no
    role in this game, yet. We are going to change this in the next step.

    Note that virt-manager automatically started the bridges when we started the KVM guests. Alternatively, we could have brought them up manually:

    mytux:~ # ip link set virbr4 up
    mytux:~ # ip link set virbr6 up

    We may also configure the bridges with “virt-manager” to be automatically started at boot time.

    Attaching the host to a bridge via veth

    According to our example we now attach the host to virbr4 by means of a veth pair. We create such a pair and connect one of its Ethernet interfaces to “virbr4”:

    mytux:~ # ip link add dev vmh1 type veth peer name vmh2       
    mytux:~ # brctl addif virbr4 vmh1
    mytux:~ # brctl show virbr4 
    bridge name     bridge id               STP enabled     interfaces
    virbr4          8000.5254007e553d       yes             virbr4-nic
                                                            vmh1
                                                            vnet6
    

     
    Now, we assign an IP address to interface vmh2 – which is not enslaved by any bridge:

    mytux:~ # ip addr add 192.168.50.1/24 broadcast 192.168.50.255 dev vmh2
    mytux:~ # ip addr show vmh2
    6: vmh2@vmh1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
        link/ether 42:79:e6:a7:fb:09 brd ff:ff:ff:ff:ff:ff
        inet 192.168.50.1/24 brd 192.168.50.255 scope global vmh2
           valid_lft forever preferred_lft forever
    

     
    We then activate vmh1 and vmh2. Next we need a route on the host to the bridge (and the guests at its ports) via vmh2 (!!):

    mytux:~ # ip  link set vmh1 up
    mytux:~ # ip  link set vmh2 up
    mytux:~ # route add -net 192.168.50.0/24 dev vmh2
    mytux:~ # route
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    default         ufo             0.0.0.0         UG    0      0        0 br0
    192.168.10.0    *               255.255.255.0   U     0      0        0 br0
    ...
    192.168.50.0    *               255.255.255.0   U     0      0        0 vmh2
    ...
    

     
    Now we try whether we can reach guest “kali2” from the host and vice versa:

    mytux:~ # ping 192.168.50.12
    PING 192.168.50.12 (192.168.50.12) 56(84) bytes of data.
    64 bytes from 192.168.50.12: icmp_seq=1 ttl=64 time=0.291 ms
    64 bytes from 192.168.50.12: icmp_seq=2 ttl=64 time=0.316 ms
    64 bytes from 192.168.50.12: icmp_seq=3 ttl=64 time=0.322 ms
    ^C
    --- 192.168.50.12 ping statistics ---
    3 packets transmitted, 3 received, 0% packet loss, time 1999ms
    rtt min/avg/max/mdev = 0.291/0.309/0.322/0.024 ms
    
    root@kali2:~ # ping 192.168.50.1
    PING 192.168.50.1 (192.168.50.1) 56(84) bytes of data.
    64 bytes from 192.168.50.1: icmp_seq=1 ttl=64 time=0.196 ms
    64 bytes from 192.168.50.1: icmp_seq=2 ttl=64 time=0.340 ms
    64 bytes from 192.168.50.1: icmp_seq=3 ttl=64 time=0.255 ms
    ^C
    --- 192.168.50.1 ping statistics ---
    3 packets transmitted, 3 received, 0% packet loss, time 1998ms
    rtt min/avg/max/mdev = 0.196/0.263/0.340/0.062 ms
    

     
    So, we have learned that the host can easily be connected to a Linux bridge via a veth pair – and that we do not need to assign an IP address to the bridge itself. Regarding the connection links the resulting situation is very similar to bridges where you use a physical “eth0” NIC as an uplink to external systems of a physical network.

    All in all I like this situation much better than having a bridge with an IP. During critical penetration tests we can now simply unplug vmh1 from the bridge. And regarding packet filters: We do not need to establish firewall rules on the bridge itself – which has security implications if only done on layer 3 – but on an “external” Ethernet device. Note also that the interface “vmh2” could directly be bridged by VMware (if you have more trust in VMware bridges) without producing the guest isolation problems described in a previous article (quoted above).

    Linking of two Linux bridges with each other

    Now, we try to create a link between our 2 Linux bridges. As Linux bridge cascading is forbidden, it is interesting to find out whether at least bridge linking is allowed. We use an additional veth-pair for this purpose:

    mytux:~ # ip link add dev vethb1 type veth peer name vethb2       
    mytux:~ # brctl addif virbr4 vethb1
mytux:~ # brctl addif virbr6 vethb2
    mytux:~ # brctl show virbr4
    bridge name     bridge id               STP enabled     interfaces
    virbr4          8000.5254007e553d       yes             vethb1
                                                            virbr4-nic
                                                            vmh1
                                                            vnet6
    mytux:~ # brctl show virbr6
    bridge name     bridge id               STP enabled     interfaces
    virbr6          8000.2e424b32cb7d       yes             vethb2
                                                            virbr6-nic
                                                            vnet2
    
    
    mytux:~ # ip link set vethb1 up 
    mytux:~ # ip link set vethb2 up 
    

     
Note that the STP protocol is enabled on both bridges! (If you see something different, you can activate STP manually via the brctl command – see the sketch directly below.)
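
Should STP be off on one of the bridges, it can be switched on with a single command per bridge; a quick sketch:

mytux:~ # brctl stp virbr4 on
mytux:~ # brctl stp virbr6 on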

Now, can we communicate from “kali3” at “virbr6” over the veth-pair and “virbr4” with the host?
[Please check the routes on all involved machines for reasonable entries first and correct them if necessary; one never knows …]

    veth10
    and
    veth11

    Yes, obviously we can – and also the host can reach the virtual guest kali3.

    mytux:~ # ping -c4 192.168.50.13
    PING 192.168.50.13 (192.168.50.13) 56(84) bytes of data.
    64 bytes from 192.168.50.13: icmp_seq=1 ttl=64 time=0.259 ms
    64 bytes from 192.168.50.13: icmp_seq=2 ttl=64 time=0.327 ms
    64 bytes from 192.168.50.13: icmp_seq=3 ttl=64 time=0.191 ms
    64 bytes from 192.168.50.13: icmp_seq=4 ttl=64 time=0.287 ms
    
    --- 192.168.50.13 ping statistics ---
    4 packets transmitted, 4 received, 0% packet loss, time 2998ms
    rtt min/avg/max/mdev = 0.191/0.266/0.327/0.049 ms
    

     
    and of course
    veth12

    This was just another example of how we can use veth-pairs. We can link Linux bridges together – and all guests at both bridges are able to communicate with each other and with the host. Good !

    Connecting a virtual VMware bridge to a Linux bridge via a veth-pair

Our last experiment involves a VMware WS bridge. We could use the VMware Network Editor to define a regular “VMware Host Only Network”. However, the bridge for such a network would automatically be created together with an associated, enslaved Ethernet device for and on the host. And the bridge itself would automatically get an IP address – namely 192.168.50.1. There is no way known to me to avoid this – we would need to eliminate this address manually afterward.

    So, we take a different road:
    We first create a pair of veth devices – and then bridge (!) one of these veth devices by VMware:

    mytux:~ # ip link add dev vmw1 type veth peer name vmw2
mytux:~ # brctl addif virbr4 vmw1
    mytux:~ # ip link set vmw1 up
    mytux:~ # ip link set vmw2 up
    mytux:~ # /etc/init.d/vmware restart
    

     
To create the required VMware bridge to vmw2 we use VMware’s “Virtual Network Editor”:

    veth13

Note that by creating a specific bridge to one of the veth devices we have avoided the automatic IP address assignment (192.168.50.1) to a host Ethernet device which VMware would normally create together with a host-only bridge. Thus we avoid any conflict with the address already assigned to “vmh2” (see above).

In our VMware guest (here a Windows system) we configure the network device – e.g. with the address 192.168.50.21 – and then try our luck:

    veth14

    Great! What we expected! Of course our other virtual clients and the host can also send packets to the VMware guest. I need not show this here explicitly.

    Summary

    veth-pairs are easy to create and to use. They are ideal tools to connect the host and other Linux or VMware bridges to a Linux bridge in a well defined way.

    A remark on DHCP

Reasonable and precisely defined address assignment to the bridges and/or virtual interfaces can become a problem with VMware as well as with KVM/virt-manager or virsh – especially when you want to avoid address assignment to the bridges themselves. Typically, when you define virtual networks in your virtualization environment, a bridge is created together with an attached Ethernet interface for the host – which you may not really need. If you in addition enable DHCP functionality for the bridge/network, the bridge itself (or the related device) will inevitably and automatically get an address like 192.168.50.1. Furthermore, related host routes are set automatically. This may lead to conflicts with what you really want to achieve.

Therefore: if you want to work with DHCP, I advise you to run a central DHCP service on the Linux host and not to use the DHCP services of the various virtualization environments. If you in addition want to avoid assigning IP addresses to the bridges themselves, you may need to work with DHCP pools and groups. This is beyond the scope of this article – though interesting in itself. An alternative would, of course, be to set up the whole virtual network with the help of a script, which may (with a little configuration work) be included as a unit into systemd.
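
Just to illustrate the idea of a central DHCP service on the host: with dnsmasq one could serve the 192.168.50.0/24 segment over the un-enslaved veth end “vmh2”. A minimal sketch – the file name and the address range are only examples, and it presumes that your dnsmasq installation reads “/etc/dnsmasq.d”:

mytux:~ # cat /etc/dnsmasq.d/virt-net50.conf
# answer DHCP requests only on the host-side veth interface vmh2
interface=vmh2
bind-interfaces
# pool that does not collide with the statically configured guest addresses
dhcp-range=192.168.50.100,192.168.50.200,12h
mytux:~ # systemctl restart dnsmasq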

    Make veth settings persistent

Here we have a bit of a problem with Opensuse 13.2/Leap 42.1! The reason is that systemd in Leap and OS 13.2 is version 210 and does not yet contain the service “systemd-networkd.service” – which would support the creation of virtual devices like veth-pairs during system startup. To my knowledge neither the “wicked” service used by Opensuse nor the “ifcfg-…” files allow for the definition of veth-pairs, yet. Bridge creation and address assignment to existing Ethernet devices are, however, supported. So, what can we do to make things persistent?

Of course, you can write a script that creates and configures all of your required veth-pairs. This script could be integrated into the boot process as a systemd service that is started before the “wicked.service” – see the sketch below. In addition you may configure the then existing Ethernet devices with “ifcfg-…”-files. Such files can also be used to guarantee an automatic setup of Linux bridges and their enslavement of defined Ethernet devices.
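
A minimal sketch of such a service unit – the unit name and the script path are, of course, just assumptions:

mytux:~ # cat /etc/systemd/system/veth-setup.service
[Unit]
Description=Create veth pairs for virtual bridges
# the devices must exist before wicked configures the network
Before=wicked.service
After=local-fs.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/bin/create_veth_pairs.sh

[Install]
WantedBy=multi-user.target
mytux:~ # systemctl enable veth-setup.service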

Another option is – if you dare to take some risks – to fetch systemd version 224 from Opensuse’s Tumbleweed repository. Then you may create a directory “/etc/systemd/network” and configure the creation of veth-pairs via corresponding “….netdev”-files in that directory. E.g.:

    mytux # cat veth1.netdev 
    [NetDev]
    Name=vmh1
    Kind=veth
    [Peer]
    Name=vmh2
    

     
    I tried it – it works. However, systemd version 224 has trouble with the rearrangement of Leap’s apparmor startup. I have not looked at this in detail, yet.
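
Note that the “.netdev” files are only evaluated if the networkd service itself is active; with the Tumbleweed systemd this means something like:

mytux:~ # systemctl enable systemd-networkd.service
mytux:~ # systemctl start systemd-networkd.service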

    Nevertheless, have fun with veth devices in your virtual networks !

    VMware WS – bridging of Linux bridges and security implications

Virtual bridges must be treated with care when security aspects become important. I stumbled into an unexpected kind of potential security issue when I experimented with a combination of KVM and VMware Workstation on one and the same Linux host. I admit that the studied scenario was a very special and academic one, but it gave me a real impression of the threat that some programs may change important, security relevant bridge parameters on a Linux system in the background – and that you as an admin may only become aware of the change and its security consequences indirectly.

    The scenario starts with a virtual Linux bridge “br0” (created with “brctl”). This Linux bridge gets an IP-address and uses an assigned physical NIC for direct bridging to a physical LAN. You may want to read my previous blog article
    Opensuse – manuelles Anlegen von Bridge to LAN Devices (br0, br1, …) für KVM Hosts
    for more information about this type of direct bridging.

In our scenario the Linux bridge itself then gets enslaved as an Ethernet-capable device by a VMware bridge. See also: Opensuse/Linux – KVM, VMware WS – 3 virtuelle Brücken zwischen den Welten (see the detailed description of “Lösungsansatz 2”).

    Under which circumstances may such a complicated arrangement be interesting or necessary?

    Direct attachment of virtualization guests to physical networks

    A Linux host-system for virtualization may contain KVM guests as well as VMware Workstation [WS] guests. A simple way to attach a virtual guest to some physical LAN of the Linux host (without routing) is to directly “bridge” a physical device of the host – as e.g. “enp8s0” – and then attach the guest to the virtual bridge. Related methods are available both for KVM and VMware. (We do not look at routing models for the communication of virtual guests with physical networks in this article).

However, on a host on which you started working with KVM before you began using VMware, you may already have bridged the physical device with a standard Linux bridge “br0” before you begin with the implementation of VMware guests.

Opensuse, e.g., automatically sets up Linux bridges for all physical NICs when you configure the host for virtualization with YaST2. Or you yourself may have configured the Linux bridge and attached both the physical device and virtual “tap” devices via the “brctl” and “tunctl” commands – a rough command example follows below. Setting up KVM guests via virt-manager may also have resulted in the attachment of further virtual NIC (tap) devices to the bridge.
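
A rough example of such a manual setup – device and user names are only placeholders:

mytux:~ # brctl addbr br0
mytux:~ # brctl addif br0 enp8s0
mytux:~ # tunctl -t tap0 -u myuser       # persistent tap device for a guest
mytux:~ # brctl addif br0 tap0
mytux:~ # ip link set br0 up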

    The following sketch gives you an idea about a corresponding scenario:

    kvm_vmware_b2b_bridge_4

    Ignore for a moment the upper parts and the displayed private virtual networks there. In the lower part you recognize our Linux host’s “direct bridge to LAN” in grey color with a red stitched border line. I have indicated some ports in order to visualize the association with assigned (named) virtual and physical network devices.

    The bridge “br0” plays a double role: On one side it provides all the logic for packet forwarding to its target ports; on the other side it delivers the packets meant for the host itself to the Linux-kernel as if the bridge were a normal ethernet device. This is not done via an additional tap device of the host but directly. To indicate the difference “br0” is not sketched as a port.

The virtual and physical devices are also visible, e.g., in the output of CLI commands like “ifconfig”, “ip” or “wicked” (with listing options) as soon as the guest systems are started. The command “brctl show br0” would in addition tell you which devices are enslaved (via virtual ports) by the bridge “br0”.

    Note that the physical device has to operate in the so called “promiscuous mode” in this scenario and gets no IP-address.

    Bridging the Linux bridge by VMware

Under such conditions VMware still offers you an option to bridge the physical device “enp8s0” – but after some tests you will find out that you are not able to transmit anything across the NIC “enp8s0”, because it is already enslaved by your Linux bridge … Now you may think: let us create an additional Linux tap device on the host, add it to bridge “br0” and then set up a VMware network bridged to the new tap device. However, I never succeeded with bridging from VMware directly to a Linux tap device. (If you know how to do this, send me an email …)

    There are 2 other possibilities for directly connecting your VMware guests (without routing) to the LAN. One is to “bridge the Linux bridge br0” – by administrative means of VMware WS. The other requires a direct connection of the VMware switch via its Host Interface to the Linux-bridge with the spanning tree protocol enabled. We only look at the solution based on “bridging the bridge” in this article.

We achieve a cascaded bridge configuration, in which the VMware switch enslaves the Linux bridge, via VMware’s “Virtual Network Editor”:

    vmware_vne_8

Such a solution works quite well. The KVM guests can communicate with the physical LAN as well as with the VMware guests, as long as all guests’ NICs are configured to be part of the same network segment. And the VMware guests reach the physical LAN as well as the KVM guests.

    The resulting scenario is displayed in the following sketch:

    kvm_vmware_b2b_bridge_5

    In the lower part you recognize the “cascaded bridging”: The ethernet device corresponding to the bridge “br0” is enslaved by the VMware bridge “vmnet3” in the example. The drawing is only schematic – I do not really know how the “bridging of the bridge” is realized internally.

    Interestingly enough the command “brctl” on the Linux side does NOT allow for a similar type of cascaded “bridging of a Linux bridge”. You cannot attach a Linux bridge to a Linux bridge port there.

    We shall see that there is a good reason for this (maybe besides additional kernel module aspects and recursive stack handling).

    Basic KVM guest “isolation” on a Linux bridge ?

A physical IEEE 802.1D bridge/switch may learn which MAC addresses are reachable through which port, keep this information in an internal table and forward packets directly between ports without flooding them to all ports. Is there something similar for virtual Linux bridges? The Linux bridge code implements a subset of the ANSI/IEEE 802.1d standard – see e.g. http://www.linuxfoundation.org/ collaborate/ workgroups/ networking/ bridge#What_does_ a_bridge_do.3F.
So, yes, in a way: there is a so-called “ageing” parameter of the bridge. If you set the ageing time to “0” by

brctl setageing br0 0

this setting brings the bridge into a “hub”-like mode – all accepted packets are sent to all virtual ports – and a privileged user of a KVM guest may read all packets destined for all guests as well as for the LAN/WAN as soon as he switches the guest’s Ethernet device into the “promiscuous mode”.

However, if you set the ageing parameter to a reasonable value like “30” or “40”, then the bridge works in a kind of “switch” mode which isolates e.g. KVM guests attached to it from each other. The bridge then keeps track of the MAC addresses attached to its virtual ports and forwards packets accordingly and exclusively (see the man pages of “brctl”). Later on in this article we shall prove this by means of a packet sniffer. (We assume normal operation here – it is clear that a virtual bridge, too, can be attacked by methods of ARP spoofing and/or ARP flooding.)
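
By the way: which MAC/port associations the bridge has actually learned can be checked directly on the host; a quick look:

mytux:~ # brctl showmacs br0

With an ageing value > 0 the MAC addresses of active guests should appear there with ageing timers below the configured limit.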

    Now, let us assume a situation with

brctl setageing br0 40

    and let our Linux bridge be bridged by VMware. If I now asked you whether a KVM guest could listen to the data traffic of a VMware guest to the Internet, what would you answer?

    What does “wireshark” tell us about KVM guest isolation without VMware started?

    Let us first look at a situation where you have 2 KVM guests and VMware deactivated by

    /etc/init.d/vmware stop

    KVM guest 1 [kali2] may have an address of 192.168.0.20 in our test scenario, guest 2 [kali3] gets an address of 192.168.0.21. Both guests are attached to “br0” and can communicate with each other:

    kvm_ne_12

    kvm_ne_13

    We first set explicitly

    brctl setageing br0 30

    on the host. Does KVM guest 1 see the network traffic of KVM guest 2 with Internet servers?

    To answer this question we start “wireshark” on guest “kali3”, filter for packets of guest “kali2” and first look at ping traffic directly sent to “kali3”:
    kvm_ne_14

Ok, as expected. Now, if we keep up packet tracking on kali3 and open a web page with “iceweasel” on kali2, we will not see any new packets in the wireshark window. This is the expected result. (Though it cannot be shown here, as it is difficult to visualize a non-appearance – you have to test it yourself.) The Linux virtual bridge works more or less like a switch and directs the Internet traffic of kali2 directly and exclusively to the “enp8s0” port for the real Ethernet NIC of the host. And incoming packets for kali2 are forwarded directly and exclusively from enp8s0 to the port for the vnet device used by guest kali2. Thus, no traffic between guest “kali2” and a web server on the Internet can be seen on guest “kali3”.
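
The same check can be made without a GUI; a small sketch with tcpdump on “kali3” – assuming its NIC appears as “eth0” inside the guest and filtering for kali2’s traffic while ignoring our own pings:

root@kali3:~ # tcpdump -ni eth0 host 192.168.0.20 and not icmp

With ageing > 0 this should stay silent while kali2 loads web pages.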

    But now let us change the ageing-parameter:

    brctl setageing br0 0

    and reload our web page on kali2 again:

    kvm_ne_15

    Then we, indeed, see a full reaction of wireshark on guest kali3:
    kvm_ne_16

All packets to and from the server are visible! Note that we have not discussed any attack vectors for packet sniffing here. We just discussed the effects of a special setting for the Linux bridge.

Intermediate result: Setting the ageing parameter (to a value > 0) on a Linux bridge helps to isolate the KVM guests from each other.

    Can we see an Internet communication of a VMware guest on a KVM guest?

    We now reset the ageing parameter of the bridge and start the daemons for VMware WS on our Opensuse host:

     
    mytux:~ # brctl setageing br0 30 
    mytux:~ # /etc/init.d/vmware start
    Starting VMware services:                                                                 
       Virtual machine monitor                                             done               
       Virtual machine communication interface                             done               
       VM communication interface socket family                            done               
       Blocking file system                                                done               
       Virtual ethernet                                                    done               
       VMware Authentication Daemon                                        done               
       Shared Memory Available                                             done               
    mytux:~ #               
    

     
    Then we start a VMware guest with a reasonably configured IP address of 192.168.0.41 within our LAN segment:
    kvm_ne_18
    Then we load a web page on the VMware guest and have a parallel view at a reasonably filtered wireshark output on KVM guest “kali3”:

    kvm_ne_19

    Wireshark:
    kvm_ne_20

Hey, we can see – almost – everything! A closer look reveals that we only capture ACK and data packets from the Internet server (and other sources, which is not visible in our picture), but not packets from the VMware guest to the Internet server or other target servers.

    Still and remarkably, we can capture all packets directed towards our VMware windows guest on a KVM guest. Despite an ageing parameter > 0 on the bridge “br0”!

Guest isolation in our scenario is obviously broken! Being able to follow TCP packets – and thereby to decode, from other virtualization guests, the data streams fetched from a server for one distinct guest – is not something any admin wants to see on a virtualization host! This at least indicates a potential for resulting security problems!

    So, how did this unexpected “sniffing” become possible?

    Bridges and the promiscuous mode of an attached physical device

    What does a virtual layer 2 Linux bridge with an attached (physical) device to a LAN do? It uses this special device to send packets from virtualization guests to the LAN and further into the Internet – and vice versa it receives packets from the Internet/LAN sent to the multiple attached guests or the host. Destination IP addresses are resolved to MAC-addresses via the ARP-protocol. A received packet is then transferred to the specific target guest attached at the bridge’s virtual ports. If the ageing parameter is set > 0 the bridge remembers the MAC-address/port association and works like a switch – and thus realizes the basic guest isolation discussed above.

    Let us have a look at the Linux bridge of our host :

     
    mytux:/proc/net # brctl show br0 
    bridge name     bridge id               STP enabled     interfaces
br0             8000.1c6f653dfd1e       no              enp8s0
                                                        vnet0
                                                        vnet4
    

     
The physical device “enp8s0” is attached. The additional interfaces “vnet0” and “vnet4” are tap devices assigned to our 2 virtual KVM guests “kali2” and “kali3”.

    There is a very basic requirement for the bridge to be able to distribute packets coming from the LAN to their guest targets: The special physical device – here “enp8s0” – must be put into the “promiscuous mode”. This is required for the device to be able to receive and handle packets for multiple and different MAC- and associated IP-addresses.

How can we see that the “enp8s0” device on my test KVM host really is in a promiscuous state? Good question: actually, and as far as I know, this is a bit more difficult than you may expect. Most standard tools you may want to use – ifconfig, ip, “netstat -i” – fail to show the change if it was done in the background by bridge tools. However, a clear indication, in my opinion, is delivered by

    mytux:/proc/net # cat /sys/class/net/enp8s0/flags 
    0x1303
    

Watch the third hex digit (counted from the right)! If I understand the flags correctly, the IFF_PROMISC flag corresponds to the value 0x100 – so a set 0x100 bit (e.g. a “1” or “3” at that position) indicates promiscuous mode. It is interesting to see what happens if you remove the physical interface from the bridge

     
    mytux:/proc/net # brctl delif br0 enp8s0
    mytux:/proc/net # cat /sys/class/net/enp8s0/flags 
    0x1003
    mytux:/proc/net # brctl addif br0 enp8s0
    mytux:/proc/net # cat /sys/class/net/enp8s0/flags 
    0x1303
    mytux:/proc/net # cat /sys/class/net/enp9s0/flags 
    0x1003
    mytux:/proc/net # cat /sys/class/net/vnet0/flags 
    0x1303
    mytux:/proc/net # cat /sys/class/net/vnet4/flags 
    0x1303
    mytux:/proc/net # 
    

     
The promiscuous mode is obviously switched on by the “brctl addif” action. For comparison see the setting for the physical Ethernet device “enp9s0”, which is not connected to the bridge. (By the way: all interfaces attached to the bridge are in the same promiscuous mode as “enp8s0”. That does not help much for sniffing if the bridge works in a switch-like mode.)
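
Instead of reading the hex value by eye, you can test the IFF_PROMISC bit (0x100) directly; a small sketch:

mytux:~ # flags=$(cat /sys/class/net/enp8s0/flags)
mytux:~ # (( flags & 0x100 )) && echo "enp8s0: promiscuous" || echo "enp8s0: not promiscuous"
enp8s0: promiscuous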

Another way of monitoring the promiscuous state of a physical Ethernet device in virtual bridge scenarios is to follow and analyze the output of systemd’s “journalctl”:

    mytux:~ # brctl delif br0 enp8s0
    mytux:~ # brctl addif br0 enp8s0
    

    The parallel output of “journalctl -f” is:

     
    ...
Jan 12 15:21:59 mytux kernel: device enp8s0 left promiscuous mode
Jan 12 15:21:59 mytux kernel: br0: port 1(enp8s0) entered disabled state
    ....
    ....
    Jan 12 15:22:10 mytux kernel: IPv4: martian source 192.168.0.255 from 192.168.0.200, on dev enp8s0
    ....
    ....
    Jan 12 15:22:13 mytux kernel: device enp8s0 entered promiscuous mode
    Jan 12 15:22:13 mytux kernel: br0: port 1(enp8s0) entered forwarding state
    Jan 12 15:22:13 mytux kernel: br0: port 1(enp8s0) entered forwarding state
    ...
    

     

Promiscuous or non-promiscuous state of the Linux bridge itself?

An interesting question is: in which state is our bridge – or rather the Ethernet device it also represents (besides its port forwarding logic) – with stopped VMware services? Let us see:

    mytux:~ # /etc/init.d/vmware stop
    ....
    mytux:~ # cat /sys/class/net/br0/flags 
    0x1003
    

     
Obviously not in promiscuous mode. However, the bridge itself can work with Ethernet packets addressed to it. In our configuration the bridge itself got an IP address – associated with the host:

    mytux:~ # wicked show  br0 enp8s0 vnet0 vnet4
    enp8s0          enslaved
          link:     #2, state up, mtu 1500, master br0
          type:     ethernet, hwaddr 1c:6f:65:3d:fd:1e
          config:   compat:/etc/sysconfig/network/ifcfg-enp8s0
    
    br0             up
          link:     #5, state up, mtu 1500
          type:     bridge
          config:   compat:/etc/sysconfig/network/ifcfg-br0
          addr:     ipv4 192.168.0.19/24
          route:    ipv4 default via 192.168.0.200
    
    vnet4           device-unconfigured
          link:     #14, state up, mtu 1500, master br0
          type:     tap, hwaddr fe:54:00:27:4e:0a
    
    vnet0           device-unconfigured
          link:     #18, state up, mtu 1500, master br0
          type:     tap, hwaddr fe:54:00:85:20:d1
    mytux:~ # 
    

     
This means that the bridge “br0” also acts like a normal non-promiscuous NIC for packets addressed to the host. As the bridge itself is not in promiscuous mode, it will NOT handle packets that are not addressed to any of its attached ports (and the associated MAC addresses) – it will just throw them away. Neither the attached ports nor the host itself (br0) would thus see any packets not addressed to them. Note: that the virtual bridge can separate the traffic between its promiscuous ports and thereby isolate them (with “ageing > 0”) is a reasonable, but additional internal feature.

What impact does VMware’s “bridging the bridge” have on br0?

However, “br0” becomes part of a VMware bridge in our scenario – just like “enp8s0” became part of the Linux bridge “br0”. This happens in our case as soon as we start a virtual VMware machine inside the user interface of VMware WS. Thinking a bit makes it clear that the VMware bridge – independent of how it is realized internally – must put the device “br0” (which receives external data from the LAN) into promiscuous mode. And really:

    mytux:/sys/class/net # cat /sys/class/net/br0/flags 
    0x1103
    mytux:/sys/class/net # 
    

     
This means that the bridge now also accepts packets sent from the Internet/LAN to the VMware guests attached to the VMware bridge, which is realized by a device “vmnet3” found under the “/dev” directory. These packets arriving over “enp8s0” first pass the bridge “br0” before they are, by some VMware magic, picked up at the output side of the Linux bridge and transmitted/forwarded to the VMware bridge.

But obviously the Linux code responsible for handling packets that reach the bridge “br0” via “enp8s0”, and for the further internal distribution of such packets, kicks in first (or in parallel) – and it gets a problem, as it now receives packets which cannot be directed to any of its known ports.

    Now, we speculate a bit: What does a standard physical 802.1D switch typically do when it gets packets addressed to it – but cannot identify the port to which it should transfer the packet? It just distributes or floods it to all of its ports!

And hey – here we have found a very plausible reason for the fact that we can read the incoming traffic to our VMware guest from all KVM guests!

Addendum 29.01.2016:
Newer kernels provide options for controlling and stopping the flooding of packets with unknown target MACs to specific ports of a Linux bridge. See e.g.:
http://events.linuxfoundation.org/ sites/ events/ files/ slides/ LinuxConJapan2014 _makita_0.pdf
The respective per-port sysfs attribute is “unicast_flood”; the command would be:

echo 0 > /sys/class/net/<port device>/brport/unicast_flood

It would have to be used on all tap ports (for the KVM guests) of the Linux bridge. Such a procedure may deliver a solution to the problem described above. I have not tested it yet, though.
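
A sketch of how this might be applied to the KVM tap ports of “br0” in our scenario – untested here, as said, and assuming your kernel exposes the attribute for the vnet devices:

mytux:~ # for p in vnet0 vnet4; do
>     echo 0 > /sys/class/net/$p/brport/unicast_flood
> done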

    Conclusion

    Although our scenario is a bit special we have learned some interesting things:

1. Bridging a Linux bridge as if it were a normal Ethernet device from other virtualization environments is a dangerous game and should be avoided on productive virtualization hosts!
2. A Linux bridge may be set into promiscuous mode by background programs – and you may have to follow and analyze the flag entries in special files for a network device under “/sys/class/net/” or “journalctl” entries to get notice of the change! Actually, on a productive system one should monitor these sources for status changes of network devices.
3. A Linux bridge in promiscuous mode may react like an 802.1D device and flood its ports with packets for which it has not yet learned target MAC addresses – this obviously has security and performance implications, especially when the flooding becomes a permanent action as in our scenario.
4. Due to points 2 and 3 the status of a Linux bridge attached to a physical Ethernet device of a host must be monitored with care.

    Regarding VMware and KVM/Linux-Bridges – what are possible alternatives for “linking” the virtual bridges of both sides to each other and enable communication between all attached guests?

    One simple answer is routing (via the virtualization host). But are there also solutions without routing?

From what we have learned, a scenario in which the virtual VMware switch is directly attached to a Linux bridge port seems preferable to “bridging the bridge”. Port-specific MAC addresses for the traffic could then be learned by the Linux bridge – and we would get a basic guest isolation. Such a solution would be a variation of what I have described as “Lösung 3” in a previous article about “bridges between KVM and VMware”:
KVM, VMware WS – 3 virtuelle Brücken zwischen den Welten
However, in contrast to “Lösung 3” described there, we would require a Linux bridge with an activated STP protocol – because 2 Ethernet devices would be enslaved by the Linux bridge. Whether such a scenario is really more secure, we may study in another article of this blog.

    Links

    See especially pages 301 – 304

    http://www.linuxfoundation.org/ collaborate/ workgroups/ networking/ bridge# What_does_a_ bridge_do.3F
    See especially the paragraph “Why is it worse than a switch?”

    Promiscuous mode analysis
    https://www.kernel.org/ doc/ Documentation/ ABI/ testing/ sysfs-class-net
    http://grokbase.com/ t/ centos/ centos/ 1023xtt5fd/ how-to-find-out-promiscuous-mode
    https://lists.centos.org/ pipermail/ centos/ 2010-February/ 090269.html
    Wrong info via “netstat -i ”
    http://serverfault.com/ questions/ 453213/ why-is-my-ethernet-interface-in-promiscuous-mode