dm-crypt/Luks – Begriffe, Funktionsweise und die Rolle des Hash-Verfahrens – II

Im letzten Post

dm-crypt/Luks – Begriffe, Funktionsweise und die Rolle des Hash-Verfahrens – I

bin ich auf ein paar wichtige Elemente und die Funktionsweise von LUKS eingegangen. Auf dieser Grundlage können wir uns nun ein paar Gedanken zum Einsatz von Hash-Verfahren – wie etwa SHA – im Rahmen von LUKS machen.

Folgendes hatten wir bereits herausgearbeitet:
Unter LUKS muss ein PBKDF-Verfahren aus einer beliebig langen Zugangs-Passphrase eine Byte-Sequenz definierter Länge berechnen. PBKDF steht dabei für “Password-based Key Definition Function”; konkret verwendet LUKS1 die Funktion PBKDF2 (RFC2898). Den Ergebnisstring von PBKDF2 hatte ich im letzten Beitrag “PB-Key” genannt. Bitte beachtet: “PB-Key” ist keine offizieller Begriff aus dem LUKS Wortschatz. Ich verwende den Begriff hier im Blog lediglich zur klaren Unterscheidung von anderen LUKS-Keys. Der PB-Key wird zur Ent-/Ver-Schlüsselung des sog. “Master-Keys” [MK] eingesetzt. Der Master-Key wird bei der Initialisierung eines LUKS-Devices zufällig erzeugt und später in einem symmetrischen Krypto-Verfahren zur Verschlüsselung der geheim zu haltenden Daten des Devices herangezogen. Der PB-Key ist im normalen Betrieb also nur ein Hilfsmittel, um den zentralen Master-Key zu ermitteln. Nichtsdestoweniger hängt die Sicherheit des Systems maßgeblich am PBKDF.

Das Erzeugen eines Strings definierter Länge (hier des PB-Keys) aus einem beliebig langen String (hier: der Passphrase) ist eine klassische Aufgabe von Hash-Algorithmen. Wir vermuten also (zu Recht!), dass LUKS im Rahmen des PBKDF Hash-Verfahren benutzen muss. Tatsächlich war das über lange Zeit default-mäßig SHA-1. Siehe etwa die Git-Hub-Dokumentation zu dm-crypt/Luks; unter dem Punkt 5 “Security Aspects” kann man dort lesen, dass LUKS standardmäßig SHA-1 einsetzt
(https://gitlab.com/ cryptsetup/ cryptsetup/ wikis/ FrequentlyAskedQuestions).

Nun gilt aber SHA-1 schon seit längerer Zeit als unsicher. Deshalb findet man im Internet zahlreiche besorgte Diskussionen zum Einsatz von SHA-1 unter LUKS (s. die vielen Links am Ende des Artikels). Wie ist dieses Thema zu beurteilen? Diese Frage hat auch mich als Anwender beschäftigt – u.a. auch wegen der Anforderung der DSGVO, als Dienstleister im regelmäßigen Umgang mit personenbezogenen Daten Sicherheit auf dem Stand der Technik zu belegen. Ich hatte aber auch ein grundsätzliches Interesse an dem Thema.

Continue reading

dm-crypt/Luks – Begriffe, Funktionsweise und die Rolle des Hash-Verfahrens – I

Die DSGVO zwingt Freelancer, sich stärker als zuvor mit der Sicherung von Daten, die im Rahmen von Kundenprojekten auf eigenen Systemen anfallen, zu befassen. Das beinhaltet dann u.a. Sicherungsmaßnahmen für den Verlust/Diebstahl von Laptops. Unter anderen Vorzeichen gilt das Gleiche übrigens auch für Server und PCs eines Home-Office. Man wird dann zum Schluss kommen, dass man eine Teil- oder Vollverschlüsselung seines Storage-Unterbaus vornehmen muss. Unter Linux liegt es nahe, dafür das Duo dm-crypt/LUKS einzusetzen. Recherchiert man im Internet zu diesem Toolset, so stößt man immer wieder auf Fragen zu grundsätzlichen Verfahren, die zum Einsatz gebracht werden. Leider sind die Antworten aus meine Sicht nicht immer so klar, wie man sich das wünschen würde. Gräbt man tiefer, landet man schnell bei relativ komplexen Darstellungen.

Im Rahmen von LUKS kommen u.a. auch Passphrase-Hashes zum Einsatz. Ich selbst bin anfänglich u.a. darüber gestolpert, dass dafür (angeblich) SHA1 herangezogen wird. Bei “SHA1” läuten automatisch ein paar Alarmglocken. Jemand, der mal mit “JtR” oder “hashcat” gearbeitet hat, wird ggf. auch beunruhigt durch Artikel wie
https://articles.forensicfocus.com/2018/02/22/bruteforcing-linux-full-disk-encryption-luks-with-hashcat/
https://dfir.science/2014/08/how-to-brute-forcing-password-cracking.html

Sind diese Sorgen berechtigt? Nein, ein wenig mehr Beschäftigung mit dem LUKS-Verfahren relativiert solche Ängste beträchtlich. Aus Diskussionen weiß, dass das Thema auch andere Linux-Freunde interessiert. Ich möchte in diesem Beitrag deshalb auf einige wichtige Ingredienzen und grundlegende Verfahren von LUKS eingehen – so wie ich sie sehe. Als Beleg werde ich Links zu weiterführender Literatur angeben und auch ein paar kompakte Zitate zu LUKS-internen Verfahrensschritten. Erst auf dieser Grundlage lässt sich dann der Einsatz von Hash-Verfahren vernünftig diskutieren.

Die praktische Anwendung und konkrete dm-crypt/LUKS-Kommandos für die (Voll-) Verschlüsselung eines Laptops behandle ich in einer späteren Serie von Posts.

Continue reading

Shrinking an encrypted partition with LVM on LUKS

Whilst experimenting with with encrypted partitions I wanted to shrink an “oversized” LUKS-encrypted partition. I was a bit astonished over the amount of steps required; in all the documentations on this topic some points were missing. In addition I also stumbled over the mess of GiB and GB units used in different tools. Safety considerations taking into account the difference are important to avoid data loss. Another big trap was the fact that the tool “parted” expects an end point on the disk given in GB when resizing. I nearly did it wrong.

Otherwise the sequence of steps described below worked flawlessly in my case. I could reboot the resized root-system enclosed in the encrypted partition after having resized everything.

I list up the necessary steps below in a rather brief form, only. I warn beginners: This is dangerous stuff and the risk for a full data loss not only in the partition you want to resize is very high. So, if you want to experiment yourself – MAKE A BACKUP of the whole disk first.

Read carefully through the steps before you do anything. If you are unsure about what the commands mean and what consequences they have do some research on the Internet ahead of any actions. Likewise in case of trouble or error messages.

You should in general be familiar with partitioning, LVM, dm-crypt and LUKS. I take no responsibility for any damage or data loss and cannot guarantee the applicability of the described steps in your situation – you have been warned.

My disk layout

My layout of the main installation was

SSD –> partition 3 –> LUKS –> LVM –> Group “vg1” –> Volume “lvroot” –> ext4 fs [80 GiB]

SSD –> partition 3 –> LUKS –> LVM –> Group “vg1” –> Volume “lvswap” –> swap fs [2 GiB]

The partition had a size around 104 GiB before shrinking. “lvroot” included a bootable root filesystem of 80 GiB in size. The swap volume (2 GiB) will later help us to demonstrate that shrinking may lead to gaps between logical LVM volumes. We shall shrink the file system, its volume, the volume group and also the encrypted the partition in the end.

In my case I could attach the disk in question to another computer where I performed the resizing operation. If you have just one computer available use a Live System from an USB stick or a DVD. Or boot another Linux installation available on one of the other disks of your system. [I recommend to always have a second small bootable installation available on one of the disks of your PC/laptop besides the main installation for daily work :-). This second installation can reside in an encrypted LVM volume (LUKS on LVM), too. It may support administration and save your life one day in case your main installation becomes broken. There you may also keep copies of the LUKS headers and keyfiles …].

Resizing will be done by a sequence of 16 steps – following the disk layout in reverse order as shown above; i.e. we proceed with first resizing the filesystem, then taking care about the LVM layout and eventually changing the partition size.

You open an encrypted partition with LVM on LUKS just as any dm-crypt/LUKS-partition by

cryptsetup open /dev/YOUR_DEVICE… MAPPING-Name

For the mapping name I shall use “cr-ext“. Note that manually closing the encrypted device requires to deactivate the volume groups in the kernel first; in our case

vgchange -a n vg1;
cryptsetup close cr-ext

Otherwise you may not be able to close your device.

A sequence of 16 steps ….

Let us now go through the details of the required steps – one by one.

Step 1: Get an overview over your block devices

You should use

lsblk

In my case the (external) disk appeared as “/dev/sdg” – the encrypted partition was located on “/dev/sdg3”.

***********

Step 2: Open the encrypted partition

cryptsetup open /dev/sdg3 cr-ext

Check that the mapping is done correctly by “la /dev/mapper” (la = alias for “ls -la”). Watch out not only for the “cr-ext”-device, but also for LVM-volumes inside the encrypted partition. They should appear automatically as distinct devices.

***********

Step 3: Get an overview on LVM structure

Use the following standard commands:

pvdisplay
vgdisplay
lvdisplay

“pvdisplay”/”vgdisplay” should show you a PV device “/dev/mapper/cr-ext” with volume group “vg1” (taking full size). If the volume groups are not displayed you may need to make them available by “vgchange –available y” (see the man pages).

“lvdisplay” should show you the logical volumes consistently with the entries in “/dev/mapper”. Note: “lvdisplay” actually shows the volumes as “/dev/vg1/lvroot”, whereas the link in “/dev/mapper” includes the name of the volume group: “/dev/mapper/vg1-lvroot”. In my case the original size of “lvroot” was 80 GiB and the size of the swap “lvswap” was 2 GiB.

***********

Step 4: Check the integrity of the filesystem

Use fsck:

mytux:~ # fsck /dev/mapper/vg1-lvroot 
fsck from util-linux 2.31.1
e2fsck 1.43.8 (1-Jan-2018)
/dev/mapper/vg1-lvroot: clean, 288966/5242880 files, 2411970/20971520 blocks

The filesystem should be clean. Otherwise run “fsck -f” – and answer the questions properly. “fsck” in write mode should be possible as we have not mounted the filesystem. We avoid doing so throughout the whole procedure.

***********

Step 5: Check the physical block size of the filesystem and the used space within the filesystem

Use “fdisk -l” to get the logical and physical block sizes

Disk /dev/mapper/vg1-lvroot: 80 GiB, 85899345920 bytes, 167772160 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes 

Use tunefs to get some block information:

tune2fs -l /dev/mapper/vg1-lvroot

to get the number of total blocks (20971520), blocksize (4096) and free blocks (18559550) – in my case :

Total = 85899345920 Bytes = 85 GB = 80 GiB 
free  = 76019916800 Bytes = 76 GB = 70,8 GiB

If you mount df -h may will show you less available space due to a 5% free limit set => 67G. So, not much space was occupied by my (Opensuse) test installation inside the volume.

***********

Step 6: Plan the reduced volume and filesystem sizes ahead – perform safety and limit considerations

Plan ahead the new diminished size of the LVM-Volume (“lvroot”). Let us say, we wanted 60G instead of the present 80G. Then we must reduce the filesystem [fs] size even more and use a safety-factor of 0.9 => 54G. I stress:

Make the filesystem size at least 10% smaller than the planned logical volume size!

Why this precaution? There may be a difference for the meaning of “G” for the use of the resize commands: “resize2fs” and the “lvresize“. It could be “GB” or “GiB” in either case.

Have a look into the man pages. This difference is really dangerous! What would be the worst case that could happen? We resize the filesystem in GiB and the volume size in GB – then the volume size would be too small.

So, let us assume that “lvresize” works with GB – then the 60G would correspond to 60 * 1024 * 1000 * 1000 Bytes = 6144000000 Bytes. Assume further that “resize2fs” works with GiB instead. Then we would get 54 * 1024 * 1024 * 1024 Bytes = 57982058496 Bytes. We would still have a reserve! We would potentially through away disk space;
however, we will compensate for it afterwards (see below).

If you believe what the present man-pages say: For “resize2fs” the “size” parameter can actually be given in units of “G” – but internally GiB are used. For “lvresize”, however, the situation is unclear.

In addition we have to check the potential reduction range against the already used space! In our case there is no problem (see step 5). However, how far could you maximally shrink if you needed to?

Minimum-consideration: Give the 10 GiB used according to step 5 a good 34% on top. (We always want a free space of 25%). Multiply by a factor of 1.1 to account for potential GB-GiB differences => 15 GiB. This is a rough minimum limit for the minimum size of the filesystem. The logical volume size should again be larger by a factor of 1.1 at least => 16.5 GB.

Having calculated your minimum you choose your actual shrinking size above this limit: You reduce the volume size by something between the 80 GiB and the 16.5 GiB. You set a number – let us say 60 GiB – and then you reduce the filesystem by a factor 0.9 more => 54 GB.

***********

Step 7: Shrink the filesystem

Check first that the fs is unmounted! Otherwise you may run into trouble:

mytux:~ # resize2fs /dev/mapper/vg1-lvroot 60G
resize2fs 1.43.8 (1-Jan-2018)
Filesystem at /dev/mapper/vg1-lvroot is mounted on /mnt2; on-line resizing required
resize2fs: On-line shrinking not supported

So unmount, run fsck again and then resize:

maytux:~ # umount /mnt2
mytux:~ # e2fsck -f /dev/mapper/vg1-lvroot
e2fsck 1.43.8 (1-Jan-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/vg1-lvroot: 288966/5242880 files (0.2% non-contiguous), 2411970/20971520 blocks
mytux:~ # resize2fs /dev/mapper/vg1-lvroot 54G
resize2fs 1.43.8 (1-Jan-2018)
Resizing the filesystem on /dev/mapper/vg1-lvroot to <pre style="padding:8px;"> (4k) blocks.
The filesystem on /dev/mapper/vg1-lvroot is now 14155776 (4k) blocks long.
mytux:~ # e2fsck -f /dev/mapper/vg1-lvroot
e2fsck 1.43.8 (1-Jan-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/vg1-lvroot: 288966/3538944 files (0.3% non-contiguous), 2304043/14155776 blocks

***********

Step 8: Shrink the logical volume

We use “lvreduce” to resize the LVM volume; the option parameter “L” together with a “size” determines how big the volume will become. [ Note that there is a subtle difference if you provide the size with a negative “-” sign! See the man page for the difference!]

mytux:~ # lvreduce -L 60G /dev/mapper/vg1-lvroot 
  WARNING: Reducing active logical volume to 60.00 GiB.
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce vg1/lvroot? [y/n]: y
  Size of logical volume vg1/lvroot changed from 80.00 GiB (20480 extents) to 60.00 GiB (15360 extents).
  Logical volume vg1/lvroot successfully resized.
mytux:~ # 

Hey, we worked in GiB – good to know!

***********

Step 9: Extend the fs to the volume size again

We compensate for the lost space of almost 6 GiB:

mytux:~ # resize2fs /dev/mapper/vg1-lvroot 
resize2fs 1.43.8 (1-Jan-2018)
Resizing the filesystem on /dev/mapper/vg1-lvroot to 15728640 (4k) blocks.
The filesystem on /dev/mapper/vg1-lvroot is now 15728640 (4k) blocks long.

Run gparted to get an overview over the present situation.

***********

Step 10: Check for gaps between the volumes of your LVM volume group

Besides the LVM volume for the OS and data related filesystem we had a swap volume, too! As we have not changed it there should be a gap between the volumes now – and indeed:

mytux:~ # pvs -v --segments /dev/mapper/cr-ext
    Wiping internal VG cache
    Wiping cache of LVM-capable devices
  PV                 VG  Fmt  Attr PSize   PFree  Start SSize LV     Start Type   PE Ranges                     
  /dev/mapper/cr-ext vg1 lvm2 a--  103.96g 41.96g     0 15360 lvroot     0 linear /dev/mapper/cr-ext:0-15359    
  /dev/mapper/cr-ext vg1 lvm2 a--  103.96g 41.96g 15360  5120            0 free                                 
  /dev/mapper/cr-ext vg1 lvm2 a--  103.96g 41.96g 20480   512 lvswap     0 linear /dev/mapper/cr-ext:20480-20991
  /dev/mapper/cr-ext vg1 lvm2 a--  103.96g 41.96g 20992  5623            0 free                                 
mytux:~ # 

Now, to close the gap we can move the second volume. This requires multiple trials with “pvmove –alloc anywhere”.

Note the information “/dev/mapper/cr-ext:20480-20991” and use this exactly in “pvmove”:

mytux:~ # pvmove --alloc anywhere /dev/mapper/cr-ext:20480-20991
  /dev/mapper/cr-ext: Moved: 0.20%
  /dev/mapper/cr-ext: Moved: 100.00%
mytux:~ # pvs -v --segments /dev/mapper/cr-ext                  
    Wiping internal VG cache
    Wiping cache of LVM-capable devices
  PV                 VG  Fmt  Attr PSize   PFree  Start SSize LV     Start Type   PE Ranges                     
  /dev/mapper/cr-ext vg1 lvm2 a--  103.96g 41.96g     0 15360 lvroot     0 linear /dev/mapper/cr-ext:0-15359    
  /dev/mapper/cr-ext vg1 lvm2 a--  103.96g 41.96g 15360  5632            0 free                                 
  /dev/mapper/cr-ext vg1 lvm2 a--  103.96g 41.96g 20992   512 lvswap     0 linear /dev/mapper/cr-ext:20992-21503
  /dev/mapper/cr-ext vg1 lvm2 a--  103.96g 41.96g 21504  5111            0 free                                 

You may have to apply “pvmove” several times, but each time with the new changed position parameters – be careful !!! (Are your backups OK???).

This may give you some seemingly erratic movements, but eventually, you should see a continuous alignment:

mytux:~ # pvmove --alloc anywhere /dev/mapper/cr-ext:20992-21503
  /dev/mapper/cr-ext: Moved: 1.76%
  /dev/mapper/cr-ext: Moved: 100.00%
mytux:~ # pvs -v --segments /dev/mapper/cr-ext
    Wiping internal VG cache
    Wiping cache of LVM-capable devices
  PV                 VG  Fmt  Attr PSize   PFree  Start SSize LV     Start Type   PE Ranges                     
  /dev/mapper/cr-ext vg1 lvm2 a--  103.96g 41.96g     0 15360 lvroot     0 linear /dev/mapper/cr-ext:0-15359    
  /dev/mapper/cr-ext vg1 lvm2 a--  103.96g 41.96g 15360   512 lvswap     0 linear /dev/mapper/cr-ext:15360-15871
  /dev/mapper/cr-ext vg1 lvm2 a--  103.96g 41.96g 15872 10743            0 free                                 
mytux:~ # 

***********

Step 11: Resize/reduce the physical LVM

After having aligned the volumes of the volume group vg1 we now can resize the physical LVM volume supporting the vg1 group. We use “pvresize” for this purpose. “pvresize” works in GiB – but its parameters just have a “G”.
In my example I shrank the physical LVM area down to 80G = 80GiB. This left more than enough space for my two LVM volumes.

mytux:~ # pvresize --setphysicalvolumesize 80G /dev/mapper/cr-ext  
/dev/mapper/cr-ext: Requested size 80.00 GiB is less than real size 103.97 GiB. Proceed?  [y/n]: y
  WARNING: /dev/mapper/cr-ext: Pretending size is 167772160 not 218034943 sectors.
  Physical volume "/dev/mapper/cr-ext" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized
mytux:~ # pvdisplay
....   
  --- Physical volume ---
  PV Name               /dev/mapper/cr-ext
  VG Name               vg1
  PV Size               80.00 GiB / not usable 3.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              20479
  Free PE               4607
  Allocated PE          15872
  PV UUID               eFf9we-......
   

Again, we obviously worked in GiB – good.

***********

Step 12: Set new size of the encrypted region

It is disputed whether this step is required. LUKS never registers the size of the encrypted area. However, if you had an active mounted partition … In our case, I think the step is not required, but …..

“cryptsetup resize” wants sectors as a size unit. We have to calculate a bit again based on the amount of sectors (each with 512 Byte; see below) used:

mytux:~ # cryptsetup status cr-ext
/dev/mapper/cr-ext is active.
  type:    LUKS1
  cipher:  aes-xts-plain64
  keysize: 512 bits
  key location: dm-crypt
  device:  /dev/sdg3
  sector size:  512
  offset:  65537 sectors
  size:    218034943 sectors
  mode:    read/write

We have to determine the new size in sectors to reduce the size of the LUKS area. I chose:
185329702 blocks (rounded up) – this corresponds to around 88.4 GiB; according to our 10% safety rule this should be sufficient.

mytux:~ # cryptsetup -b 185329702 resize cr-ext
mytux:~ # cryptsetup status cr-ext             
/dev/mapper/cr-ext is active.
  type:    LUKS1
  cipher:  aes-xts-plain64
  keysize: 512 bits
  key location: dm-crypt
  device:  /dev/sdg3
  sector size:  512
  offset:  65537 sectors
  size:    185329702 sectors
  mode:    read/write

***********

Step 13: Reduce the size of the physical partition – a pretty scary step!

This is one of the most dangerous steps. And irreversible … Interestingly, gparted will not let you do a resize. Yast’s partitioner allows for it after several confirmations. I used “parted” in the end.

But especially with parted you have to be extremely careful – parted expects a partition’s end position in GB. This means that to calculate the resulting size correctly you must also look at the partition’s starting position. In the end it is the difference that counts. Read the man pages and have a look into other resources.

mytux:~ # parted /dev/sdg
GNU Parted 3.2
Using /dev/sdg
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print                                                            
Model: USB 3.0 (scsi)
Disk /dev/sdg: 512GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name     Flags
 1      1049kB  16.8MB  15.7MB               primary  bios_grub
 2      33.6MB  604MB   570MB                primary  boot, esp
 3      604MB   112GB   112GB                primary  legacy_boot, msftdata
 4      291GB   312GB   21.5GB                        lvm
 5      312GB   314GB   2147MB               primary  msftdata

(parted) resizepart 
Partition number? 3                                                       
End?  [112GB]? 89GB                                                       
Warning: Shrinking a partition can cause data loss, are you sure you want to continue?
Yes/No? Yes                                                               
(parted) print      
Model: USB 3.0 (scsi)
Disk /dev/sdg: 512GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name     Flags
 1      1049kB  16.8MB  15.7MB               primary  bios_grub
 2      33.6MB  604MB   570MB                primary  boot, esp
 3      604MB   89.0GB  88.4GB               primary  legacy_boot, msftdata
 4      291GB   312GB   21.5GB                        lvm
 5      312GB   314GB   2147MB               primary  msftdata

(parted) q                                                                
Information: You may need to update /etc/fstab.

As you see: We end up with a size of 88.4 GB and NOT 89 GB !!!

In addition we shall see in a second that the GB are NOT GiB here! So our safety factor is really important to get close to the aspired 80 GiB.

***********

Step 14: Set new size of the encrypted region

As we have some space left now in the partition we can enlarge the encryption area and later also the PV size to the possible maximum.
First, we set back the size of the encryption area to the full partition size; though not required in our case – it does not harm:

mytux:~ # cryptsetup  resize cr-ext
mytux:~ # cryptsetup status cr-ext
/dev/mapper/cr-ext is active and is in use.
  type:    LUKS1
  cipher:  aes-xts-plain64
  keysize: 512 bits
  key location: dm-crypt
  device:  /dev/sdg3
  sector size:  512
  offset:  65537 sectors
  size:    172582959 sectors
  mode:    read/write

***********

Step 15: Reset the PV size to the full partition size

We use “pvsize” again without any size parameter. The volume group will follow automatically.

mytux:~ # pvresize  /dev/mapper/cr-ext    
  Physical volume "/dev/mapper/cr-ext" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized 
mytux:~ # pvdisplay
...  
  --- Physical volume ---
  PV Name               /dev/mapper/cr-ext
  VG Name               vg1
  PV Size               82.29 GiB / not usable 1000.50 KiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              21067
  Free PE               5195
  Allocated PE          15872
  PV UUID               eFf9we-......
..
mytux:~ # vgdisplay
....
  --- Volume group ---
  VG Name               vg1
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  16
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               82.29 GiB
  PE Size               4.00 MiB
  Total PE              21067
  Alloc PE / Size       15872 / 62.00 GiB
  Free  PE / Size       5195 / 20.29 GiB
  VG UUID               dZakZi-......

Our logical volumes are alive, too, and have the right size.

mytux:~ # lvdisplay
...
  --- Logical volume ---
  LV Path                /dev/vg1/lvroot
  LV Name                lvroot
  VG Name                vg1
  LV UUID                5cDvmf-......
  LV Write Access        read/write
  LV Creation host, time install, 2018-11-06 15:33:52 +0100
  LV Status              available
  # open                 0
  LV Size                60.00 GiB
  Current LE             15360
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     1024
  Block device           254:17
   
  --- Logical volume ---
  LV Path                /dev/vg1/lvswap
  LV Name                lvswap
  VG Name                vg1
  LV UUID                lCU75L-......
  LV Write Access        read/write
  LV Creation host, time install, 2018-11-06 18:21:11 +0100
  LV Status              available
  # open                 0
  LV Size                2.00 GiB
  Current LE             512
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     1024
  Block device           254:18

And:

mytux:~ # pvs -v --segments /dev/mapper/cr-ext
    Wiping internal VG cache
    Wiping cache of LVM-capable devices
  PV                 VG  Fmt  Attr PSize  PFree  Start SSize LV     Start Type   PE Ranges                     
  /dev/mapper/cr-ext vg1 lvm2 a--  82.29g 20.29g     0 15360 lvroot     0 linear /dev/mapper/cr-ext:0-15359    
  /
dev/mapper/cr-ext vg1 lvm2 a--  82.29g 20.29g 15360   512 lvswap     0 linear /dev/mapper/cr-ext:15360-15871
  /dev/mapper/cr-ext vg1 lvm2 a--  82.29g 20.29g 15872  5195            0 free                                 

We also check with gparted:

Everything is fortunately Ok!

***********

Step 16: Closing and leaving the encrypted device

mytux:~ # vgchange -a n vg1

  0 logical volume(s) in volume group "vg1" now active
mytux:~ # 
mytux:~ # cryptsetup close cr-ext
mytux:~ # 

Conclusion

Shrinking an encrypted Luks partition which is used as a physical LVM volume and contains a LVM group with volumes is no rocket science. One has to apply the necessary shrinking steps starting from the innermost objects to the containers around them. I.e, we start shrinking the LVM volumes first, then take care of the LVM volume group and the LVM PV. Eventually, we deal with the encryption area and the hard disk partition.

During each step one has to be careful regarding the tools and sizing units: For each tool it has to be clarified whether it works in GB or GiB. Safety margins during each shrinking step have to be calculated and to be taken into account.

Links

https://www.rootusers.com/lvm-resize-how-to-decrease-an-lvm-partition/
https://wiki.archlinux.org/index.php/Resizing_LVM-on-LUKS
https://blog.shadypixel.com/how-to-shrink-an-lvm-volume-safely/
https://unix.stackexchange.com/questions/41091/how-can-i-shrink-a-luks-partition-what-does-cryptsetup-resize-do
LVM-fragmentation
See the discussion in the following link:
https://askubuntu.com/questions/196125/how-can-i-resize-an-lvm-partition-i-e-physical-volume
https://www.linuxquestions.org/questions/linux-software-2/how-do-i-lvm2-defrag-or-move-based-on-logical-volumes-689335/
Introduction into LVM
https://www.tecmint.com/create-lvm-storage-in-linux/
https://www.tecmint.com/extend-and-reduce-lvms-in-linux/

Cryptsetup close – not working for LVM on LUKS – device busy

I am working right now on some blog posts on a full dmcrypt/LUKS-encryption of Laptop SSDs. Whilst experimenting I stumbled across the following problem:

Normally, you would open an encrypted device by

cryptsetup open /dev/YourDevice cr-YourMapperLabel

(You have to replace the device-names and the mapper-labels by your expressions, of course). You would close the device by

cryptsetup close cr-YourMapperLabel

In my case I had created a test device “/dev/sdg3”; the mapper label was “cr-sg3”. However, whenever I tried to close this special crypto-device I got the following message:

mytux:~ # cryptsetup close  cr-g3
Device cr-g3 is still in use.

Ops, had I forgotten a mount or was something operating on the filesystem inside cr-sg3 still? Unfortunately, lsof did not show anything. No process seemed to block the device …

mytux:~ # lsof /dev/mapper/cr-g3
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1111/gvfs
      Output information may be incomplete.

But

mytux:~ # dmsetup info -C  
Name                Maj Min Stat Open Targ Event  UUID                                                                
....
cr-g3               254  16 L--w    2    1      0 CRYPT-LUKS1-Some_UUID_Info-cr-g3                  
....

Note the “2” in the Open column. Clearly, there was a count of 2 of “something” that had a firm grip on the device. I also tried

mytux:~ # dmsetup remove -f cr-g3
device-mapper: remove ioctl on cr-g3  failed: Device or resource busy
Command failed

After this trial the dmsetup table even showed an error:

mytux:~ # dmsetup table
...
cr-g3: 0 218034943 error 
...

What the hack? On the Internet I found numerous messages related to similar experiences – but no real solution.

https://www.linuxquestions.org/ questions/ linux-kernel-70/ device-mapper-remove-ioctl-on-failed-device-or-resource-busy-4175481642/
https://serverfault.com/ questions/ 824425/ cryptsetup-cannot-close-mapped-device
https://gitlab.com/ cryptsetup/ cryptsetup/ issues/315
https://askubuntu.com/ questions/ 872044/ mounting-usb-disk-with-luks-encrypted-partion-fails-with-a-cryptsetup-device-al

Whatever problems these guys had – I tested with other quickly created encrypted filesystems. In most cases I had no problems to close them again. It took me a while to figure out that the cause of my problems was LVM2! Stupid me! Actually, I had defined a LVM volume group “vg1” inside the encrypted partition – on purpose. The related volumes could, of course, be seen in the “/dev/mapper/” directory

mytux:~ # la /dev/mapper 
total 0
drwxr-xr-x  2 root root     440 Nov  8 15:53 .
drwxr-xr-x 28 root root   12360 Nov  8 15:53 ..
crw-------  1 root root 10, 236 Nov  8 15:46 control
lrwxrwxrwx  1 root root       8 Nov  8 16:12 cr-g3 -> ../dm-16
...
lrwxrwxrwx  1 root root       8 Nov  8 15:54 vg1-lvroot -> ../dm-17
lrwxrwxrwx  1 root root       8 Nov  8 15:53 vg1-lvswap -> ../dm-18
...

r

and also in “dmsetup”

mytux:~ # dmsetup info -C  
Name                Maj Min Stat Open Targ Event  UUID                                                                
.... 
vg1-lvswap          254  18 L--w    0    1      0 LVM-Some_UUID_Info
vg1-lvroot          254  17 L--w    0    1      0 LVM-Some_UUID_Info
....
cr-g3               254  16 L--w    2    1      0 CRYPT-LUKS1-Some_UUID_Info-cr-g3                  
....

So do not let yourself get confused by the fact that “lsof” may not help you to detect processes which access the crypto devices in question! It may be the kernel (with its LVM component) itself! (triggered by udev …). This problem will always appear when you work with LVM on top of LUKS. It also explained the count of 2. The solution was simple afterwards:

mytux:~ # vgchange -a n vg1
  0 logical volume(s) in volume group "vg1" now active
mytux:~ # cryptsetup close cr-g3
mytux:~ # 

Conclusion:
Always remember to explicitly deactivate LVM volume groups that may exist on LUKS encrypted partitions before closing the crypto-device! Have a look with the commands “lvs” or “vgdisplay” first!

Addendum 12.11.2018:
The last days I have read a bit more on crypto-partitions and volumes on the Internet. I came across an article which describes the deactivation of the volume groups before closing the crypto-device clearly and explicitly:
https://blog.sleeplessbeastie.eu/ 2015/11/16/ how-to-mount-encrypted-lvm-logical-volume/

 

Fun with veth-devices, Linux bridges and VLANs in unnamed Linux network namespaces – VIII

In the last post of this series

Fun with veth-devices, Linux bridges and VLANs in unnamed Linux network namespaces – VII [Theoretical considerations regarding the connection of a network namespace or container to two separated VLANs]

we discussed two different approaches to connect a network namespace (or container) “netns9” to two (or more) separated VLANs. Such a network namespace could e.g. represent an administrative system (for example in form of a LXC container) for both VLANs. It has its own connection to the virtual Linux bridge which technically defines the VLANs by special port configurations. See the picture below, where we represented a VLAN1 by a member network namespace netns1 and a VLAN2 by a member netns4:

The solution on the left side is based on a bridge in an intermediate network namespace and packet tagging up into the namespace for the VLANs’ common member system netns9. The approach on the right side of the graphics uses a bridge, too, but without packet tagging along the connection to the common VLAN member system. In our analysis in the last post we assumed that we would have to compensate for this indifference by special PVID/VID settings.

The previous articles of this series already introduced general Linux commands for network namespace creation and the setup of VLANs via Linux bridge configurations. See e.g.: Fun with … – IV [Virtual VLANs for network namespaces (or containers) and rules for VLAN tagging at Linux bridge ports]. We shall use these methods in the present and a coming post to test configurations for a common member of two VLANs. We want to find out whether the theoretically derived measures regarding route definitions in netns9 and special PVID/VID-settings at the bridge work as expected. A test of packet filtering at bridge ports which we regarded as important for security is, however, postponed to later posts.

Extension of our test environment

First, we extend our previous test scenario by yet another network namespace “netns9“.

Our 2 VLANs in the test environment are graphically distinguished by “green” and “pink” tags (corresponding to different VLAN ID numbers). netns9 must be able to communicate with systems in both VLANs. netns9 shall, however, not become a packet forwarder between the VLANs; the VLANs shall remain separated despite the fact that they have a common member. We expect, that a clear separation of communication paths to the VLANs requires a distinction between network targets already inside netns9.

Bridge based solutions with packet tagging and veth sub-interfaces

There are two rather equivalent solutions for the connection of netns9 to brx in netns3; see the schematic graphics below:

Both solutions are based on veth sub-interfaces inside netns9. Thus, both VLAN connections are properly terminated in netns9. The approach depicted on the right side of the graphics uses a pure trunk port at the bridge; but also this solutions makes use of packet tagging between brx and netns9.

Note that we do not need to used tagged packets along the connections from bridge brx to netns1, netns2, netns4, netns5. The VLANs are established by the PVID/VID settings at the bridge ports and forwarding rules inside a VLAN aware bridge. Note also that our test environment contains an additional bridge bry and additional network namespaces.

We first concentrate on the solution on the left side with veth sub-interfaces at the bridge. It is easy to switch to a trunk port afterwards.

The required commands for the setup of the test environment are given below; you may scroll and copy the commands to the prompt of a terminal window for a root shell:

unshare --net --uts /bin/bash &
export pid_netns1=$!
unshare --net --uts /bin/bash &
export pid_netns2=$!
unshare --net --uts /bin/bash &
export pid_netns3=$!
unshare --net --uts /bin/bash &
export pid_netns4=$!
unshare --net --uts /bin/bash &
export pid_netns5=$!
unshare --net --uts /bin/bash &
export pid_netns6=$!
unshare --net --uts /bin/bash &
export pid_netns7=$!
unshare --net --uts /bin/bash &
export pid_netns8=$!
unshare --net --uts /bin/bash &
export pid_netns9=$!

# assign different hostnames  
nsenter -t $pid_netns1 -u hostname netns1
nsenter -t $pid_netns2 -u hostname netns2
nsenter -t $pid_netns3 -u hostname netns3
nsenter -t $pid_netns4 -u hostname netns4
nsenter -t $pid_netns5 -u hostname netns5
nsenter -t $pid_netns6 -u hostname netns6
nsenter -t $pid_netns7 -u hostname netns7
nsenter -t $pid_netns8 -u hostname netns8
nsenter -t $pid_netns9 -u hostname netns9
     
# set up veth devices in netns1 to netns4 and in netns9 with connections to netns3  
ip link add veth11 netns $pid_netns1 type veth peer name veth13 netns $pid_netns3
ip link add veth22 netns $pid_netns2 type veth peer name veth23 netns $pid_netns3
ip link add veth44 netns $pid_netns4 type veth peer name veth43 netns $pid_netns3
ip link add veth55 netns $pid_netns5 type veth peer name veth53 netns $pid_netns3
ip link add veth99 netns $pid_netns9 type veth peer name veth93 netns $pid_netns3

# set up veth devices in netns6 and netns7 with connection to netns8   
ip link add veth66 netns $pid_netns6 type veth peer name veth68 netns $pid_netns8
ip link add veth77 netns $pid_netns7 type veth peer name veth78 netns $pid_netns8

# Assign IP addresses and set the devices up 
nsenter -t $pid_netns1 -u -n /bin/bash
ip addr add 192.168.5.1/24 brd 192.168.5.255 dev veth11
ip link set veth11 up
ip link set lo up
exit
nsenter -t $pid_netns2 -u -n /bin/bash
ip addr add 192.168.5.2/24 brd 192.168.5.255 dev veth22
ip link set veth22 up
ip link set lo up
exit
nsenter -t $pid_netns4 -u -n /bin/bash
ip addr add 192.168.5.4/24 brd 192.168.5.255 dev veth44
ip link set veth44 up
ip link set lo up
exit
nsenter -t $pid_netns5 -u -n /bin/bash
ip addr add 192.168.5.5/24 brd 192.168.5.255 dev veth55
ip link set veth55 up
ip link set lo up
exit
nsenter -t $pid_netns6 -u -n /bin/bash
ip addr add 192.168.5.6/24 brd 192.168.5.255 dev veth66
ip link set veth66 up
ip link set lo up
exit
nsenter -t $pid_netns7 -u -n /bin/bash
ip addr add 192.168.5.7/24 brd 192.168.5.255 dev veth77
ip link set veth77 up
ip link set lo up
exit
nsenter -t $pid_netns9 -u -n /bin/bash
ip addr add 192.
168.5.9/24 brd 192.168.5.255 dev veth99
ip link set veth99 up
ip link set lo up
exit

# set up bridge brx and its ports 
nsenter -t $pid_netns3 -u -n /bin/bash
brctl addbr brx  
ip link set brx up
ip link set veth13 up
ip link set veth23 up
ip link set veth43 up
ip link set veth53 up
brctl addif brx veth13
brctl addif brx veth23
brctl addif brx veth43
brctl addif brx veth53
exit

# set up bridge bry and its ports 
nsenter -t $pid_netns8 -u -n /bin/bash
brctl addbr bry  
ip link set bry up
ip link set veth68 up
ip link set veth78 up
brctl addif bry veth68
brctl addif bry veth78
exit

# set up 2 VLANs on each bridge 
nsenter -t $pid_netns3 -u -n /bin/bash
ip link set dev brx type bridge vlan_filtering 1
bridge vlan add vid 10 pvid untagged dev veth13
bridge vlan add vid 10 pvid untagged dev veth23
bridge vlan add vid 20 pvid untagged dev veth43
bridge vlan add vid 20 pvid untagged dev veth53
bridge vlan del vid 1 dev brx self
bridge vlan del vid 1 dev veth13
bridge vlan del vid 1 dev veth23
bridge vlan del vid 1 dev veth43
bridge vlan del vid 1 dev veth53
bridge vlan show
exit
nsenter -t $pid_netns8 -u -n /bin/bash
ip link set dev bry type bridge vlan_filtering 1
bridge vlan add vid 10 pvid untagged dev veth68
bridge vlan add vid 20 pvid untagged dev veth78
bridge vlan del vid 1 dev bry self
bridge vlan del vid 1 dev veth68
bridge vlan del vid 1 dev veth78
bridge vlan show
exit

# Create a veth device to connect the two bridges 
ip link add vethx netns $pid_netns3 type veth peer name vethy netns $pid_netns8
nsenter -t $pid_netns3 -u -n /bin/bash
ip link add link vethx name vethx.50 type vlan id 50
ip link add link vethx name vethx.60 type vlan id 60
brctl addif brx vethx.50
brctl addif brx vethx.60
ip link set vethx up
ip link set vethx.50 up
ip link set vethx.60 up
bridge vlan add vid 10 pvid untagged dev vethx.50
bridge vlan add vid 20 pvid untagged dev vethx.60
bridge vlan del vid 1 dev vethx.50
bridge vlan del vid 1 dev vethx.60
bridge vlan show
exit

nsenter -t $pid_netns8 -u -n /bin/bash
ip link add link vethy name vethy.50 type vlan id 50
ip link add link vethy name vethy.60 type vlan id 60
brctl addif bry vethy.50
brctl addif bry vethy.60
ip link set vethy up
ip link set vethy.50 up
ip link set vethy.60 up
bridge vlan add vid 10 pvid untagged dev vethy.50
bridge vlan add vid 20 pvid untagged dev vethy.60
bridge vlan del vid 1 dev vethy.50
bridge vlan del vid 1 dev vethy.60
bridge vlan show
exit

# Add subinterfaces in netns9
nsenter -t $pid_netns9 -u -n /bin/bash
ip link add link veth99 name veth99.10 type vlan id 10
ip link add link veth99 name veth99.20 type vlan id 20
ip link set veth99 up
ip link set veth99.10 up
ip link set veth99.20 up
exit

# Add subinterfaces in netns3
nsenter -t $pid_netns3 -u -n /bin/bash
ip link add link veth93 name veth93.10 type vlan id 10
ip link add link veth93 name veth93.20 type vlan id 20
ip link set veth93 up
ip link set veth93.10 up
ip link set veth93.20 up
brctl addif brx veth93.10
brctl addif brx veth93.20
bridge vlan add vid 10 pvid untagged dev veth93.10
bridge vlan add vid 20 pvid untagged dev veth93.20
bridge vlan del vid 1 dev veth93.10
bridge vlan del vid 1 dev veth93.20
exit

We just have to extend the command list of the experiment conducted already in the second to last post by some more lines which account for the setup of netns9 and its connection to the bridge “brx” in netns3.

Now, we open a separate terminal, which inherits the defined environment variables (e.g. on KDE by “konsole &>/dev/null &”), and try a ping from netns9 to netns7:

mytux:~ # nsenter -t $pid_netns9 
-u -n /bin/bash
netns9:~ # ping 192.168.5.1
PING 192.168.5.1 (192.168.5.1) 56(84) bytes of data.
^C
--- 192.168.5.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1008ms

netns9:~ # ping 192.168.5.7
PING 192.168.5.7 (192.168.5.7) 56(84) bytes of data.
^C
--- 192.168.5.7 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1006ms

netns9:~ # 

Obviously, the pings failed! The reason is that we forgot to set routes in netns9! Such routes are, however, vital for the transport of e.g. ICMP answering and request packets from netns9 to members of the two VLANs. See the last post for details. We add the rules for the required routes:

#Set routes in netns9 
nsenter -t $pid_netns9 -u -n /bin/bash
route add 192.168.5.1 veth99.10                                                     
route add 192.168.5.2 veth99.10                                                    
route add 192.168.5.4 veth99.20
route add 192.168.5.5 veth99.20                                                    
route add 192.168.5.6 veth99.10
route add 192.168.5.7 veth99.20
exit

By these routes we, obviously, distinguish different paths: Packets heading for e.g. netns1 and netns2 go through a different interface than packets sent e.g. to netns4 and netns5. Now, again, in our second terminal window:

mytux:~ # nsenter -t $pid_netns9 -u -n /bin/bash 
netns9:~ # ping 192.168.5.1 -c2
PING 192.168.5.1 (192.168.5.1) 56(84) bytes of data.
64 bytes from 192.168.5.1: icmp_seq=1 ttl=64 time=0.067 ms
64 bytes from 192.168.5.1: icmp_seq=2 ttl=64 time=0.083 ms

--- 192.168.5.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.067/0.075/0.083/0.008 ms
netns9:~ # ping 192.168.5.7 -c2
PING 192.168.5.7 (192.168.5.7) 56(84) bytes of data.
64 bytes from 192.168.5.7: icmp_seq=1 ttl=64 time=0.079 ms
64 bytes from 192.168.5.7: icmp_seq=2 ttl=64 time=0.078 ms

--- 192.168.5.7 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.078/0.078/0.079/0.008 ms
netns9:~ # ping 192.168.5.4 -c2
PING 192.168.5.4 (192.168.5.4) 56(84) bytes of data.
64 bytes from 192.168.5.4: icmp_seq=1 ttl=64 time=0.151 ms
64 bytes from 192.168.5.4: icmp_seq=2 ttl=64 time=0.076 ms

--- 192.168.5.4 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.076/0.113/0.151/0.038 ms

Thus, we have confirmed our conclusion from the last article that we need route definitions in a common member of two VLANs if and when we terminate tagged connection lines by veth sub-interfaces inside such a network namespace or container.

But are our VLANs still isolated from each other?
We open another terminal and try pinging from netns1 to netns4, netns7 and netns2:

mytux:~ # nsenter -t $pid_netns1 -u -n /bin/bash
netns1:~ # ping 192.168.5.4
PING 192.168.5.4 (192.168.5.4) 56(84) bytes of data.
^C
--- 192.168.5.4 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2015ms

netns1:~ # ping 192.168.5.7
PING 192.168.5.7 (192.168.5.7) 56(84) bytes of data.
^C
--- 192.168.5.7 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1007ms

netns1:~ # ping 192.168.5.2
PING 192.168.5.2 (192.168.5.2) 56(84) bytes of data.
64 bytes from 192.168.5.2: icmp_seq=1 ttl=64 time=0.195 ms
64 bytes from 192.168.5.2: icmp_seq=2 ttl=64 time=0.102 ms
^C
--- 192.168.5.
2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.102/0.148/0.195/0.048 ms
netns1:~ # 

And in reverse direction :

mytux:~ # nsenter -t $pid_netns5 -u -n /bin/bash                                               
netns5:~ # ping 192.168.5.4
PING 192.168.5.4 (192.168.5.4) 56(84) bytes of data.                                           
64 bytes from 192.168.5.4: icmp_seq=1 ttl=64 time=0.209 ms                                     
64 bytes from 192.168.5.4: icmp_seq=2 ttl=64 time=0.071 ms                                     
^C                                                                                             
--- 192.168.5.4 ping statistics ---                                                            
2 packets transmitted, 2 received, 0% packet loss, time 999ms                                  
rtt min/avg/max/mdev = 0.071/0.140/0.209/0.069 ms                                              
netns5:~ # ping 192.168.5.1
PING 192.168.5.1 (192.168.5.1) 56(84) bytes of data.                                           
^C                                                                                             
--- 192.168.5.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1008ms

netns5:~ # 

Good! As expected!

Forwarding between two VLANs?

We have stressed in the last post that setting routes should clearly be distinguished from “forwarding” if we want to keep our VLANs separated:

We have NOT enabled forwarding in netns9. If we had done so we would have lost the separation of the VLANs and opened a direct communication line between the VLANs.

Let us – just for fun – test the effect of forwarding in netns9:

netns9:~ # echo 1 > /proc/sys/net/ipv4/conf/all/forwarding
netns9:~ # 

But still:

netns5:~ # ping 192.168.5.1
PING 192.168.5.1 (192.168.5.1) 56(84) bytes of data.
^C
--- 192.168.5.1 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms

Enabling forwarding in netns9 alone is obviously not enough to enable a packet flow in both directions! A little thinking , however, shows:

If we e.g. want ARP resolution and pinging from netns5 to netns1 to work via netns9 we must establish further routes both in netns1 and netns5. Reason: Both network namespaces must be informed that netns9 now works as a gateway for both request and answering packets:

netns1:~ # route add 192.168.5.5 gw 192.168.5.9
netns5:~ # route add 192.168.5.1 gw 192.168.5.9

Eventually:

netns5:~ # ping 192.168.5.1
PING 192.168.5.1 (192.168.5.1) 56(84) bytes of data.
64 bytes from 192.168.5.1: icmp_seq=1 ttl=63 time=0.186 ms
64 bytes from 192.168.5.1: icmp_seq=2 ttl=63 time=0.134 ms
^C
--- 192.168.5.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.134/0.160/0.186/0.026 ms
netns5:~ # 

So, yes, forwarding outside the bridge builds a connection between otherwise separated VLANs. In connection with a packet filter this could be used to allow some hosts of a VLAN1 to reach e.g. some servers in a VLAN2. But this is not the topic of this post. So, do not forget to disable the forwarding in netns9 again for further experiments:

netns9:~ # echo 0 > /proc/sys/net/ipv4/conf/all/forwarding
netns9:~ # 

Bridge based solutions with packet tagging and a trunk port at the Linux bridge

The following commands replace the sub-interface ports veth93.10 and veth93.20 at the bridge by a single trunk port:

# Change veth93 to trunk like interface in brx 
nsenter -t $pid_netns3 -u -n /bin/bash
brctl delif brx veth93.10
brctl delif brx veth93.20
ip link del dev veth93.10
ip link del dev veth93.20
brctl addif brx veth93
bridge vlan add vid 10 tagged dev veth93
bridge vlan add vid 20 tagged dev veth93
bridge vlan del vid 1 dev veth93
bridge vlan show
exit 

Such a solution works equally well:

netns9:~ # ping 192.168.5.4 -c2
PING 192.168.5.4 (192.168.5.4) 56(84) bytes of data.
64 bytes from 192.168.5.4: icmp_seq=1 ttl=64 time=0.145 ms
64 bytes from 192.168.5.4: icmp_seq=2 ttl=64 time=0.094 ms

--- 192.168.5.4 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.094/0.119/0.145/0.027 ms
netns9:~ # ping 192.168.5.6 -c2
PING 192.168.5.6 (192.168.5.6) 56(84) bytes of data.
64 bytes from 192.168.5.6: icmp_seq=1 ttl=64 time=0.177 ms
64 bytes from 192.168.5.6: icmp_seq=2 ttl=64 time=0.084 ms

--- 192.168.5.6 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.084/0.130/0.177/0.047 ms
netns9:~ # 

Summary and outlook

It is easy to make a network namespace or container a common member of two separate VLANs realized by a Linux bridge. You have to terminate virtual veth connections, which transport tagged packets from both VLANs, properly inside the common target namespace by sub-interfaces. As long as we do not enable forwarding in the common namespace the VLANs remain separated. But routes need to be defined to direct packets from the common member to the right VLAN.

In the next post we look at commands to realize a connection of bridge based VLANs to a common network namespace with untagged packets. Such solutions are interesting for connecting multiple virtual VLANs to routers to external networks.