Opensuse Leap 15.4 on a PC – I – Upgrade from Leap 15.3 – repositories, Nvidia, Vmware WS, KVM, Plasma widget deficits

Today I upgraded a desktop PC with an Nvidia graphics card from Opensuse Leap 15.3 to Leap 15.4. I really must say: This was one of the smoothest upgrades I have ever experienced with Opensuse. However, there were again a few standard problems which had to be solved.

In this first post I summarize the upgrade steps and check the basic functionality of KDE/Plasma, VMware Workstation and KVM. I will also comment on some deficits of Plasma widgets for system monitoring.

In a second post I summarize a brief test of Xwayland. A third post will discuss how to define a screen order for situations where multiple screens are attached to your PC. A brief fourth post will discuss PHP8 and Apache2 settings. We will also check that Eclipse works for PHP and Python/PyDev. I will also verify the functionality of Python3 (3.9/3.10), Tensorflow2 and Jupyter after the upgrade. A last post provides solutions for multi-soundcard setups with Pulseaudio and pavucontrol.

PC components – and major SW

My PC has two major RAID systems – one with multiple SSDs (at an Intel controller, set up and controlled with the help of Linux mdraid) and one with HDs (3ware controller). The RAID systems host LVM volumes and ordinary partitions. The graphics card is a relatively old Nvidia card. Two sound cards (X-Fi, Xonar D2X) are used in parallel. My standard desktop is KDE/Plasma scaled over 3 screens. For the virtualization of Windows 10, VMware WS 16 is used. Virtualized Linux systems, however, run on KVM/qemu with virtual spice screens. PHP development is done with the help of Eclipse and a local Apache server. Machine Learning development is done with Python3 (3.9), PyDev (Eclipse), Tensorflow2/Keras, Jupyter.

Upgrade procedure

Regarding the basic upgrade procedure you can adapt the steps which I have extensively described for a laptop here. You can just ignore any steps regarding an Optimus-based combination of graphics cards.

List of steps

  • Step 1: Make backups! Especially of the LVM volumes or partitions which host your root filesystem and your /home directory.
  • Step 2: Check your repositories! Save a list of your active ones and their URLs (e.g. via “zypper lr -u”). Reason: Some repository paths below “download.opensuse.org” have changed and you will need to add these repos to zypper again. Below I give some information on repositories which can be upgraded directly with the help of ${releasever}.
  • Step 3: Update your current RPMs.
  • Step 4: In a terminal window:
    mytux:~ # zypper refresh
    mytux:~ # zypper update

    Then restart your system and verify that it is working.

  • Step 5: In a terminal window change repo URLs to include ${releasever}
    mytux:~ # sed -i 's/15.3/${releasever}/g' /etc/zypp/repos.d/*.repo
    mytux:~ # sed -i 's/$releasever/${releasever}/g' /etc/zypp/repos.d/*.repo
    
  • Step 6: In a terminal window test a refresh for the Leap 15.4 repos:
    mytux:~ # zypper --releasever=15.4 refresh
    

    If there are problems, analyze the reason and eliminate those repositories whose URLs have changed. Re-check the success of the refresh.

  • Step 7: In a terminal window download the new RPMs
    mytux:~ #  zypper --releasever=15.4 dup --download-only --allow-vendor-change
    
  • Step 8: Close the graphical desktop and switch to a TTY (Ctrl-Alt-F1). Stop the display server and then upgrade
    mytux:~ # init 3 
    mytux:~ # zypper --no-refresh --releasever=15.4 dup --allow-vendor-change
    
  • Step 9: Pray and reboot to your upgraded Opensuse Leap 15.4

With the selected repositories discussed below, all these steps could be performed without major problems.

The good news: There were no problems regarding drivers for the Nvidia card (see below), the RAID controllers and my multiple RAID 5 and RAID 10 setups with LVM groups and LVM volumes. No problems were seen regarding Grub2 and systemd.

Which repositories can be directly upgraded via ${releasever}?

Central repositories for the upgrade are

https://download.opensuse.org/update/leap/15.4/oss/
http://download.opensuse.org/update/leap/15.4/backports/
http://download.opensuse.org/update/leap/15.4/sle/

Their 15.3 predecessors must be active. You need to refresh them and replace the “15.3” in their addresses by “${releasever}”. See my post named above for the required steps. Aside from the central repositories, I kept some others active during the upgrade process (with ${releasever}). Worth naming:

https://download.opensuse.org/repositories/LibreOffice:/7.5/openSUSE_Leap_15.4/
https://download.nvidia.com/opensuse/leap/15.4
https://developer.download.nvidia.com/compute/cuda/repos/opensuse15/x86_64
https://ftp.fau.de/packman/suse/openSUSE_Leap_15.4/
https://download.opensuse.org/repositories/mozilla/openSUSE_Leap_15.4/
https://download.opensuse.org/repositories/devel:/languages:/php/openSUSE_Leap_15.4/

These repositories cause no problems as their URLs for 15.3 and 15.4 only differ in the release version, but not in their position in the URL tree below “download.opensuse.org”. However, some other repositories, like those for X2GO, graphics, network, security, have changed their position in the URL resource tree and must be reconfigured after the upgrade – e.g. with the help of YaST or “zypper ar”.

Nvidia driver and the new kernel

On Leap 15.3 I had used the latest Nvidia driver from the usual Nvidia community repository

download.nvidia.com/opensuse/leap/${releasever}

The active RPM was replaced by the one from the Leap 15.4 repo during upgrade resulting in the RPM “nvidia-compute-G06”, version 525.89.02-lp154.5.1. Both the transition to the respective Nvidia kernel module for the new kernel 5.14.21 and the module’s inclusion in the system’s startup procedures by dracut went without any faults.

The upgraded system directly started into an SDDM login screen. There was a mess regarding my screen order (see a forthcoming post), but this did not hinder a proper login. Afterward KDE/Plasma (in combination with Xorg) started on the three screens attached to my PC. Though I feel that the startup of KDE/Plasma on Xorg takes a bit more time than on Leap 15.3, I could see no major problems. GLX applications were running as expected.

VMware Workstation 16.2.5 works!

As I already saw with a Leap 15.4 installation on a laptop: VMware WS 16.2.5 works with Leap 15.4 and kernel 5.14.21. Actually, version 16.2.3 is already compatible.

KVM/Qemu works with installed virtual systems!

The transition of KVM, qemu and libvirtd went smoothly. virt-manager works and starts installed virtual systems as expected:

KDE – plasma-systemmonitor and related widgets show no information on the Nvidia graphics card and incomplete information on network devices

At least on my system with KDE Plasma 5.24.4 (KDE Frameworks 5.90) I saw major deficits regarding apps and widgets which should monitor the status of system resources:

  • Problem 1 – no information about the status of the Nvidia GPU: Neither the application “systemmonitor” nor related widgets can deliver information about the status of the Nvidia card (temperature, VRAM usage, fan) – with the (useless) exception of the type of the card. See the image in the next section. (Note that lm_sensors does not help in this case, as the HW sensors did not detect the Nvidia card either.)
  • Problem 2 – no information about virtual network devices: In addition, the KDE/Plasma widgets were not able to display all data for network interfaces correctly. E.g., interfaces like lo, bridges and other virtual devices were not even listed. Thus, on a more complex configuration with virtualization you will not see your IP addresses with the Plasma tools.

Regarding the second point: The image below shows that the dialog to set up network monitoring only offers the selection of an active physical device, in my case eth1, but none of the virtual devices which are available, too.

And here are the devices which nmon lists:

These are clearly Plasma problems as other basic Linux and desktop applications perform much better for the named resources. Someone who works with Machine Learning and with virtualized Linux installations needs to watch the GPU (fans and temperature) and traffic across network devices – virtualized or real. This does not seem to be something we can get from fancy KDE/Plasma widgets today. So, we have to look out for alternatives.

Alternatives to watch the temperature of Nvidia GPUs

One prominent example which covers more than the present KDE widgets for system monitoring is the good old “gkrellm“. I love it, really! Compact, with a lot of already integrated default “sensors” and compatible with lm_sensors! And it delivers the GPU temperature.

Another desktop tool with a somewhat old-fashioned presentation of graphics is psensor. You can get a binary for Leap 15.4 from the “home” repository of plasmaregataos.

https://download.opensuse.org/repositories/home:/plasmaregataos/15.4/

It is interesting how much this tool finds out about the Nvidia card (where the Plasma tools fail totally).

And then we have, of course, Nvidia’s own tools: nvidia-settings (graphical) and nvidia-smi (ASCII):

Note that the command to periodically update the nvidia-smi information is:

watch -n0.1 nvidia-smi

Alternatives to watch the data traffic across all network devices – real and virtual

Once again I have to name gkrellm. It provides a lot of information on real and virtual network interfaces.

Regarding graphical information on virtual network interfaces we can also use a relatively old KDE tool, namely ksysguard. Besides real network interfaces it detects bridges and KVM- or VMware-based virtual devices and tracks their load.

A hint for those who worry about the discrepancy with the screenshot for nmon above: Some devices which nmon showed were not available at the time of the ksysguard screenshot, as certain KVM virtual machines and networks had not yet been started. Later, ksysguard detects the other devices, too:

Based on ncurses, you can also use the network part of “nmon” to get information on transfer rates across all defined network interfaces. If you need to go more into detail, a very good ASCII tool is “iptraf-ng“. “iftop” is also cool, but to watch multiple interfaces in parallel you have to run “iftop -i <interface>” in multiple terminal windows.

So, dear Plasma developers: When can we expect fancy Plasma applications or widgets which reproduce and combine the “sensoric” abilities of gkrellm, psensor, ksysguard and iptraf-ng?

Conclusion

The upgrade from Leap 15.3 to Leap 15.4 posed no major problems on my PC. A relatively old VMware WS 16.2.x still worked and there were no problems regarding KVM/qemu. The new Nvidia driver could be compiled without any problems for the new kernel during upgrade and was directly integrated into the system by dracut. The new KDE/Plasma version started without problems on Xorg. Deficits of Plasma widgets for monitoring system devices were obvious; but this is not something we can blame Opensuse for.

In the next post of this series

Opensuse Leap 15.4 on a PC – II – Plasma, Gnome, flatpak, Libreoffice and others on (X)Wayland?

I will have a look at KDE/Plasma started on Xwayland.

Happy working with Opensuse Leap 15.4 and stay tuned …


Tensors in ML are not tensors used in theoretical physics

People who have a physics education and want to start with Machine Learning [ML] often stumble across the use of the term “tensor” in typical Machine Learning frameworks. For a physicist the description of a “tensor” in ML appears strange from the very beginning – and one asks oneself: Have I missed something? Is there a secret relation and mapping between the term in physics and the term in ML?

Guys, let me warn you: The expression “tensor” in theoretical and mathematical physics has a (very) different meaning than the one ML people use in their frameworks. To say it clearly:

Tensors used in physics (e.g. in General Relativity or QFT) are not the same as tensors in ML.

For historical reasons one could even say:
The central term “tensor” in ML (e.g. by Google in Tensorflow) represents a kind of misuse of the original term and related framework developed by Ricci.

I admit that the last time I worked professionally in physics and astrophysics was three decades ago. But I shall describe the major properties of physical tensors below as well as I remember them from that time. For people interested in a quick refresher on the topic I recommend the book “Tensors made easy”, 6th edition, 2018, ISBN 978-1-326-29253-9.

Tensors in theoretical physics

Physicists regard tensors as multi-linear forms defined on a multidimensional (metrical) vector space: a tensor is a function which is linear in each of its arguments, and the number of arguments it accepts is called its ‘rank’. A tensor takes a certain number of vectors or covectors (1-forms, which basically introduce a scalar product on the vector space) as arguments. Most importantly:

Tensor objects follow certain rules regarding their component transformation under a change of the base vectors of the vector space. Components of tensors transform either in a contravariant or covariant form with respect to the (linear) change of base vectors.
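
For readers who want it compact, here is a sketch in standard textbook notation (my own summary, not a quote from any of the sources named here). A tensor of type (p, q) is a multi-linear map, and its components transform contravariantly (upper indices) and covariantly (lower indices) under a change of basis:

    % A type-(p,q) tensor as a multi-linear map on a vector space V:
    \[
    T:\; \underbrace{V^{*}\times\dots\times V^{*}}_{p}\times
         \underbrace{V\times\dots\times V}_{q}\;\longrightarrow\;\mathbb{R}
    \]
    % Under a change of basis e'_j = S^i{}_j e_i the components transform as
    \[
    T'^{\,i_1\dots i_p}{}_{j_1\dots j_q}
      = (S^{-1})^{i_1}{}_{k_1}\cdots(S^{-1})^{i_p}{}_{k_p}\,
        S^{l_1}{}_{j_1}\cdots S^{l_q}{}_{j_q}\;
        T^{\,k_1\dots k_p}{}_{l_1\dots l_q}
    \]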

Tensor fields can be defined on differentiable, curved, affinely connected (metrical) manifolds over R^n. The description of such manifolds requires coordinate systems and the terms of differential geometry. Tensor equations keep their form under a transformation of the coordinate system (by differentiable transformation functions defining a Jacobi matrix between old and new coordinates at each point). In the case of differential equations this requires the introduction of so-called covariant derivatives, which allows certain tensor relations (including those based on differential operations) to keep their form under coordinate transformations.
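
As a sketch of what such a covariant derivative looks like in standard notation (with the Christoffel symbols Γ encoding the affine connection):

    % Covariant derivative of a contravariant vector field v^i along x^j:
    \[
    \nabla_{j}\, v^{i} \;=\; \partial_{j}\, v^{i} \;+\; \Gamma^{i}{}_{jk}\, v^{k}
    \]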

So, tensors in mathematical physics are bound to transformation properties regarding the change of the base vectors of underlying vector spaces and the change of coordinate systems in complex geometries. Tensors in physics describe properties of physical objects independently of the choice of a specific coordinate system in space-time or in other finite- or infinite-dimensional vector spaces such as Hilbert spaces. The reference of the tensor definition to multi-linear forms, vectors and co-vectors reveals their complex structure and properties.

Note: A physical tensor has n^rank components, where n is the dimension of the underlying vector space. E.g., a rank-2 tensor over 4-dimensional space-time has 4^2 = 16 components. The rank and the vector space define the number of components.

Tensors in ML => ML-tensors

Tensors in Machine Learning are, first of all, variants of multidimensional arrays. To distinguish them clearly from tensors in mathematical physics I will call them “ML-tensors” below.

I quote from the documentation of tensorflow:

“Tensors are multi-dimensional arrays with a uniform type (called a dtype). You can see all supported dtypes at tf.dtypes.DType. If you’re familiar with NumPy, tensors are (kind of) like np.arrays. All tensors are immutable like Python numbers and strings: you can never update the contents of a tensor, only create a new one.”

(Highlighting was done by me.)
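
A minimal sketch of what these quoted properties look like in practice (assuming Tensorflow 2 is installed; the variable names are my own):

    import tensorflow as tf

    t = tf.constant([[1.0, 2.0, 3.0],
                     [4.0, 5.0, 6.0]])   # a rank-2 ML-tensor

    print(t.dtype)    # uniform type of all elements: float32
    print(t.shape)    # (2, 3)

    # Immutability: operations do not modify t, they return new tensors
    t2 = t + 1.0
    print(t is t2)    # False - a new tensor object was created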

An introductory book to PyTorch – “PyTorch for Deep Learning”, Ian Pointer, 2021, O’Reilly – expresses the following opinion (freely summarized from the German text version): A tensor is a container for numbers. It is also accompanied by a set of rules which define transformations between tensors. The easiest way to understand a tensor in ML is probably to think of it as a multidimensional array.

Arrays simply allow for a specific “multidimensional” organisation of data to describe some objects of interest by basically independent variables. The important point is that these variables get ordered with respect to a few central, often apparent aspects of the object. In case of a picture two such aspects might be the spatial dimensions in terms of width and height.

ML-tensors thus have axes or dimensions of the array which describe the organization of their components in a kind of multidimensional structure. Each axis symbolizes one major aspect. Along each axis we allow for a number of distinct, indexed positions fixing the “size” of this dimension. Actually, an axis corresponds to an indexed list of discrete elements; it should not be confused with an axis describing a dimension of a continuous spatial geometry mapped to R^N. The axes of an array define a discrete multidimensional lattice of points at defined distances.

The number of dimensions is the so-called rank of an ML-tensor. The sizes of all individual dimensions are gathered in a tuple called the shape.

An ML-tensor of rank 3 can be thought of as an arrangement of numbers (components) in a 3-dimensional lattice. However, the shape of this lattice can show a different number of elements for each of the three dimensions. I.e., an array of shape (50, 2, 16) can be an ML-tensor with values for 50 x 2 x 16 = 1600 logically independent variables.

Think of a color image with a rectangular form and a certain pixel resolution (100 x 50 px). We may arrange the RGB pixel values in the form of an array with shape (100, 50, 3). Here we would have 15000 logically independent variables to characterize our object.
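
The element counts named above are easy to verify with NumPy (a sketch using the example shapes from the text):

    import numpy as np

    # Rank-3 ML-tensor with shape (50, 2, 16): 50 * 2 * 16 = 1600 elements
    a = np.zeros((50, 2, 16))
    print(a.ndim, a.shape, a.size)        # 3 (50, 2, 16) 1600

    # RGB image of 100 x 50 pixels: shape (100, 50, 3) => 15000 values
    img = np.zeros((100, 50, 3), dtype=np.uint8)
    print(img.ndim, img.shape, img.size)  # 3 (100, 50, 3) 15000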

A rank alone obviously does not define the number of elements of an ML-tensor – we need the shape in addition.

Note also that the organization of the elements of an ML-tensor can be changed if required – one can e.g. transform a tensor of rank 3 with 1600 elements into one of rank 1 with the same number of elements in a certain linearly ordered way. Such an ordered reorganization of a tensor’s elements into a tensor of different rank corresponds in NumPy to reshaping an array.
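
In NumPy such an ordered reorganization is a one-liner (sketch):

    import numpy as np

    a = np.arange(1600).reshape(50, 2, 16)  # rank-3 tensor, 1600 elements
    b = a.reshape(-1)                       # rank-1 tensor, same 1600 elements
    print(a.shape, b.shape)                 # (50, 2, 16) (1600,)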

ML-tensors can, in the same way as arrays, be subject to certain algebraic operations with other ML-tensor objects. For such operations the two involved tensors must obey certain rules regarding their shapes. There is a partial overlap of such operations with those used in Linear Algebra for linear mappings between objects of vector spaces of (different) finite dimensions. So, it is no wonder that libraries for linear algebra dominate the field of ML calculations.
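
A typical example of such a shape rule is the matrix product of two rank-2 ML-tensors, where the inner dimensions of the operands must match (sketch):

    import numpy as np

    A = np.random.rand(3, 4)   # shape (3, 4)
    B = np.random.rand(4, 2)   # shape (4, 2) - inner dimension 4 matches
    C = A @ B                  # matrix product, a linear algebra operation
    print(C.shape)             # (3, 2)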

Some of the required operations can be performed very fast and distributed over a series of processing units (or cores) which can work in parallel.

Note that the layers of modern Artificial Neural Networks [ANNs] transform ML-tensors into ML-tensors. A change of rank and of the number of elements is possible during the operations of certain layers of ANNs.
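
A hypothetical toy model may illustrate this (a sketch assuming Tensorflow 2 / Keras; the input shape is just the image example from above): a Flatten layer reduces the per-sample rank from 3 to 1 before a Dense layer changes the number of elements.

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(100, 50, 3)),  # rank-3 tensor per sample
        tf.keras.layers.Flatten(),                  # -> rank-1 tensor: (15000,)
        tf.keras.layers.Dense(10),                  # -> rank-1 tensor: (10,)
    ])
    model.summary()   # shows the shape transformations layer by layer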

Conclusion

ML-tensors simply are a clever arrangement of data describing objects of interest in the form of a multidimensional array. ANNs transform such ML-tensors and a change of the rank during such transformations is very common.

Tensors in physics are complex objects corresponding to multi-linear forms defined on vector spaces. Their components must fulfill certain transformation properties under linear transformations of the base vectors and, in the case of tensor fields on affine and metrical manifolds, also under changes of the coordinate system. Tensors keep their rank under such basic transformations.

Summary: ML-tensors are very different objects compared to tensors in mathematical physics. One should not mix them up or confuse them.

If any of my experienced friends in physics has found a reasonable mapping of ML-tensors to multi-linear forms, please send me an email.