Opensuse Leap 15.5 – installation of CUDA 12.3 for Machine Learning

Working with Machine Learning and Deep Neural Networks requires not only GPU drivers but, in the case of Nvidia GPUs, also an installation of CUDA and cuDNN. This process is always a bit tricky, as additional environment variables have to be set for IPython-based Jupyterlab or the classic Jupyter Notebook. On an Opensuse system one must, in addition, take care of the right settings in /etc/alternatives.
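A quick way to see what a Jupyter kernel actually inherits is to inspect its environment from within Python. The following is only a minimal sketch under assumptions: the post above does not name the variables, so LD_LIBRARY_PATH, PATH and the root directory /usr/local/cuda are assumed here as commonly used values, and the function name cuda_env_report is purely illustrative.

```python
import os


def cuda_env_report(env, cuda_root="/usr/local/cuda"):
    """Summarize CUDA-related settings found in an environment mapping.

    cuda_root is an assumed default; adjust it to your installation.
    """
    lib_dir = os.path.join(cuda_root, "lib64")
    bin_dir = os.path.join(cuda_root, "bin")
    ld_paths = [p for p in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if p]
    paths = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    return {
        "lib_dir_in_LD_LIBRARY_PATH": lib_dir in ld_paths,
        "bin_dir_in_PATH": bin_dir in paths,
        # realpath follows symbolic links, e.g. links routed via /etc/alternatives
        "cuda_root_resolves_to": os.path.realpath(cuda_root),
    }


if __name__ == "__main__":
    for key, value in cuda_env_report(os.environ).items():
        print(f"{key}: {value}")
```

Run in a notebook cell, this shows whether the kernel sees the same library path as your login shell and to which directory the CUDA root finally resolves.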

I have described the necessary steps in a post at “machine-learning.anracom.com”.

I hope this helps people who want to use Leap 15.5 for Machine Learning with Nvidia GPUs, Keras/Tensorflow 2 and Jupyterlab.

Important addendum 01/27/2024:
Although the combination of CUDA 12.3, cuDNN 8.9.7, Tensorflow 2.15 and Nvidia driver 545.29.06 works for AI models, there is another major problem:
Nvidia’s driver 545.29.06 is buggy, at least on Leap 15.5 with KDE/Plasma and multiple screens. The bug affects Suspend-to-RAM. The suspend phase itself seems to work, and the system also comes up again with the KDE/Plasma interface in a seemingly proper state on your screens.

However, the problems begin when you want to change to another virtual screen via Ctrl-Alt-Fx. You wait and wait and wait … The same happens when you change the run-level or systemd target, or when you want to shut the system down. This makes Suspend-to-RAM with driver 545.29.06 impossible to use.

Recommendation:
If you have a working older Nvidia driver (e.g. a stable 535 version), do not change to 545.29.06. Unfortunately, it is a mess to return to an older driver version on a multiscreen Leap 15.5 system. The Nvidia community repository does not offer you a choice. (Why, by the way?) Downloading an older proprietary driver from Nvidia and trying to install it afterward on a console terminal (after having stopped X11 or Wayland) did not work in my case: the screens displaying the terminal changed their resolution and froze afterward. So, you may have to uninstall the present driver 545 completely, go back to standard VGA and then try to install an older driver via Nvidia’s install mechanism. As I said: It is a mess …
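Before or after an update it can help to check which driver version is actually loaded. The following is a minimal sketch, assuming that the loaded Nvidia kernel module reports its version in /proc/driver/nvidia/version; the helper names parse_driver_version and is_known_problematic are hypothetical, and only 545.29.06 is flagged because that is the only version discussed above.

```python
import re

# The only version reported as problematic in the text above ("06" parses as 6).
KNOWN_PROBLEMATIC = (545, 29, 6)


def parse_driver_version(text):
    """Return the first dotted version number in text as a tuple of ints, or None."""
    match = re.search(r"\b(\d+(?:\.\d+)+)\b", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_known_problematic(version):
    """True only for the exact version flagged above."""
    return version == KNOWN_PROBLEMATIC


if __name__ == "__main__":
    try:
        # Assumed location; the file only exists while the nvidia module is loaded.
        with open("/proc/driver/nvidia/version") as handle:
            version = parse_driver_version(handle.read())
    except OSError:
        version = None
    print("driver version:", version)
    print("known problematic:", is_known_problematic(version))
```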

 

Opensuse Leap 15.4 – Problems with Optimus and prime-select after updates of SW packages

Presently, I work a lot on an old laptop which has a so-called Optimus combination of a dedicated Nvidia GPU and an Intel GPU integrated into the main CPU. “prime-select” is a tool which Opensuse includes with Leap 15.4 to provide an efficient way of controlling which GPU shall be used. As well as prime-select worked for me on Leap 15.3, and for some time on Leap 15.4, recent updates of a variety of SW packages led to trouble.

I had the Nvidia card active before the SW updates. After a cold restart the system no longer started the SDDM display manager on the default systemd target. This happened even when the updates did not directly affect the kernel or the Nvidia kernel modules.

The problem always had to do with bbswitch turning off the Nvidia device when the system switched to the default graphical target. And with the Nvidia graphics device turned off, the Nvidia drivers cannot be loaded.
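To see what bbswitch has done to the card, one can read its status file. This is a minimal sketch, assuming the format described in the bbswitch documentation as far as I know it: a single line in /proc/acpi/bbswitch with the PCI address followed by ON or OFF for the power state of the discrete card. The function names are illustrative only.

```python
def parse_bbswitch_state(text):
    """Return (pci_address, state) from text like '0000:01:00.0 ON', else None."""
    fields = text.split()
    if len(fields) != 2 or fields[1] not in ("ON", "OFF"):
        return None
    return fields[0], fields[1]


def read_bbswitch(path="/proc/acpi/bbswitch"):
    """Read and parse the bbswitch status file; None if missing or unreadable."""
    try:
        with open(path) as handle:
            return parse_bbswitch_state(handle.read())
    except OSError:
        return None


if __name__ == "__main__":
    state = read_bbswitch()
    if state is None:
        print("bbswitch status not available (module not loaded?)")
    else:
        print(f"card {state[0]} is reported as {state[1]}")
```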

So some SW updates led to a change of the configuration that prime-select had set up before the updates. The stupid thing is that it is not so simple to get things working again. Using “init 3” to reach a console on a non-graphical target, followed by “prime-select nvidia” and a subsequent “init 5” on the command line, does not work; the wrong bbswitch actions are not changed that way. You can also turn bbswitch off by “tee /proc/acpi/bbswitch <<< OFF” and then load the Nvidia driver successfully. But switching to the standard graphical target afterward invokes bbswitch again in the wrong way. It is a bit of a mess. The following steps seem to work to get back to normal operation:

  • Step 1: Use “init 3” on a console terminal.
  • Step 2: Use the command “prime-select intel”.
  • Step 3: Restart your system. It should boot now into the graphical target based on the i915 intel GPU driver.
  • Step 4: Ignore any information from a prime-select icon. It shows plainly wrong information, namely that you are using Nvidia.
  • Step 5: Log in as root on a root terminal window. Switch bbswitch off (e.g. by the command given above). Load the Nvidia module by “modprobe nvidia”. Check via lsmod that it is successfully loaded.
  • Step 6: Type in “prime-select nvidia”.
  • Step 7: Log out from your graphical interface.
  • Step 8: Check that SDDM or whatever display manager you use is started without bbswitch shutting down the Nvidia card. Log in with the Nvidia card active.
  • Step 9: Check that the Nvidia driver is still loaded on a root terminal window. Then issue “mkinitrd” and restart your Leap 15.4 system.
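The lsmod checks in steps 5 and 9 can also be scripted. The sketch below assumes the usual layout of /proc/modules (the file lsmod reads), where the first field of each line is the module name; the function names are hypothetical. It only reports the state and changes nothing on the system.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def nvidia_loaded(proc_modules_text):
    """True if a module named exactly 'nvidia' is listed."""
    return "nvidia" in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            text = handle.read()
    except OSError:
        text = ""
    print("nvidia module loaded:", nvidia_loaded(text))
```

Matching the exact name avoids a false positive from related modules such as nvidia_modeset being listed on their own.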

Afterward, the “prime-select intel” and “prime-select nvidia” commands, issued at the command line of a root terminal window and followed by a logout from the graphical desktop and a new login via the restarted display manager, switch correctly between the cards.

However, the prime-select applet gives you wrong information when the Intel card is active. And it does not give you the chance to switch back to the Nvidia card again. It’s stupid, but no major problem as long as the basic prime-select command does its job on the command line.

Hope this helps people having to work with Opensuse on an Optimus system.

 

Opensuse Leap 15.4 on a PC – II – Plasma, Gnome, flatpak, Libreoffice and others on (X)Wayland?

In the last post of this series

Upgrade to Opensuse Leap 15.4 – I – a look at repositories, Nvidia, Vmware WS, KVM, Plasma widgets

we saw that an upgrade from Opensuse Leap 15.3 to Leap 15.4 is a relatively smooth operation. After the basic upgrade I wanted to look a bit at an interesting detail, namely (X)Wayland. And I was surprised, both positively and negatively. This post summarizes some of my experiences.

Nvidia and Wayland

For a long time it was almost impossible to use Wayland in combination with Nvidia and KDE, which was mostly the fault of Nvidia. See a Heise article on this topic here. See also the experience of a Gnome user here, although I do not share his bad experience with an X11-based KDE Plasma on Nvidia. For years I have not seen a realistic chance for a productive use of Wayland on my PCs and laptops with Nvidia cards. KDE Plasma did not work at all on Wayland. With Gnome I also experienced terrible difficulties. But Nvidia has improved its support for Wayland significantly in 2022, starting with driver version 470. For Nvidia driver version 525 we would expect some stability.
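When experimenting with Wayland it is easy to lose track of which kind of session one is actually running. The following is a minimal sketch, assuming that the login session exports XDG_SESSION_TYPE and that WAYLAND_DISPLAY and DISPLAY can serve as fallbacks; the function name session_type is illustrative.

```python
import os


def session_type(env):
    """Guess the display session type from an environment mapping."""
    declared = env.get("XDG_SESSION_TYPE", "").lower()
    if declared in ("wayland", "x11"):
        return declared
    # Fallbacks: in a Wayland session XWayland may set DISPLAY as well,
    # so WAYLAND_DISPLAY is checked first.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


if __name__ == "__main__":
    print("session type:", session_type(os.environ))
```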
