Leap 15.6, Nvidia driver 570 – resume from suspend to RAM not working / workaround

Hint: After some experiments and further Internet digging, this post was rewritten and supplemented on the 5th of March, 2025. Sorry for any inconvenience.
—–

Recently, I upgraded Opensuse Leap to version 15.6 on five PC systems, all with (different) Nvidia graphics cards. I use KDE/Plasma on all of these systems.

My daily working system is equipped with an Nvidia 4060 Ti card. The Nvidia drivers of version 570.124.06-1 on this particular system came from the Nvidia CUDA repository for Opensuse systems at

https://developer.download.nvidia.com/compute/cuda/repos/opensuse15/x86_64.

I am sorry to say that this particular driver, and also the present Nvidia drivers of version 570.86.16 on other systems, are unreliable or even buggy (for KDE/Plasma), at least in their cooperation with the Linux kernel (6.4.0) and other components of the present Leap 15.6:

The resume process from “Suspend to RAM” does not work reliably on any of the systems.


NUMA node error for Nvidia cards on Linux PCs

You may have experienced it in various contexts: CUDA, Tensorflow, gaming applications or complex 3D graphics applications may warn you that your Nvidia card is associated with an unexpected negative NUMA value. The warning often refers to a value of “-1”, and the clever application then replaces this value with a default value of “0”.

The problem is particularly annoying when dealing with Machine Learning, e.g. in Jupyter notebooks. There, the warnings may repeatedly clutter the output of some cells, e.g. during the setup of the graphics card for some ML experiments.

Besides the question why the Nvidia drivers for Linux and/or CUDA drivers do not fix this problem by detecting just one NUMA node on the system and setting the value for the card to “0”, the question for us users is how we can get rid of the warnings.

A basic idea is that we set the right value by ourselves. I have described this simple measure in the sister blog, which unfortunately still is under construction. See:
Setting NUMA node to 0 for Nvidia cards on standard Linux PCs.

There I also briefly discuss what NUMA is basically meant for, and why it normally does not affect consumer PCs.
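As a rough illustration of the idea (a sketch under stated assumptions, not necessarily the exact procedure of the sister post): on Linux, each PCI device has a numa_node file below /sys/bus/pci/devices/<address>/, and Nvidia devices carry the PCI vendor ID 0x10de. The sketch below lists the Nvidia devices with their NUMA value and can write "0" to those that report a negative value. The function names and the sysfs_root parameter are my own illustrative choices. Writing requires root rights, and the value is reset at the next reboot.

```python
from pathlib import Path

NVIDIA_VENDOR_ID = "0x10de"  # PCI vendor ID of Nvidia


def find_nvidia_numa_nodes(sysfs_root="/sys/bus/pci/devices"):
    """Return {pci_address: numa_node} for all Nvidia PCI devices."""
    root = Path(sysfs_root)
    result = {}
    if not root.is_dir():
        return result  # no PCI sysfs tree available (e.g. in a container)
    for dev in sorted(root.iterdir()):
        vendor_file = dev / "vendor"
        numa_file = dev / "numa_node"
        if not (vendor_file.is_file() and numa_file.is_file()):
            continue
        if vendor_file.read_text().strip().lower() != NVIDIA_VENDOR_ID:
            continue
        result[dev.name] = int(numa_file.read_text().strip())
    return result


def fix_negative_numa_nodes(sysfs_root="/sys/bus/pci/devices", node=0):
    """Write `node` to every Nvidia device that reports a negative NUMA value.

    Returns the list of PCI addresses that were changed.
    """
    fixed = []
    for address, value in find_nvidia_numa_nodes(sysfs_root).items():
        if value < 0:
            (Path(sysfs_root) / address / "numa_node").write_text(f"{node}\n")
            fixed.append(address)
    return fixed


if __name__ == "__main__":
    # Read-only listing; call fix_negative_numa_nodes() as root to change values.
    for address, value in find_nvidia_numa_nodes().items():
        print(address, value)
```

Note that the audio function of an Nvidia card has the same vendor ID and will therefore show up in the list as well.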

 

Opensuse Leap 15.5 – installation of CUDA 12.3 for Machine Learning

Working with Machine Learning and Deep Neural Networks not only requires GPU drivers, but in the case of Nvidia GPUs also the installation of CUDA and cuDNN. This process is always a bit tricky, as additional environment variables have to be set for IPython-based Jupyterlab or classic Jupyter Notebook. On an Opensuse system one must, in addition, take care of the right settings in /etc/alternatives.

I have described the necessary steps in a post at “machine-learning.anracom.com”.
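Independent of the detailed steps, a quick sanity check can be run in a Jupyter cell to see what the Python kernel actually inherits. The following is only a sketch under assumptions: it assumes that /usr/local/cuda is a symbolic link maintained via /etc/alternatives, and that the setup uses the commonly seen variables CUDA_HOME, LD_LIBRARY_PATH and PATH. Neither the variable names nor the function name describe_cuda_setup come from the post referenced above; your installation may use different ones.

```python
import os
from pathlib import Path


def describe_cuda_setup(environ=None, cuda_link="/usr/local/cuda"):
    """Collect a few facts about the CUDA setup visible to this process."""
    env = os.environ if environ is None else environ
    link = Path(cuda_link)

    def cuda_entries(variable):
        # Keep only those search path entries that mention "cuda".
        entries = [p for p in env.get(variable, "").split(os.pathsep) if p]
        return [p for p in entries if "cuda" in p.lower()]

    return {
        # Where the (assumed) alternatives link finally points to, if present.
        "cuda_link_target": str(link.resolve()) if link.exists() else None,
        "CUDA_HOME": env.get("CUDA_HOME"),
        "cuda_dirs_in_LD_LIBRARY_PATH": cuda_entries("LD_LIBRARY_PATH"),
        "cuda_dirs_in_PATH": cuda_entries("PATH"),
    }


if __name__ == "__main__":
    for key, value in describe_cuda_setup().items():
        print(f"{key}: {value}")
```

If the link target is missing or the lists are empty inside the notebook while they are set in your login shell, the kernel was probably started without the expected environment.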

I hope this helps people who want to use Leap 15.5 for Machine Learning with Nvidia GPUs, Keras/Tensorflow 2 and Jupyterlab.

Important addendum 01/27/2024:
Although the combination of CUDA 12.3, cuDNN 8.9.7, Tensorflow 2.15 and Nvidia drivers 545.29.06 works regarding AI-models, there is another major problem:
Nvidia’s driver 545.29.06 is buggy – at least for Leap 15.5, KDE/Plasma with multiple screens. The bug affects Suspend-to-RAM. Suspend-to-RAM seems to work in the suspend phase, and the system also comes up afterward in a seemingly proper state of your KDE/Plasma interface (on your screens).

However, the problems begin when you want to change to another virtual console via Ctrl-Alt-Fx. You wait and wait and wait … The same happens when you change the run-level or systemd target state, or when you want to shut the system down. This makes Suspend-to-RAM with driver 545.29.06 impossible to use.

Recommendation:
If you have a working older Nvidia driver (e.g. a stable 535 version), do not change to 545.29.06. Unfortunately, returning to an older driver version on a multiscreen Leap 15.5 system is a mess. The Nvidia community repository does not offer you a choice of versions (why, by the way?). Downloading an older proprietary driver from Nvidia and trying to install it afterward on a console terminal (after having stopped X11 or Wayland) did not work in my case: the screens displaying the terminal changed their resolution and froze afterward. So, you may have to uninstall the present driver 545 completely, go back to standard VGA and then try to install an older driver via Nvidia’s install mechanism. As I said: It is a mess …
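Before relying on Suspend-to-RAM, it can help to check which driver version is actually loaded. The sketch below reads the version of the loaded nvidia kernel module and compares it with the versions named on this page as problematic. It assumes that the module exposes its version at /sys/module/nvidia/version; the function names and the version set are illustrative choices of mine, and the set only reflects my own experience, not an official list.

```python
from pathlib import Path

# Driver versions reported on this page as unreliable for Suspend-to-RAM.
REPORTED_PROBLEM_VERSIONS = {
    "545.29.06",   # Leap 15.5
    "570.86.16",   # Leap 15.6
    "570.124.06",  # Leap 15.6
}


def loaded_nvidia_version(version_file="/sys/module/nvidia/version"):
    """Return the version of the loaded nvidia module, or None if not found."""
    path = Path(version_file)
    if not path.is_file():
        return None  # module not loaded or path differs on this system
    return path.read_text().strip()


def is_reported_problematic(version):
    """True if the version is in the set of versions reported above."""
    return version in REPORTED_PROBLEM_VERSIONS


if __name__ == "__main__":
    version = loaded_nvidia_version()
    if version is None:
        print("No loaded nvidia kernel module found.")
    elif is_reported_problematic(version):
        print(f"Driver {version}: Suspend-to-RAM problems were reported.")
    else:
        print(f"Driver {version}: not in the list of reported versions.")
```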