HW Acceleration in an LXC Container

Update Proxmox to the latest version
apt update && apt dist-upgrade -y

Reboot after the upgrade
reboot

Install sudo, git, gcc, make, and the kernel header files
apt-get install sudo git gcc make pve-headers-$(uname -r)
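
Optionally, confirm the header package for the running kernel is in place before building the driver (a quick sanity check of my own, not part of the original steps):
dpkg -l pve-headers-$(uname -r)
ls -d /lib/modules/$(uname -r)/build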

Install the latest NVIDIA driver
Using the driver list from GitHub - keylase/nvidia-patch as a reference for patch-supported versions.

mkdir /opt/nvidia
cd /opt/nvidia
wget https://download.nvidia.com/XFree86/Linux-x86_64/418.56/NVIDIA-Linux-x86_64-418.56.run
chmod +x NVIDIA-Linux-x86_64-418.56.run
./NVIDIA-Linux-x86_64-418.56.run --no-questions --ui=none --disable-nouveau

The installer will create /etc/modprobe.d/nvidia-installer-disable-nouveau.conf and disable the nouveau driver. Verify this by checking the contents of the created .conf file:

more /etc/modprobe.d/nvidia-installer-disable-nouveau.conf

#generated by nvidia-installer
blacklist nouveau
options nouveau modeset=0

Reboot to disable nouveau drivers
reboot
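
After the reboot, you can confirm nouveau is no longer loaded (my addition, not in the original steps):
lsmod | grep nouveau
No output means nouveau is out of the way.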

Run the NVIDIA installer again; it will now complete the driver install.
/opt/nvidia/NVIDIA-Linux-x86_64-418.56.run --no-questions --ui=none

Check that nvidia-smi works now
nvidia-smi


+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 980 Ti  Off  | 00000000:43:00.0 Off |                  N/A |
| 23%   61C    P0    72W / 275W |      0MiB /  6075MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
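
If you need just the driver version for scripting later steps, such as building the download URL, nvidia-smi can print it on its own (my addition):
nvidia-smi --query-gpu=driver_version --format=csv,noheader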

Create/update the modules.conf file for boot
nano /etc/modules-load.d/modules.conf

# /etc/modules-load.d/modules.conf
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
nvidia
nvidia_uvm

Regenerate the initramfs image so it picks up the new modules.conf
update-initramfs -u
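
To load the modules right away instead of waiting for the next reboot (my addition):
modprobe -a nvidia nvidia_uvm
lsmod | grep nvidia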

Create udev rules to set up the device nodes on boot when the nvidia and nvidia_uvm modules load
nano /etc/udev/rules.d/70-nvidia.rules

# /etc/udev/rules.d/70-nvidia.rules
# Create /dev/nvidia0, /dev/nvidia1 … and /dev/nvidiactl when the nvidia module is loaded
KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L'"
#
# Create the CUDA node when nvidia_uvm CUDA module is loaded
KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u'"

Install the NVIDIA driver persistence daemon from GitHub - NVIDIA/nvidia-persistenced
git clone https://github.com/NVIDIA/nvidia-persistenced.git
cd nvidia-persistenced/init
./install.sh


Checking for common requirements...
  sed found in PATH?  Yes
  useradd found in PATH?  Yes
  userdel found in PATH?  Yes
  id found in PATH?  Yes
Common installation/uninstallation supported

Creating sample System V script... done.
Creating sample systemd service file... done.
Creating sample Upstart service file... done.

Checking for systemd requirements...
  /usr/lib/systemd/system directory exists?  No
  /etc/systemd/system directory exists?  Yes
  systemctl found in PATH?  Yes
systemd installation/uninstallation supported

Installation parameters:
  User  : nvidia-persistenced
  Group : nvidia-persistenced
  systemd service installation path : /etc/systemd/system

Adding user 'nvidia-persistenced' to group 'nvidia-persistenced'... done.
Installing sample systemd service nvidia-persistenced.service... done.
Enabling nvidia-persistenced.service... done.
Starting nvidia-persistenced.service... done.

systemd service successfully installed.

Double check that the service is running and enabled
systemctl status nvidia-persistenced

nvidia-persistenced.service - NVIDIA Persistence Daemon
   Loaded: loaded (/etc/systemd/system/nvidia-persistenced.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2019-04-04 13:45:44 CDT; 38s ago
  Process: 13356 ExecStart=/usr/bin/nvidia-persistenced --user nvidia-persistenced (code=exited, status=0/SUCCESS)
 Main PID: 13362 (nvidia-persiste)
    Tasks: 1 (limit: 19660)
   Memory: 996.0K
      CPU: 262ms
   CGroup: /system.slice/nvidia-persistenced.service
           └─13362 /usr/bin/nvidia-persistenced --user nvidia-persistenced

Apr 04 13:45:44 ripper systemd[1]: Starting NVIDIA Persistence Daemon...
Apr 04 13:45:44 ripper nvidia-persistenced[13362]: Started (13362)
Apr 04 13:45:44 ripper systemd[1]: Started NVIDIA Persistence Daemon.
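
You can also confirm the driver now reports persistence mode as enabled (my addition):
nvidia-smi -q | grep -i persistence
This should show "Persistence Mode : Enabled".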

Reboot and verify all NVIDIA devices come up
reboot
ls -l /dev/nv*

crw-rw-rw- 1 root root 195,   0 Apr  4 13:49 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Apr  4 13:49 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Apr  4 13:49 /dev/nvidia-modeset
crw-rw-rw- 1 root root 235,   0 Apr  4 13:49 /dev/nvidia-uvm
crw-rw-rw- 1 root root 235,   1 Apr  4 13:49 /dev/nvidia-uvm-tools

Patch the NVIDIA driver to remove the limit on concurrent NVENC encoding sessions
cd /opt/nvidia
git clone https://github.com/keylase/nvidia-patch.git
cd nvidia-patch
./patch.sh

Detected nvidia driver version: 418.56
Attention! Backup not found. Copy current libnvcuvid.so to backup.
751706615c652c4725d48c2e0aaf53be1d9553d5  /opt/nvidia/libnvidia-encode-backup/libnvcuvid.so.418.56
ee47ac207a3555adccad593dbcda47d8c93091c0  /usr/lib/x86_64-linux-gnu/libnvcuvid.so.418.56
Patched!
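
The patch modifies the installed driver libraries, so you will need to re-run it after every driver upgrade. If you ever need to undo it, the script keeps the backup shown above and (to the best of my knowledge) supports a rollback flag; check ./patch.sh -h for the current options:
./patch.sh -r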

In Proxmox, create a new LXC container or edit an existing container's conf file (in my case, container 100) and add the cgroup device allowances and mount entries to the end of the .conf file
nano /etc/pve/lxc/100.conf

lxc.cgroup.devices.allow: c 195:* rwm
lxc.cgroup.devices.allow: c 235:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file

Note 1: the two device major numbers above (195 and 235) come from the ls -l /dev/nv* output earlier

Note 2: In some cases the major number for /dev/nvidia-uvm and /dev/nvidia-uvm-tools will change on reboot. In my case it toggles between 235 and 511. If you find this happening, add every number you see occurring to the allow list so you do not have to keep editing the .conf file; a quick way to read the current numbers is shown after the example. For example, mine currently says:
lxc.cgroup.devices.allow: c 195:* rwm
lxc.cgroup.devices.allow: c 235:* rwm
lxc.cgroup.devices.allow: c 511:* rwm
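
A quick way to read the current major numbers after a reboot (my addition; stat reports the major in hex, so printf converts it to decimal):
for d in /dev/nvidia*; do printf '%s: major %d\n' "$d" 0x$(stat -c %t "$d"); done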

Start the LXC container and download the same NVIDIA driver to the container; the version must match the host driver exactly.
Run the installer, but skip the kernel module, since the container uses the host's. The installer asks whether to
install libglvnd because it detects an incomplete installation; I have always chosen “Don’t install libglvnd”.
Everything else I answer yes to, and I ignore the warnings.
mkdir /opt/nvidia
cd /opt/nvidia
wget https://download.nvidia.com/XFree86/Linux-x86_64/418.56/NVIDIA-Linux-x86_64-418.56.run
chmod +x NVIDIA-Linux-x86_64-418.56.run
./NVIDIA-Linux-x86_64-418.56.run --no-kernel-module

Run nvidia-smi to check the installation, and verify that the NVIDIA devices are present.
nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 980 Ti  Off  | 00000000:43:00.0 Off |                  N/A |
|  0%   60C    P8    35W / 275W |      1MiB /  6075MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

ls -l /dev/nv*

crw-rw-rw- 1 nobody nogroup 195, 254 Apr  4 19:20 /dev/nvidia-modeset
crw-rw-rw- 1 nobody nogroup 235,   0 Apr  4 19:20 /dev/nvidia-uvm
crw-rw-rw- 1 nobody nogroup 235,   1 Apr  4 19:20 /dev/nvidia-uvm-tools
crw-rw-rw- 1 nobody nogroup 195,   0 Apr  4 19:20 /dev/nvidia0
crw-rw-rw- 1 nobody nogroup 195, 255 Apr  4 19:20 /dev/nvidiactl
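
As a final check that hardware encoding actually works inside the container, run a short NVENC test (my addition; assumes an ffmpeg build with NVENC support is installed in the container):
ffmpeg -y -f lavfi -i testsrc=duration=5:size=1280x720:rate=30 -c:v h264_nvenc /tmp/nvenc-test.mp4
While it runs, the ffmpeg process should appear in the nvidia-smi process list.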