Docker debian-cuda

Debian 9 with CUDA Toolkit and cuDNN

docker-debian-cuda is a minimal Docker image built from Debian 9 (amd64) with CUDA Toolkit and cuDNN using only Debian packages.

Although the nvidia-docker tool can run CUDA inside Docker images, it uses yet another wrapper command and is based on Ubuntu images. To make the whole process more transparent, we explicitly expose the GPU devices and build from official Debian images. All installations are performed through the Debian package manager, not least because the official Nvidia CUDA Toolkit installer does not support Debian without hacks.

Available tags:

  • 8.0.44-2_5.1.5-1_375.20-4, 8.0_5.1, latest [2016-12-21]: CUDA Toolkit (8.0.44-2) + cuDNN (5.1.5-1) + CUDA library (375.20-4) (Dockerfile)
  • 7.5.18-4_5.1.3_361.45.18-2, 7.5_5.1 [2016-09-19]: CUDA Toolkit (7.5.18-4) + cuDNN (5.1.3) + CUDA library (361.45.18-2) (Dockerfile)
  • 7.5.18-2 [2016-07-20]: CUDA Toolkit (7.5.18-2) + cuDNN (4.0.7) + CUDA library (352.79-8)

Usage

Host system requirements (e.g. Debian 8 or 9):

  • CUDA-capable GPU
  • nvidia-kernel-dkms (same version as the CUDA library in the image)
  • optionally nvidia-cuda-mps and nvidia-smi
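On the host, these requirements can typically be installed from Debian's non-free repository. A minimal sketch, assuming non-free is already enabled in your APT sources and you are running as root:

```shell
# Install the Nvidia kernel module (built via DKMS) and the optional
# utilities. Package names are as in the Debian 8/9 non-free section.
apt-get update
apt-get install -y nvidia-kernel-dkms nvidia-smi nvidia-cuda-mps
```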

To utilize your GPUs, this Docker image needs access to your /dev/nvidia* devices, e.g.:

$ docker run -it --rm $(ls /dev/nvidia* | xargs -I{} echo '--device={}') gw000/debian-cuda
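The subshell above simply turns each device file into a --device= flag for docker run. A sketch of that expansion, using a hypothetical device list in place of the real /dev/nvidia* entries:

```shell
# Hypothetical device list standing in for the host's real /dev/nvidia* files.
devices='/dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm'
# Turn each device path into a --device= flag, exactly as the subshell does.
flags=$(printf '%s\n' $devices | xargs -I{} echo '--device={}')
echo $flags
# --device=/dev/nvidia0 --device=/dev/nvidiactl --device=/dev/nvidia-uvm
```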

Host system

List of devices that should be present on the host system:

$ ls -l /dev/nvidia*
crw-rw---- 1 root video 250,   0 Jul 13 15:56 /dev/nvidia-uvm
crw-rw---- 1 root video 250,   1 Jul 13 15:56 /dev/nvidia-uvm-tools
crw-rw---- 1 root video 195,   0 Jul 13 15:56 /dev/nvidia0
crw-rw---- 1 root video 195, 255 Jul 13 15:56 /dev/nvidiactl

In case /dev/nvidia0 and /dev/nvidiactl are not present, ensure that the kernel module nvidia is loaded automatically and that a udev rule exists to create the devices:

$ echo 'nvidia' > /etc/modules-load.d/nvidia.conf
$ cat > /etc/udev/rules.d/70-nvidia.rules << __EOF__
KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 0660 /dev/nvidia* && /bin/chgrp video /dev/nvidia*'"
__EOF__

For OpenCL support the devices /dev/nvidia-uvm and /dev/nvidia-uvm-tools are needed. Ensure the kernel module nvidia-uvm is loaded automatically, and add a custom udev rule to create the devices:

$ echo 'nvidia-uvm' > /etc/modules-load.d/nvidia-uvm.conf
$ cat > /etc/udev/rules.d/70-nvidia-uvm.rules << __EOF__
KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 0660 /dev/nvidia-uvm* && /bin/chgrp video /dev/nvidia-uvm*'"
__EOF__

If you would like to monitor real-time temperatures on your host system, use something like:

$ watch -n 5 'nvidia-smi; echo; sensors; for hdd in /dev/sd?; do echo -n "$hdd  "; smartctl -A $hdd | grep Temperature_Celsius; done'

If your Nvidia kernel driver and CUDA library versions differ, an error will appear in the kernel messages (dmesg) or when running nvidia-smi inside the container. Possible solutions:

  • upgrade your Nvidia kernel driver on the host directly from Debian 9 packages: nvidia-kernel-dkms, nvidia-alternative, libnvidia-ml1, nvidia-smi
  • upgrade your Nvidia kernel driver on the host by compiling it yourself
  • inject the correct version of CUDA library into the container (if it is installed on the host) with:
$ docker run -it --rm $(ls /dev/nvidia* | xargs -I{} echo '--device={}') $(ls /usr/lib/x86_64-linux-gnu/libcuda.* | xargs -I{} echo '-v {}:{}:ro') gw000/debian-cuda
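The second subshell in the last command works the same way as the device expansion, mapping each libcuda file found on the host to a read-only bind mount at the same path inside the container. A sketch of the expansion, with hypothetical library file names:

```shell
# Hypothetical CUDA library files standing in for the host's real ones.
libs='/usr/lib/x86_64-linux-gnu/libcuda.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so.375.20'
# Turn each file into a read-only -v host:container bind-mount flag.
mounts=$(printf '%s\n' $libs | xargs -I{} echo '-v {}:{}:ro')
echo $mounts
```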

Feedback

If you encounter any bugs or have feature requests, please file them in the issue tracker or, better yet, implement the change yourself and submit a pull request on GitHub.

License

Copyright © 2016 gw0 [http://gw.tnode.com/] <>

This project is licensed under the GNU Affero General Public License 3.0+ (AGPL-3.0+). Note that the license requires that all modifications and the complete source code be made publicly available to any user.