Reference: http://on-demand.gputechconf.com/gtc/2014/presentations/S4253-tools-tips-for-managing-a-gpu-cluster.pdf

Initializing a GPU in runlevel 3

Most clusters operate at runlevel 3, so you should initialize the GPU explicitly in an init script.

At minimum:
- Load the kernel modules: nvidia, plus nvidia_uvm (in CUDA 6)
- Create the device nodes with mknod

Optional steps:
- Configure compute mode
- Set driver persistence
- Set power limits

Set GPU power limits

- Power consumption limits can be set with NVML/nvidia-smi
- Set on a per-GPU basis
- Useful in power-constrained environments
- nvidia-smi -pl <power in watts>
- Settings don't persist across reboots, so set this in your init script
- Requires driver persistence

======================================

Edit the init script:

    sudo vi /etc/init.d/after.local

Add:

    /usr/bin/nvidia-smi -c 3
    /usr/bin/nvidia-smi -pm 1
    /usr/bin/nvidia-smi -pl 120

The -pl line is optional; it was used on a T5500 with a GTX970 (120 watt limit) due to power constraints.

Make the script executable:

    sudo chmod 755 /etc/init.d/after.local
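The steps listed above (load modules, create device nodes, then apply the optional nvidia-smi settings) can be sketched as a single shell function. This is a minimal sketch, not the slides' own script: the names `run`, `gpu_init` and `DRY_RUN` are hypothetical. The `/sbin/modprobe` path, the `666` node permissions, and the device numbers (character major 195 for the NVIDIA driver, minor 255 for `nvidiactl`) are assumptions not stated in the original. The `/dev/nvidia-uvm` node is left out because its major number is assigned dynamically and has to be looked up in `/proc/devices`.

```shell
#!/bin/sh
# Sketch only: GPU initialization for runlevel 3.
# DRY_RUN=1 (the default here) prints each command instead of running it.
# Set DRY_RUN=0 and run as root to actually apply the settings.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: execute the command, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Hypothetical entry point: gpu_init GPU_COUNT [POWER_LIMIT_WATTS]
gpu_init() {
    gpu_count="$1"
    power_limit="${2:-}"

    # Required: load the kernel modules (nvidia-uvm applies from CUDA 6).
    run /sbin/modprobe nvidia
    run /sbin/modprobe nvidia-uvm

    # Required: one device node per GPU, plus the control node.
    # Assumption: character major 195, minor 255 for nvidiactl.
    i=0
    while [ "$i" -lt "$gpu_count" ]; do
        run mknod -m 666 "/dev/nvidia$i" c 195 "$i"
        i=$((i + 1))
    done
    run mknod -m 666 /dev/nvidiactl c 195 255

    # Optional: compute mode, then driver persistence.
    run /usr/bin/nvidia-smi -c 3
    run /usr/bin/nvidia-smi -pm 1

    # Optional: power limit in watts; requires persistence to be enabled first.
    if [ -n "$power_limit" ]; then
        run /usr/bin/nvidia-smi -pl "$power_limit"
    fi
}
```

Appending a call such as `gpu_init 1 120` prints the commands that would run for one GPU with a 120 watt limit; with `DRY_RUN=0` and root privileges the same call executes them. Leaving out the second argument skips the power limit step.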