I have a PC with Ubuntu server installed and Nvidia GPUs attached. I have Googled on and off for a while trying to learn how to overclock the GPUs without success.
Finally, after gleaning bits and pieces from different sources, I got power limits and overclocking working.
Here are the steps required.
nvidia-settings needs a monitor attached in order to work. On headless Linux, that means a virtual monitor. After messing with the xorg conf manually for a while, I found and forked andyljones/coolgpus, which automatically attaches a virtual monitor for each GPU.
coolgpus was originally developed to override fan curves, but I run it in debug mode only and rely on the default fan management.
With a virtual monitor attached, you can now use nvidia-smi to set power limits:
sudo nvidia-smi -i 0 -pl 150
This command sets the power limit to 150 watts for the GPU at index 0.
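If you have several GPUs, the same command can be wrapped in a small loop. This is only a sketch: the function name set_power_limit and the DRY_RUN variable are my own inventions for illustration, not part of nvidia-smi. With DRY_RUN=1 it prints the commands instead of running them, so you can check them first.

```shell
#!/bin/sh
# Sketch: apply one power limit (in watts) to several GPU indices.
# set_power_limit and DRY_RUN are hypothetical names, not nvidia-smi features.
set_power_limit() {
    watts="$1"
    shift
    for idx in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only show what would be run.
            echo "sudo nvidia-smi -i $idx -pl $watts"
        else
            sudo nvidia-smi -i "$idx" -pl "$watts"
        fi
    done
}
```

For example, after setting DRY_RUN=1, calling set_power_limit 150 0 1 prints the two commands for GPUs 0 and 1 without touching the hardware.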
You can verify that the power limit is active by running nvidia-smi, which should display the latest power limit.
Another good command to check is:
nvidia-smi -q -d PERFORMANCE
This tells you not only whether the GPU has a software power limit, but also whether GPU performance is currently limited by other factors, such as hardware.
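For a scripted check of the power limit, nvidia-smi can also print selected fields as CSV through its --query-gpu option. The helper below is a rough sketch that compares each GPU's reported limit with the value you expect. The name check_limits is made up for this example, and it assumes input lines of the form "index, limit" as produced by --format=csv,noheader,nounits.

```shell
#!/bin/sh
# Sketch: read "index, power.limit" lines on stdin and compare each limit
# against the expected wattage given as the first argument.
# check_limits is a hypothetical helper name.
check_limits() {
    want="$1"
    awk -F', ' -v want="$want" '{
        # Compare numerically so that 150.00 matches 150.
        status = ($2 + 0 == want + 0) ? "ok" : "MISMATCH"
        print "gpu " $1 ": " $2 " W " status
    }'
}
```

You would feed it with something like: nvidia-smi --query-gpu=index,power.limit --format=csv,noheader,nounits | check_limits 150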
Also with a virtual monitor attached, you can overclock via the nvidia-settings tool:
DISPLAY=:0 nvidia-settings -a '[gpu:0]/GPUMemoryTransferRateOffsetAllPerformanceLevels=350'
I tried to set overclock settings for a specific performance level without any luck; only GPUMemoryTransferRateOffsetAllPerformanceLevels worked for me. The DISPLAY environment variable specifies the GPU you want to overclock, i.e. :0 for the first GPU, :1 for the second, etc.
You can verify that the setting took effect by querying it:
DISPLAY=:0 nvidia-settings -q '[gpu:0]/GPUMemoryTransferRateOffset[2]'
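To apply the same memory offset to every GPU, the command can be looped over the display numbers. This is a sketch under the assumption described above, namely that each GPU sits on its own display (:0, :1, ...) and shows up as gpu:0 on that display. The function name apply_mem_offset and the DRY_RUN variable are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Sketch: set the same memory transfer rate offset on several displays.
# Assumes one GPU per display, addressed as [gpu:0] on each of them.
# apply_mem_offset and DRY_RUN are hypothetical names.
apply_mem_offset() {
    offset="$1"
    shift
    for n in "$@"; do
        attr="[gpu:0]/GPUMemoryTransferRateOffsetAllPerformanceLevels=$offset"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only show what would be run.
            echo "DISPLAY=:$n nvidia-settings -a '$attr'"
        else
            DISPLAY=":$n" nvidia-settings -a "$attr"
        fi
    done
}
```

With DRY_RUN=1 set, apply_mem_offset 350 0 1 prints the commands for displays :0 and :1 so you can review them before running for real.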