NVIDIA Device Plugin Setup
Talos Linux Setup
Enable NVIDIA kernel modules
Before installing the device plugin, complete the prerequisite steps described in the Talos documentation. Make sure the required NVIDIA system extensions are installed, using a combination of machine config patches and the correct factory image for your use case.
Example gpu-worker-patch.yaml:
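A minimal sketch of such a patch, assuming the module list and sysctl shown in the Talos NVIDIA GPU guide. Check it against the Talos documentation for your Talos version and driver flavor before applying it.

```yaml
# gpu-worker-patch.yaml (sketch; verify against the Talos docs for your version)
machine:
  kernel:
    modules:
      # Kernel modules provided by the NVIDIA system extension
      - name: nvidia
      - name: nvidia_uvm
      - name: nvidia_drm
      - name: nvidia_modeset
  sysctls:
    # Set in the Talos NVIDIA guide alongside the modules above
    net.core.bpf_jit_harden: 1
```

The patch only loads the modules. The modules themselves come from the system extensions baked into the factory image, so the patch has no effect on an image built without them.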
Quick Sanity Check
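If the checks from the Talos documentation (for example `talosctl get extensions` and `talosctl read /proc/modules`) do not list the NVIDIA extensions and kernel modules, the base system is not fully set up: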
Create NVIDIA runtime class:
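A minimal RuntimeClass manifest for this step might look like the following. The name `nvidia` is an assumption chosen to match the handler. The handler `nvidia` is the one registered by the NVIDIA container toolkit extension.

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia      # assumed name; pods reference it via runtimeClassName
handler: nvidia     # container runtime handler provided by the NVIDIA toolkit
```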
Add this runtime class to every pod that needs GPU resources.
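As an illustration, a plain Pod that uses the runtime class could be written as below. The pod name and image tag are assumptions for the sake of the example. The `nvidia.com/gpu` limit can only be scheduled once the device plugin is installed.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test                  # hypothetical name
spec:
  restartPolicy: Never
  runtimeClassName: nvidia        # must match the RuntimeClass name
  containers:
    - name: cuda
      # Assumed image; any CUDA base image that ships nvidia-smi works
      image: nvcr.io/nvidia/cuda:12.5.0-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1       # resource advertised by the device plugin
```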
Adding runtimeClass to pods with common
Create the nvidia-device-plugin namespace and enable privileged pod security
Note: Privileged pod security is only required if you want multiple GPU resources per physical GPU. If a one-to-one mapping of GPU to pod is enough, just create the namespace; it does not need privileges. In that case you will also need to turn off a setting below.
Install nvidia-device-plugin from Kubeapps
The chart's values.yaml contains notes, but the following setting defines how many resources are created per GPU:
Note: If you do not want multiple resources per GPU, set replicas to 1 and change the following line to false.
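One way to express this is a Namespace manifest carrying the standard Pod Security Admission label, sketched here. Drop the label if you only need the one-to-one mapping.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: nvidia-device-plugin
  labels:
    # Allows privileged pods in this namespace (Pod Security Admission)
    pod-security.kubernetes.io/enforce: privileged
```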
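A sketch of what such a values fragment can look like, assuming the upstream NVIDIA k8s-device-plugin Helm chart and its time-slicing configuration format. The replica count of 4 is an arbitrary example, and the chart packaged in your Kubeapps catalog may use different keys, so compare this with the notes in its values.yaml.

```yaml
# Assumes the upstream nvidia-device-plugin chart; key names may differ in other charts
runtimeClassName: nvidia          # run the plugin pods with the NVIDIA runtime
config:
  map:
    default: |-
      version: v1
      sharing:
        timeSlicing:
          resources:
            - name: nvidia.com/gpu
              replicas: 4         # example: each physical GPU is advertised 4 times
```

With `replicas: 4`, a node with one physical GPU would advertise four `nvidia.com/gpu` resources. Pods sharing the GPU this way are time-sliced and get no memory isolation from each other.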