You are viewing documentation for Cozystack next, which is currently in beta. For the latest stable version, see the v1.4 documentation.
Running VMs with GPU Passthrough
This section demonstrates how to deploy virtual machines (VMs) with GPU passthrough using Cozystack. First, we’ll deploy the GPU Operator to configure the worker node for GPU passthrough. Then we’ll deploy a KubeVirt VM that requests a GPU.
By default, to provision GPU passthrough, the GPU Operator deploys the following components:
- VFIO Manager to bind the vfio-pci driver to all GPUs on the node.
- Sandbox Device Plugin to discover and advertise the passthrough GPUs to kubelet.
- Sandbox Validator to validate the other operands.
Prerequisites
- A Cozystack cluster with at least one GPU-enabled node.
- kubectl installed and cluster access credentials configured.
1. Install the GPU Operator
Follow these steps:
Label the worker node explicitly for GPU passthrough workloads:
kubectl label node <node-name> --overwrite nvidia.com/gpu.workload.config=vm-passthrough
Enable the GPU Operator in your Platform Package by adding it to the enabled packages list:
kubectl patch packages.cozystack.io cozystack.cozystack-platform --type=json \
  -p '[{"op": "add", "path": "/spec/components/platform/values/bundles/enabledPackages/-", "value": "cozystack.gpu-operator"}]'
This will deploy the components (operands).
Ensure all pods are in a running state and all validations succeed with the sandbox-validator component:
kubectl get pods -n cozy-gpu-operator
Example output (your pod names may vary):
NAME                                           READY   STATUS    RESTARTS   AGE
...
nvidia-sandbox-device-plugin-daemonset-4mxsc   1/1     Running   0          40s
nvidia-sandbox-validator-vxj7t                 1/1     Running   0          40s
nvidia-vfio-manager-thfwf                      1/1     Running   0          78s
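The pod check above can also be scripted. The following is a minimal sketch, not part of Cozystack or the GPU Operator: not_running_pods is a hypothetical helper name, and it only inspects the STATUS column, so a pod that is Running but not yet ready would still pass.

```shell
# not_running_pods: hypothetical helper (an assumption, not a Cozystack tool).
# Reads `kubectl get pods --no-headers` output on stdin and prints the name
# and status of every pod that is neither Running nor Completed.
not_running_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage against a live cluster (namespace taken from the step above):
#   kubectl get pods -n cozy-gpu-operator --no-headers | not_running_pods
```

An empty result suggests that every operand has started; any printed line names a pod worth inspecting with kubectl describe pod.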
To verify the GPU binding, access the node using kubectl node-shell -n cozy-system -x or kubectl debug node and run:
lspci -nnk -d 10de:
The vfio-manager pod will bind all GPUs on the node to the vfio-pci driver. Example output:
3b:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:2236] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:1482]
        Kernel driver in use: vfio-pci
86:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:2236] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:1482]
        Kernel driver in use: vfio-pci
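On nodes with many GPUs, reading this output by eye is error-prone. As a rough sketch, the check can be automated by counting device lines and vfio-pci driver lines. The helper name check_vfio_bound is hypothetical, and the sketch assumes the usual lspci layout in which each device line starts with its PCI address.

```shell
# check_vfio_bound: hypothetical helper (an assumption, not an NVIDIA tool).
# Reads `lspci -nnk -d 10de:` output on stdin, prints a summary, and succeeds
# only if every listed device reports "Kernel driver in use: vfio-pci".
check_vfio_bound() {
  awk '
    /^[0-9a-fA-F:]+\.[0-7] /         { devices++ }  # device line, e.g. "3b:00.0 3D controller ..."
    /Kernel driver in use: vfio-pci/ { bound++ }
    END {
      printf "%d of %d NVIDIA devices bound to vfio-pci\n", bound, devices
      exit (devices > 0 && bound == devices) ? 0 : 1
    }
  '
}

# Usage on the node:
#   lspci -nnk -d 10de: | check_vfio_bound
```

A non-zero exit status means at least one GPU is still attached to another driver, or that no NVIDIA device was found at all.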
The sandbox-device-plugin will discover and advertise these resources to kubelet. In this example, the node shows two A10 GPUs as available resources:
kubectl describe node <node-name>
Example output:
...
Capacity:
  ...
  nvidia.com/GA102GL_A10: 2
  ...
Allocatable:
  ...
  nvidia.com/GA102GL_A10: 2
...
The resource name is derived from the device and device_name columns of the PCI IDs database.
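The naming rule can be approximated with a small shell helper. This is only a sketch: resource_name_from_pci_entry is a hypothetical name, and the transformation (drop the device ID, strip the brackets, join the words with underscores) is an assumption that reproduces the A10 name shown in this section. The actual plugin may normalize other characters differently, so always confirm against kubectl describe node.

```shell
# resource_name_from_pci_entry: hypothetical helper (an assumption, not the
# plugin's real implementation). Takes one PCI IDs device entry, such as
# "2236 GA102GL [A10]", and prints the resource name it would map to.
resource_name_from_pci_entry() {
  printf '%s\n' "$1" | sed -E \
    -e 's/^[[:space:]]*[0-9a-fA-F]{4}[[:space:]]+//' \
    -e 's/[][]//g' \
    -e 's/[[:space:]]+/_/g' \
    -e 's|^|nvidia.com/|'
}

# Usage:
#   resource_name_from_pci_entry "2236 GA102GL [A10]"
```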
For example, the database entry for A10 reads 2236 GA102GL [A10], which results in the resource name nvidia.com/GA102GL_A10.
2. KubeVirt is wired automatically
When cozystack.gpu-operator is in bundles.enabledPackages, Cozystack mirrors the chosen GPU variant into the KubeVirt Custom Resource for you. There is no kubectl edit kubevirt step.
Specifically, the platform injects:
- HostDevices into spec.configuration.developerConfiguration.featureGates (current KubeVirt splits this from the GPU gate; the admission webhook rejects domain.devices.hostDevices without it).
- A starter spec.configuration.permittedHostDevices.pciHostDevices table covering common NVIDIA datacenter GPUs: Hopper (H100, H200), Ada Lovelace (L4, L40, L40S), Ampere (A100 PCIe/SXM, A40, A30, A10), Turing (T4), Volta (V100, V100S). PCI vendor:device pairs are stable; resourceName slugs follow the <arch>_<model>_<form>_<mem> convention that nvidia-sandbox-device-plugin v25.x emits (e.g. nvidia.com/GA102GL_A10). externalResourceProvider: true is set on every entry because the resources are advertised by the sandbox plugin, not by KubeVirt’s in-tree device plugin.
Verify the resulting CR:
kubectl -n cozy-kubevirt get kubevirt kubevirt -o yaml \
| yq '.spec.configuration | {featureGates: .developerConfiguration.featureGates, permittedHostDevices: .permittedHostDevices}'
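If yq is not available, the feature gate can also be checked with kubectl's built-in JSONPath output, which prints the gate list as a JSON array. The sketch below assumes that output format; has_feature_gate is a hypothetical helper name, and the match is a plain string search rather than a full JSON parse.

```shell
# has_feature_gate: hypothetical helper (an assumption, not a KubeVirt tool).
# Succeeds if the gate named in $1 appears as a quoted string in the JSON
# array read from stdin, e.g. ["HostDevices","GPU"].
has_feature_gate() {
  grep -q "\"$1\""
}

# Usage (namespace and resource name taken from the command above):
#   kubectl -n cozy-kubevirt get kubevirt kubevirt \
#     -o jsonpath='{.spec.configuration.developerConfiguration.featureGates}' \
#     | has_feature_gate HostDevices && echo "HostDevices gate is enabled"
```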
Extending or replacing the NVIDIA defaults
If your cluster ships a GPU not in the default table, or your nvidia-sandbox-device-plugin version emits a different resourceName (check with kubectl describe node <node> | grep nvidia.com/), extend the defaults via platform values:
# Platform Package values
gpu:
  # Append (default): your entries land alongside the NVIDIA table.
  # Set to true to drop the NVIDIA table entirely (useful for non-NVIDIA-only
  # clusters or strict allowlists). With replaceDefaults: true and an empty
  # list below, the rendered CR carries no permittedHostDevices block at all
  # and the admission webhook rejects every GPU VM, so supply your own list.
  replaceDefaults: false
  permittedHostDevices:
    pciHostDevices:
      - pciVendorSelector: "10DE:2236"
        resourceName: nvidia.com/GA102GL_A10
        externalResourceProvider: true
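To find the vendor:device pair for a GPU that is missing from the default table, the value can be read from lspci on the node. The helper below is a sketch under assumptions: pci_vendor_selectors is a hypothetical name, it only looks for NVIDIA's vendor ID (10de), and it uppercases the pair to match the pciVendorSelector style used in the example above.

```shell
# pci_vendor_selectors: hypothetical helper (an assumption, not a Cozystack tool).
# Reads `lspci -nn -d 10de:` output on stdin and prints each distinct NVIDIA
# vendor:device pair in uppercase, one per line.
pci_vendor_selectors() {
  grep -oE '10de:[0-9a-f]{4}' | tr '[:lower:]' '[:upper:]' | sort -u
}

# Usage on the node (without -k, so subsystem IDs are not listed):
#   lspci -nn -d 10de: | pci_vendor_selectors
```

Pair each printed selector with the resourceName that kubectl describe node reports for that GPU.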
Manual Package-CR override path
If you opt out of bundle management and hand-craft a cozystack.gpu-operator Package CR directly (to apply overrides the bundle does not expose, such as driver settings, custom node selectors, or validator / dcgmExporter tweaks), the platform does NOT auto-wire HostDevices or permittedHostDevices into the KubeVirt CR. In that flow, mirror the bundle behaviour by also creating a cozystack.kubevirt Package CR with components.kubevirt.values.extraFeatureGates: [HostDevices] and the appropriate permittedHostDevices block. The manual Package-CR override path takes precedence over the bundle render whenever both exist.
3. Create a Virtual Machine
We are now ready to create a VM.
Create a sample virtual machine using the following VMI specification that requests the nvidia.com/GA102GL_A10 resource.
vmi-gpu.yaml:
---
apiVersion: apps.cozystack.io/v1alpha1
appVersion: '*'
kind: VirtualMachine
metadata:
  name: gpu
  namespace: tenant-example
spec:
  running: true
  instanceProfile: ubuntu
  instanceType: u1.medium
  systemDisk:
    image: ubuntu
    storage: 5Gi
    storageClass: replicated
  gpus:
    - name: nvidia.com/GA102GL_A10
  cloudInit: |
    #cloud-config
    password: ubuntu
    chpasswd: { expire: False }
kubectl apply -f vmi-gpu.yaml
Example output:
virtualmachines.apps.cozystack.io/gpu created
Verify the VM status:
kubectl get vmi
NAME                  AGE   PHASE     IP             NODENAME        READY
virtual-machine-gpu   73m   Running   10.244.3.191   luc-csxhk-002   True
Log in to the VM and confirm that it has access to the GPU:
virtctl console virtual-machine-gpu
Example output:
Successfully connected to vmi-gpu console. The escape sequence is ^]
vmi-gpu login: ubuntu
Password:
ubuntu@virtual-machine-gpu:~$ lspci -nnk -d 10de:
08:00.0 3D controller [0302]: NVIDIA Corporation GA102GL [A10] [10de:26b9] (rev a1)
        Subsystem: NVIDIA Corporation GA102GL [A10] [10de:1851]
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nvidia_drm, nvidia
GPU Sharing for Virtual Machines
GPU passthrough assigns an entire physical GPU to a single VM. To share one GPU between multiple VMs, you need NVIDIA vGPU.
vGPU (Virtual GPU)
NVIDIA vGPU uses mediated devices (mdev) to create virtual GPUs assignable to VMs. This is the only production-ready solution for GPU sharing between VMs.
Requirements:
- NVIDIA vGPU license (commercial, purchased from NVIDIA)
- NVIDIA vGPU Manager installed on host nodes
Open-Source vGPU (Experimental)
NVIDIA is developing open-source vGPU support for the Linux kernel. Once merged, this could enable GPU sharing without a license.
- Status: RFC stage, not merged into mainline kernel
- Supports Ada Lovelace and newer (L4, L40, etc.)
- References: Phoronix announcement, kernel patches