KVM Using Device Plugins
As of Kubernetes 1.10, the Device Plugins API is in beta! KubeVirt now uses this framework to provide hardware acceleration and network devices to virtual machines. The motivation is that virt-launcher pods should not be responsible for creating their own device nodes. Put another way: virt-launcher pods no longer require excess privileges just to create device nodes.
Kubernetes Device Plugin Basics
Device Plugins consist of two main parts: a server that provides devices and pods that consume them. Each plugin server is used to share a preconfigured list of devices local to the node with pods scheduled on that node. Kubernetes marks each node with the devices it’s capable of sharing, and uses the presence of such devices when scheduling pods.
Device Plugins In KubeVirt
Providing Devices
In KubeVirt, virt-handler takes on the role of the device plugin server. When it starts up on a node, it registers with the Kubernetes Device Plugin API and advertises KVM and TUN devices, which then show up in that node's status:
apiVersion: v1
kind: Node
metadata:
  ...
spec:
  ...
status:
  allocatable:
    cpu: "2"
    devices.kubevirt.io/kvm: "110"
    devices.kubevirt.io/tun: "110"
    pods: "110"
    ...
  capacity:
    cpu: "2"
    devices.kubevirt.io/kvm: "110"
    devices.kubevirt.io/tun: "110"
    pods: "110"
    ...
In this case, advertising 110 KVM or TUN devices is simply an arbitrary default based on the number of pods the node is limited to.
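To make the "110 devices" figure concrete, here is a minimal Python sketch of how a plugin could build the list it advertises for one resource. It is only a model: the real Device Plugin API is gRPC-based, and the function name `build_device_list`, the ID format, and the `MAX_PODS` constant are assumptions made for illustration, not KubeVirt's actual code.

```python
# Illustrative model only; the names and the ID format are hypothetical.
MAX_PODS = 110  # assumed per-node pod limit, matching the node status above


def build_device_list(resource: str, count: int = MAX_PODS) -> list:
    """Return `count` placeholder entries to advertise for one resource name."""
    short_name = resource.rsplit("/", 1)[-1]  # "devices.kubevirt.io/kvm" -> "kvm"
    return [{"id": f"{short_name}{i}", "health": "Healthy"} for i in range(count)]


kvm_devices = build_device_list("devices.kubevirt.io/kvm")
print(len(kvm_devices))      # 110
print(kvm_devices[0]["id"])  # kvm0
```

Each entry is just a schedulable slot, which is why tying the count to the pod limit is a convenient, if arbitrary, choice.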
Consuming Devices
Now any pod that requests a devices.kubevirt.io/kvm or devices.kubevirt.io/tun device can only be scheduled on nodes that provide them. On clusters where KubeVirt is deployed, this conveniently happens to be all nodes in the cluster that have these physical devices, which normally means all nodes in the cluster.
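The scheduling rule described here can be sketched as a simple filter. This is a toy model, not the real Kubernetes scheduler: it ignores resources already consumed by other pods, treats quantities as plain integers, and the node names and the helper `node_fits` are hypothetical.

```python
# Toy model of resource-based node filtering; not the real scheduler.

def node_fits(allocatable: dict, requests: dict) -> bool:
    """True if the node offers at least the requested amount of every resource."""
    return all(int(allocatable.get(name, "0")) >= int(amount)
               for name, amount in requests.items())


nodes = {
    # Node running virt-handler: advertises both device resources.
    "node-with-kubevirt": {"devices.kubevirt.io/kvm": "110",
                           "devices.kubevirt.io/tun": "110"},
    # Hypothetical node without the plugin: no device resources at all.
    "node-without-plugin": {},
}
pod_requests = {"devices.kubevirt.io/kvm": "1", "devices.kubevirt.io/tun": "1"}

eligible = [name for name, alloc in nodes.items() if node_fits(alloc, pod_requests)]
print(eligible)  # ['node-with-kubevirt']
```

A node that never registered the plugin simply has no entry for the resource, so it is filtered out without any extra labeling or node selectors.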
Here’s an excerpt of what the pod spec looks like in this case.
apiVersion: v1
kind: Pod
metadata:
  ...
spec:
  containers:
  - command:
    - /entrypoint.sh
    ...
    name: compute
    ...
    resources:
      limits:
        devices.kubevirt.io/kvm: "1"
        devices.kubevirt.io/tun: "1"
      requests:
        devices.kubevirt.io/kvm: "1"
        devices.kubevirt.io/tun: "1"
        memory: "161679432"
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
      privileged: false
      runAsUser: 0
  ...
Of special note is the securityContext stanza. The only special privilege required is the NET_ADMIN capability! This is needed by libvirt to set up the domain’s networking stack.
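As a quick way to check that claim against a container spec, the following Python sketch flags anything beyond NET_ADMIN. The helper `extra_privileges` is hypothetical, and it deliberately looks only at the privileged flag and the added capabilities, using the field names from the pod spec excerpt above.

```python
# Hypothetical audit helper; field names follow the pod spec excerpt above.

def extra_privileges(container: dict) -> list:
    """List anything in securityContext beyond the NET_ADMIN capability."""
    ctx = container.get("securityContext", {})
    findings = []
    if ctx.get("privileged", False):
        findings.append("privileged")
    added = ctx.get("capabilities", {}).get("add", [])
    findings.extend(cap for cap in added if cap != "NET_ADMIN")
    return findings


compute = {
    "name": "compute",
    "securityContext": {
        "capabilities": {"add": ["NET_ADMIN"]},
        "privileged": False,
        "runAsUser": 0,
    },
}
print(extra_privileges(compute))  # []
```

An empty result means the container asks for nothing more than the excerpt shows.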