Monitoring KubeVirt VMs from the inside
This blog post will guide you on how to monitor KubeVirt Linux-based VirtualMachines with the Prometheus node-exporter. Since node_exporter runs inside the VM and exposes metrics at an HTTP endpoint, you can also follow this guide to expose custom applications that serve metrics in the Prometheus format.
Environment
The following tools will be used in this guide:
- Helm v3 - To deploy the Prometheus-Operator.
- minikube - Will provide us with a k8s cluster; you are free to choose any other k8s provider, though.
- kubectl - To deploy different k8s resources
- virtctl - To interact with KubeVirt VirtualMachines; it can be downloaded from the KubeVirt releases page.
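If you still need to set these up, the sketch below shows one possible way; the minikube sizing and the virtctl version are assumptions, so adjust them to match the KubeVirt version you will deploy:
# Start a local cluster (sizes are only an example)
minikube start --cpus 4 --memory 8192
# Download virtctl from the KubeVirt releases page (pick the version matching your KubeVirt)
export VIRTCTL_VERSION=v0.41.0
curl -L -o virtctl https://github.com/kubevirt/kubevirt/releases/download/${VIRTCTL_VERSION}/virtctl-${VIRTCTL_VERSION}-linux-amd64
chmod +x virtctl && sudo mv virtctl /usr/local/bin/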
Deploy Prometheus Operator
Once you have your k8s cluster, with minikube or any other provider, the first step will be to deploy the Prometheus Operator. The reason is that the KubeVirt CR, when installed on the cluster, will detect if the ServiceMonitor CR already exists. If it does, then it will create ServiceMonitors configured to monitor all the KubeVirt components (virt-controller, virt-api, and virt-handler) out-of-the-box.
Although monitoring KubeVirt itself is not covered in this guide, it is a good practice to always deploy the Prometheus Operator before deploying KubeVirt.
To deploy the Prometheus Operator, you will need to create its namespace first, e.g. monitoring:
kubectl create ns monitoring
Then deploy the operator in the new namespace:
helm fetch stable/prometheus-operator
tar xzf prometheus-operator*.tgz
cd prometheus-operator/ && helm install -n monitoring -f values.yaml kubevirt-prometheus stable/prometheus-operator
After everything is deployed, you can delete everything that was downloaded by helm:
cd ..
rm -rf prometheus-operator*
One thing to keep in mind is the release name we used here: kubevirt-prometheus. The release name will be used when declaring our ServiceMonitor later on.
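Before moving on, a quick sanity check (a sketch; the exact pod names depend on the chart version) confirms that the operator is running and that the ServiceMonitor CRD, which KubeVirt will look for, is registered:
# Operator, Prometheus and Grafana pods should come up in the monitoring namespace
kubectl get pods -n monitoring
# KubeVirt checks for this CRD before creating its own ServiceMonitors
kubectl get crd servicemonitors.monitoring.coreos.com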
Deploy KubeVirt Operators and KubeVirt CustomResources
Alright, the next step will be deploying KubeVirt itself. We will start with its operator.
We will fetch the latest version, then use kubectl create to deploy the manifest directly from GitHub:
export KUBEVIRT_VERSION=$(curl -s https://api.github.com/repos/kubevirt/kubevirt/releases | grep tag_name | grep -v -- - | sort -V | tail -1 | awk -F':' '{print $2}' | sed 's/,//' | xargs)
kubectl create -f https://github.com/kubevirt/kubevirt/releases/download/${KUBEVIRT_VERSION}/kubevirt-operator.yaml
Before deploying the KubeVirt CR, make sure that all virt-operator replicas are ready; you can do that with:
kubectl rollout status -n kubevirt deployment virt-operator
After that, we can deploy KubeVirt and wait for all its components to become ready in a similar manner:
kubectl create -f https://github.com/kubevirt/kubevirt/releases/download/${KUBEVIRT_VERSION}/kubevirt-cr.yaml
kubectl rollout status -n kubevirt deployment virt-api
kubectl rollout status -n kubevirt deployment virt-controller
kubectl rollout status -n kubevirt daemonset virt-handler
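Alternatively, on recent KubeVirt versions you can wait for the KubeVirt CR itself to report that it is fully available, which covers all components at once:
# Blocks until the KubeVirt CR reports the Available condition
kubectl -n kubevirt wait kv kubevirt --for condition=Available --timeout=300s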
If we want to monitor VMs across restarts, our node-exporter installation needs to be persisted and, thus, we need to set up persistent storage for the VM disks. CDI will be the component responsible for that, so we will deploy its operator and custom resource as well, as always waiting for the right components to get ready before proceeding:
export CDI_VERSION=$(curl -s https://github.com/kubevirt/containerized-data-importer/releases/latest | grep -o "v[0-9]\.[0-9]*\.[0-9]*")
kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$CDI_VERSION/cdi-operator.yaml
kubectl rollout status -n cdi deployment cdi-operator
kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$CDI_VERSION/cdi-cr.yaml
kubectl rollout status -n cdi deployment cdi-apiserver
kubectl rollout status -n cdi deployment cdi-uploadproxy
kubectl rollout status -n cdi deployment cdi-deployment
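As an optional check (a sketch; the CDI custom resource created by cdi-cr.yaml is typically named cdi), you can confirm that CDI finished deploying:
# The CDI custom resource should eventually report the "Deployed" phase
kubectl get cdi cdi -o jsonpath='{.status.phase}{"\n"}'
kubectl get pods -n cdi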
Deploying a VirtualMachine with persistent storage
Alright, cool. We have everything we need now. Let’s set up the VM.
We will start with the PersistentVolumes required by CDI’s DataVolume resource. Since I’m using minikube with no dynamic storage provider, I’ll be creating 2 PVs with a reference to the PVCs that will claim them. Notice the claimRef in each of the PVs.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-volume
spec:
  storageClassName: ""
  claimRef:
    namespace: default
    name: cirros-dv
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 2Gi
  hostPath:
    path: /data/example-volume/
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-volume-scratch
spec:
  storageClassName: ""
  claimRef:
    namespace: default
    name: cirros-dv-scratch
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 2Gi
  hostPath:
    path: /data/example-volume-scratch/
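Assuming the two PersistentVolumes above were saved to a file named pvs.yaml (the filename is just an example), create them and check that they show up as Available:
kubectl create -f pvs.yaml
kubectl get pv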
With the persistent storage in place, we can create our VM with the following manifest:
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: monitorable-vm
spec:
  runStrategy: Always
  template:
    metadata:
      name: monitorable-vm
      labels:
        prometheus.kubevirt.io: "node-exporter"
    spec:
      domain:
        resources:
          requests:
            memory: 1024Mi
        devices:
          disks:
          - disk:
              bus: virtio
            name: my-data-volume
      volumes:
      - dataVolume:
          name: cirros-dv
        name: my-data-volume
  dataVolumeTemplates:
  - metadata:
      name: "cirros-dv"
    spec:
      source:
        http:
          url: "https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img"
      pvc:
        storageClassName: ""
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: "2Gi"
Notice that KubeVirt’s VirtualMachine resource has a VirtualMachine template and a dataVolumeTemplate. In the VirtualMachine template, it is important to notice that we named our VM monitorable-vm, and we will use this name to connect to its console with virtctl later on. The label we’ve added, prometheus.kubevirt.io: "node-exporter", is also important, since we’ll use it when configuring Prometheus to scrape the VM’s node-exporter.
In the dataVolumeTemplate, it is important to notice that we named the PVC cirros-dv; the DataVolume resource will create 2 PVCs from it, cirros-dv and cirros-dv-scratch. Notice that cirros-dv and cirros-dv-scratch are the names referenced in our PersistentVolume manifests. The names must match for this to work.
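Assuming the VirtualMachine manifest was saved as monitorable-vm.yaml (again, the filename is an example), create it and follow the DataVolume import until the VirtualMachineInstance boots:
kubectl create -f monitorable-vm.yaml
# Watch the CDI import and the two PVCs it claims
kubectl get dv,pvc
# Once the import finishes, the VirtualMachineInstance and its virt-launcher pod come up
kubectl get vmi
kubectl get pods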
Installing the node-exporter inside the VM
Once the VirtualMachineInstance is running, we can connect to its console using virtctl console monitorable-vm. If a user and password are required, provide your credentials accordingly. If you are using the same disk image from this guide, the user and password are cirros and gocubsgo, respectively.
The following script will install node-exporter and configure the VM to always start the exporter when booting:
curl -LO -k https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-amd64.tar.gz
gunzip -c node_exporter-1.0.1.linux-amd64.tar.gz | tar xopf -
./node_exporter-1.0.1.linux-amd64/node_exporter &
sudo /bin/sh -c 'cat > /etc/rc.local <<EOF
#!/bin/sh
echo "Starting up node_exporter at :9100!"
/home/cirros/node_exporter-1.0.1.linux-amd64/node_exporter 2>&1 > /dev/null &
EOF'
sudo chmod +x /etc/rc.local
P.S.: If you are using a different base image, please configure node-exporter to start at boot time accordingly.
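Still inside the VM’s console, a quick check confirms the exporter is serving metrics; this is a sketch that assumes the default node-exporter port and uses busybox wget, which the cirros image ships (use curl if your image provides it):
# From inside the VM: node_exporter should answer on its default port 9100
wget -qO- http://localhost:9100/metrics | head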
Configuring Prometheus to scrape the VM’s node-exporter
Configuring Prometheus to scrape the node-exporter (or any other application) is really simple. All we need to do is create a new Service and a ServiceMonitor:
apiVersion: v1
kind: Service
metadata:
  name: monitorable-vm-node-exporter
  labels:
    prometheus.kubevirt.io: "node-exporter"
spec:
  ports:
  - name: metrics
    port: 9100
    targetPort: 9100
    protocol: TCP
  selector:
    prometheus.kubevirt.io: "node-exporter"
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubevirt-node-exporters-servicemonitor
  namespace: monitoring
  labels:
    prometheus.kubevirt.io: "node-exporter"
    release: monitoring
spec:
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      prometheus.kubevirt.io: "node-exporter"
  endpoints:
  - port: metrics
    interval: 15s
Let’s break this down just to make sure we set up everything right. Starting with the Service:
spec:
  ports:
  - name: metrics
    port: 9100
    targetPort: 9100
    protocol: TCP
  selector:
    prometheus.kubevirt.io: "node-exporter"
In the specification, we are creating a new port named metrics that will be redirected to every pod labeled with prometheus.kubevirt.io: "node-exporter", at port 9100, which is the default port number for the node-exporter.
apiVersion: v1
kind: Service
metadata:
  name: monitorable-vm-node-exporter
  labels:
    prometheus.kubevirt.io: "node-exporter"
We are also labeling the Service itself with prometheus.kubevirt.io: "node-exporter", which will be used by the ServiceMonitor object.
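A quick, optional way to check that the Service actually resolved the virt-launcher pod is to inspect its Endpoints object; it should list the pod’s IP on port 9100:
kubectl get endpoints monitorable-vm-node-exporter -o wide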
Now let’s take a look at our ServiceMonitor specification:
spec:
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      prometheus.kubevirt.io: "node-exporter"
  endpoints:
  - port: metrics
    interval: 15s
Since our ServiceMonitor will be deployed in the monitoring namespace, but our Service lives in the default namespace, we need namespaceSelector.any=true.
We are also telling our ServiceMonitor that Prometheus needs to scrape endpoints from Services labeled with prometheus.kubevirt.io: "node-exporter" whose port is named metrics. Luckily, that’s exactly what we did with our Service!
One last thing to keep an eye on. Prometheus configuration can be set up to watch multiple ServiceMonitors. We can see which ServiceMonitors our Prometheus is watching with the following command:
# Look for Service Monitor Selector
kubectl describe -n monitoring prometheuses.monitoring.coreos.com monitoring-prometheus-oper-prometheus
Make sure our ServiceMonitor has all the labels required by Prometheus’ Service Monitor Selector. One common selector is the release name that we’ve set when deploying our Prometheus with helm!
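If you want to double-check that selector, something along these lines works; the Prometheus resource name is the one used in the describe command above and may differ in your cluster:
# Print the serviceMonitorSelector of the Prometheus instance
kubectl get -n monitoring prometheuses.monitoring.coreos.com monitoring-prometheus-oper-prometheus -o jsonpath='{.spec.serviceMonitorSelector}{"\n"}'
# Compare it against the labels on our ServiceMonitor
kubectl get servicemonitor -n monitoring kubevirt-node-exporters-servicemonitor --show-labels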
Testing
You can do a quick test by port-forwarding the Prometheus web UI and executing some PromQL:
kubectl port-forward -n monitoring prometheus-monitoring-prometheus-oper-prometheus-0 9090:9090
To make sure everything is working, access localhost:9090/graph and execute the PromQL query up{pod=~"virt-launcher.*"}. Prometheus should return data that is being collected from monitorable-vm’s node-exporter.
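The same query can also be issued against the Prometheus HTTP API through the port-forward, which is handy for scripting (a sketch, assuming the port-forward above is still running):
curl -sG 'http://localhost:9090/api/v1/query' --data-urlencode 'query=up{pod=~"virt-launcher.*"}'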
You can play around with virtctl, stopping and starting the VM to see how the metrics behave. You will notice that when stopping the VM with virtctl stop monitorable-vm, the VirtualMachineInstance is killed and, thus, so is its pod. As a result, our Service will no longer be able to find the pod’s endpoint, and the target will be removed from Prometheus’ targets.
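For example, using the commands already introduced in this guide:
# Stopping the VM removes the virt-launcher pod, and with it the scrape target
virtctl stop monitorable-vm
kubectl get vmi
# Starting it again brings the target back once node_exporter is running
virtctl start monitorable-vm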
With this behavior, alerts like the one below won’t work since our target is literally gone, not down.
- alert: KubeVirtVMDown
  expr: up{pod=~"virt-launcher.*"} == 0
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: KubeVirt VM {{ $labels.pod }} is down.
BUT, if the VM is constantly crashing without being stopped, the pod won’t be killed and the target will still be monitored. Node-exporter will never start or will go down constantly alongside the VM, so an alert like this might work:
- alert: KubeVirtVMCrashing
  expr: up{pod=~"virt-launcher.*"} == 0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: KubeVirt VM {{ $labels.pod }} is constantly crashing before node-exporter starts at boot.
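As a side note, for the Prometheus Operator to actually load rules like these, they must be wrapped in a PrometheusRule object whose labels match the Prometheus ruleSelector; the sketch below assumes the same release label we used on the ServiceMonitor and applies the rule inline:
cat <<'EOF' | kubectl apply -n monitoring -f -
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kubevirt-vm-alerts
  labels:
    release: monitoring
spec:
  groups:
  - name: kubevirt-vms
    rules:
    - alert: KubeVirtVMCrashing
      expr: up{pod=~"virt-launcher.*"} == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: KubeVirt VM {{ $labels.pod }} is constantly crashing before node-exporter starts at boot.
EOF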
Conclusion
In this blog post we used node-exporter to expose metrics out of a KubeVirt VM, and we configured the Prometheus Operator to collect them. This illustrates how to bring Kubernetes monitoring best practices to applications running inside KubeVirt VMs.