This document aims to help users who are not familiar with the metrics exposed by the KubeVirt components. All metrics documented here are auto-generated in each component repository and gathered here. They reflect and describe exactly what is being exposed.
The number of allocatable nodes in the cluster. Type: Gauge.
The total number of requests to deprecated KubeVirt APIs. Type: Counter.
Indicates whether the Software Emulation is enabled in the configuration. Type: Gauge.
Number of active console connections, broken down by namespace and VMI name. Type: Gauge.
Version information. Type: Gauge.
The delta between the highest memory working set or RSS of a pod and its requested memory, for each of the containers virt-controller, virt-handler, virt-api and virt-operator. Type: Gauge.
The number of nodes in the cluster that have the devices.kubevirt.io/kvm resource available. Type: Gauge.
The number of VMs in the cluster by namespace. Type: Gauge.
Number of active portforward tunnels, broken down by namespace and VMI name. Type: Gauge.
Client side rate limiter latency in seconds. Broken down by verb and URL. Type: Histogram.
Request latency in seconds. Broken down by verb and URL. Type: Histogram.
Number of HTTP requests, partitioned by status code, method, and host. Type: Counter.
Number of active USB redirection connections, broken down by namespace and VMI name. Type: Gauge.
The number of virt-api pods that are up. Type: Gauge.
Indication for an operating virt-controller. Type: Gauge.
The number of virt-controller pods that are ready. Type: Gauge.
Indication for a virt-controller that is ready to take the lead. Type: Gauge.
The number of virt-controller pods that are up. Type: Gauge.
The number of virt-handler pods that are up. Type: Gauge.
The number of virt-operator pods that are leading. Type: Gauge.
Indication for an operating virt-operator. Type: Gauge.
The number of virt-operator pods that are ready. Type: Gauge.
Indication for a virt-operator that is ready to take the lead. Type: Gauge.
The number of virt-operator pods that are up. Type: Gauge.
The current available memory of the VM containers based on the rss. Type: Gauge.
The current available memory of the VM containers based on the working set. Type: Gauge.
The total number of VMs created by namespace and virt-api pod, since install. Type: Counter.
The total number of VMs created by namespace, since install. Type: Counter.
Allocated disk size of a Virtual Machine in bytes, based on its PersistentVolumeClaim. Includes persistentvolumeclaim (PVC name), volume_mode (disk presentation mode: Filesystem or Block), and device (disk name). Type: Gauge.
Virtual Machine last transition timestamp to error status. Type: Counter.
Information about Virtual Machines. Type: Gauge.
Virtual Machine last transition timestamp to migrating status. Type: Counter.
Virtual Machine last transition timestamp to paused/stopped status. Type: Counter.
Resources limits by Virtual Machine. Reports memory and CPU limits. Type: Gauge.
Resources requested by Virtual Machine. Reports memory and CPU requests. Type: Gauge.
Virtual Machine last transition timestamp to running status. Type: Counter.
Virtual Machine last transition timestamp to starting status. Type: Counter.
Total CPU time spent in system mode. Type: Counter.
Total CPU time spent in all modes (sum of both vcpu and hypervisor usage). Type: Counter.
Total CPU time spent in user mode. Type: Counter.
Total VM filesystem capacity in bytes. Type: Gauge.
Used VM filesystem capacity in bytes. Type: Gauge.
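The two filesystem gauges above (total and used capacity in bytes) are typically combined into a usage ratio. A minimal sketch, assuming the two values have already been read for the same filesystem; the function name and the sample numbers are hypothetical:

```python
def filesystem_used_ratio(used_bytes: float, capacity_bytes: float) -> float:
    """Return the used fraction (0.0 to 1.0) of a VM filesystem.

    Both arguments are assumed to be gauge samples in bytes taken for the
    same filesystem. A non-positive capacity yields 0.0 instead of raising.
    """
    if capacity_bytes <= 0:
        return 0.0
    return used_bytes / capacity_bytes


# Hypothetical sample values: 25 GiB used out of 100 GiB.
ratio = filesystem_used_ratio(25 * 1024**3, 100 * 1024**3)
print(f"{ratio:.0%} used")
```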
Information about VirtualMachineInstances. Type: Gauge.
Virtual Machine Instance last API connection timestamp, including VNC, console, portforward, SSH and usbredir connections. Type: Gauge.
Estimation of the memory amount required for virt-launcher’s infrastructure components (e.g. libvirt, QEMU). Type: Gauge.
Current balloon size in bytes. Type: Gauge.
Amount of usable memory as seen by the domain. This value may not be accurate if a balloon driver is in use or if the guest OS does not initialize all assigned pages. Type: Gauge.
The amount of memory that is being used to cache I/O and is available to be reclaimed, corresponds to the sum of Buffers + Cached + SwapCached in /proc/meminfo. Type: Gauge.
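To illustrate the sum described above, here is a minimal sketch that adds the Buffers, Cached and SwapCached fields from /proc/meminfo-style text. It is not how KubeVirt collects the value; the function name and the sample figures are hypothetical, and the result is in kB, the unit /proc/meminfo uses:

```python
def cached_memory_kb(meminfo_text: str) -> int:
    """Sum Buffers + Cached + SwapCached (in kB) from /proc/meminfo-style text."""
    wanted = {"Buffers", "Cached", "SwapCached"}
    total = 0
    for line in meminfo_text.splitlines():
        # Each line looks like "Cached:   2000000 kB".
        name, _, rest = line.partition(":")
        if name.strip() in wanted:
            total += int(rest.split()[0])
    return total


# Hypothetical sample; inside a guest you would read /proc/meminfo instead.
SAMPLE = """MemTotal:        8000000 kB
MemFree:         1000000 kB
Buffers:          100000 kB
Cached:          2000000 kB
SwapCached:         5000 kB
"""

print(cached_memory_kb(SAMPLE))  # 100000 + 2000000 + 5000 = 2105000
```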
The amount of memory in bytes allocated to the domain. The memory value in domain xml file. Type: Gauge.
The number of page faults when disk IO was required. Page faults occur when a process makes a valid access to virtual memory that is not available. When servicing the page fault, if disk IO is required, it is considered a major fault. Type: Counter.
The number of other page faults, when disk IO was not required. Page faults occur when a process makes a valid access to virtual memory that is not available. When servicing the page fault, if disk IO is NOT required, it is considered a minor fault. Type: Counter.
Resident set size of the process running the domain. Type: Gauge.
The total amount of data read from swap space of the guest in bytes. Type: Gauge.
The total amount of memory written out to swap space of the guest in bytes. Type: Gauge.
The amount of memory left completely unused by the system. Memory that is available but used for reclaimable caches should NOT be reported as free. Type: Gauge.
The amount of memory which can be reclaimed by balloon without pushing the guest system to swap, corresponds to ‘Available’ in /proc/meminfo. Type: Gauge.
Amount of used memory as seen by the domain. Type: Gauge.
The total Guest OS data processed and migrated to the new VM. Type: Gauge.
The remaining guest OS data to be migrated to the new VM. Type: Gauge.
The rate of memory being dirty in the Guest OS. Type: Gauge.
The rate at which the memory is being transferred. Type: Gauge.
Indicates if the VMI migration failed. Type: Gauge.
Histogram of VM migration phase transitions duration from creation time in seconds. Type: Histogram.
Indicates if the VMI migration succeeded. Type: Gauge.
Number of current pending migrations. Type: Gauge.
Number of current running migrations. Type: Gauge.
Number of current scheduling migrations. Type: Gauge.
Total network traffic received in bytes. Type: Counter.
Total network received error packets. Type: Counter.
The total number of rx packets dropped on vNIC interfaces. Type: Counter.
Total network traffic received packets. Type: Counter.
[Deprecated] Total number of bytes sent and received. Type: Counter.
Total network traffic transmitted in bytes. Type: Counter.
Total network transmitted error packets. Type: Counter.
The total number of tx packets dropped on vNIC interfaces. Type: Counter.
Total network traffic transmitted packets. Type: Counter.
Number of VMI CPU affinities to node physical cores. Type: Gauge.
Indication for a VirtualMachine that its eviction strategy is set to Live Migration but is not migratable. Type: Gauge.
Indication for the total number of VirtualMachineInstance workloads that are not running within the most up-to-date version of the virt-launcher environment. Type: Gauge.
Sum of VMIs per phase and node. phase can be one of the following: [Pending, Scheduling, Scheduled, Running, Succeeded, Failed, Unknown]. Type: Gauge.
Histogram of VM phase transitions duration from creation time in seconds. Type: Histogram.
Histogram of VM phase transitions duration from deletion time in seconds. Type: Histogram.
Histogram of VM phase transitions duration between different phases in seconds. Type: Histogram.
The addresses of a VirtualMachineInstance. This metric provides the address of an available network interface associated with the VMI in the ‘address’ label, and the type of address, such as internal IP, in the ‘type’ label. Type: Gauge.
Total storage flush requests. Type: Counter.
Total time spent on cache flushing. Type: Counter.
Total number of I/O read operations. Type: Counter.
Total number of I/O write operations. Type: Counter.
Total time spent on read operations. Type: Counter.
Total number of bytes read from storage. Type: Counter.
Total time spent on write operations. Type: Counter.
Total number of written bytes. Type: Counter.
Amount of time spent by each vcpu waiting in the queue instead of running. Type: Counter.
Total amount of time spent in each state by each vcpu (cpu_time excluding hypervisor time). Where id is the vcpu identifier and state can be one of the following: [OFFLINE, RUNNING, BLOCKED]. Type: Counter.
Amount of time spent by each vcpu while waiting on I/O. Type: Counter.
Returns the total number of virtual machine disks restored from the source virtual machine. Type: Gauge.
Returns the amount of space in bytes restored from the source virtual machine. Type: Gauge.
Returns the labels of the persistent volume claims that are used for restoring virtual machines. Type: Gauge.
Returns the timestamp of successful virtual machine snapshot. Type: Gauge.
Number of active VNC connections, broken down by namespace and VMI name. Type: Gauge.
The number of CDI clone pods with high restart count. Type: Gauge.
The clone progress in percentage. Type: Counter.
CDI install ready. Type: Gauge.
DataImportCron has an outdated import. Type: Gauge.
Number of DataVolumes pending for default storage class to be configured. Type: Gauge.
The number of CDI import pods with high restart count. Type: Gauge.
The import progress in percentage. Type: Counter.
Progress of volume population. Type: Counter.
CDI operator status. Type: Gauge.
Progress of volume population. Type: Counter.
StorageProfiles info labels: storageclass, provisioner, complete (indicates if all storage profiles recommended PVC settings are complete), default (indicates if it’s the Kubernetes default storage class), virtdefault (indicates if it’s the default virtualization storage class), rwx (indicates if the storage class supports ReadWriteMany), smartclone (indicates if it supports snapshot or CSI based clone), degraded (indicates it is not optimal for virtualization). Type: Gauge.
The number of CDI upload server pods with high restart count. Type: Gauge.
Total count of KubeMacPool manager pods deployed by CNAO CR. Type: Gauge.
KubeMacPool is deployed by CNAO CR. Type: Gauge.
CNAO CR Ready. Type: Gauge.
Total count of duplicate KubeMacPool MAC addresses. Type: Gauge.
Total count of running KubeMacPool manager pods. Type: Gauge.
Total count of running CNAO operators. Type: Gauge.
The total number of running VMIs, labeled with node, instance type, preference and guest OS information. Type: Gauge.
The increase in the number of common templates restored by the operator back to their original state, over the last hour. Type: Gauge.
The total number of common templates restored by the operator back to their original state. Type: Counter.
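The hourly-increase gauge and the total counter above are related: the former is the growth of the latter over a one-hour window. A simplified sketch of that calculation over raw counter samples follows. It is an assumption for illustration, not the recording rule the operator uses, and unlike the Prometheus increase() function it does not extrapolate to the window boundaries; the function name and sample values are hypothetical:

```python
from typing import Sequence


def counter_increase(samples: Sequence[float]) -> float:
    """Return how much a counter grew across time-ordered samples.

    A drop between two consecutive samples is treated as a counter reset
    (for example a pod restart), so the new value is counted from zero.
    """
    increase = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            increase += cur - prev
        else:
            increase += cur  # counter restarted from zero
    return increase


# Hypothetical samples within one hour; the counter resets after the third one.
print(counter_increase([3, 5, 5, 1, 4]))  # 2 + 0 + 1 + 3 = 6.0
```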
Set to 1 if the reconcile process of all operands completes with no errors, and to 0 otherwise. Type: Gauge.
The total number of ssp-operator pods reconciling with no errors. Type: Gauge.
The total number of running ssp-operator pods. Type: Gauge.
The increase in the number of rejected template validators, over the last hour. Type: Gauge.
The total number of rejected template validators. Type: Counter.
The total number of running virt-template-validator pods. Type: Gauge.
[ALPHA] VM with RBD mounted Block volume (without rxbounce option set). Type: Gauge.
HPP CR Ready. Type: Gauge.
The number of running hostpath-provisioner-operator pods. Type: Gauge.
Indicates that the HPP pool path shares a filesystem with the OS. Fix this to prevent HPP PVs from causing disk pressure and affecting node operation. Type: Gauge.
Sum of CPU core requests for all running virt-launcher VMIs across the entire KubeVirt cluster. Type: Gauge.
Monitors resources for potential problems. Type: Gauge.
Indicates whether the HyperConverged custom resource exists (1) or not (0). Type: Gauge.
Indicates whether the optional descheduler is misconfigured for use with KubeVirt (1) or not (0). Type: Gauge.
Count of out-of-band modifications overwritten by HCO. Type: Counter.
Indicates whether the underlying cluster is single stack IPv6 (1) or not (0). Type: Gauge.
Indicates whether the system health status is healthy (0), warning (1), or error (2), by aggregating the conditions of HCO and its secondary resources. Type: Gauge.
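The numeric encoding described above (0 healthy, 1 warning, 2 error) can be turned into a readable label when consuming the gauge. A minimal sketch; the mapping follows the description, while the function name and the "unknown" fallback are assumptions made for illustration:

```python
# Value-to-label mapping taken from the metric description above.
HEALTH_STATUS = {0: "healthy", 1: "warning", 2: "error"}


def describe_health(value: float) -> str:
    """Translate a system health gauge sample into a label.

    Values outside the documented range map to "unknown" (an assumption;
    the metric description does not define any other values).
    """
    return HEALTH_STATUS.get(int(value), "unknown")


print(describe_health(1.0))  # warning
```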
Count of unsafe modifications in the HyperConverged annotations. Type: Gauge.
Indicates whether HCO and its secondary resources health status is healthy (0), warning (1) or critical (2), based both on the firing alerts that impact the operator health, and on kubevirt_hco_system_health_status metric. Type: Gauge.