LowReadyVirtControllersCount

Meaning

This alert fires when one or more virt-controller pods are running, but none of these pods has been in the Ready state for the last 5 minutes.

A virt-controller device monitors the custom resource definitions (CRDs) of a virtual machine instance (VMI) and manages the associated pods. The device creates pods for VMIs and manages the lifecycle of the pods. The device is critical for cluster-wide virtualization functionality.

Impact

This alert indicates that a cluster-level failure might occur, which would cause actions related to VM lifecycle management to fail. This notably includes launching a new VMI or shutting down an existing VMI.

Diagnosis

  1. Set the NAMESPACE environment variable:

    $ export NAMESPACE="$(kubectl get kubevirt -A -o custom-columns="":.metadata.namespace)"
    
  2. Verify that a virt-controller device is available. The command prints the number of Ready replicas; empty output means that no replica is Ready:

    $ kubectl get deployment -n $NAMESPACE virt-controller -o jsonpath='{.status.readyReplicas}'
    
  3. Check the status of the virt-controller deployment:

    $ kubectl -n $NAMESPACE get deploy virt-controller -o yaml
    
  4. Obtain the details of the virt-controller deployment to check for status conditions, such as crashing pods or failures to pull images:

    $ kubectl -n $NAMESPACE describe deploy virt-controller
    
  5. Check if any problems occurred with the nodes. For example, they might be in a NotReady state:

    $ kubectl get nodes
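
The readiness check in step 2 can be wrapped in a small helper so that empty output is handled explicitly. The following is a minimal sketch, not KubeVirt tooling: the function names ready_virt_controllers and check_virt_controllers are hypothetical, and the sketch assumes that kubectl is on the PATH and that NAMESPACE was set as in step 1.

```shell
# Sketch only: these function names are hypothetical helpers.

# Print the number of Ready virt-controller replicas in namespace $1.
# Kubernetes omits .status.readyReplicas when the value is zero, so an
# empty result is reported as 0. A failed kubectl call also yields 0 here.
ready_virt_controllers() {
    count="$(kubectl get deployment -n "$1" virt-controller \
        -o jsonpath='{.status.readyReplicas}')"
    echo "${count:-0}"
}

# Print a one-line summary and return non-zero if no replica is Ready.
check_virt_controllers() {
    ready="$(ready_virt_controllers "$1")"
    if [ "$ready" -eq 0 ]; then
        echo "virt-controller: no Ready replicas in namespace $1"
        return 1
    fi
    echo "virt-controller: $ready Ready replica(s) in namespace $1"
}
```

After defining the functions in the current shell, one possible invocation is check_virt_controllers "$NAMESPACE". A non-zero exit status suggests continuing with steps 3 to 5.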
    

Mitigation

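This alert can have multiple causes, including the conditions checked in the Diagnosis section, such as crashing pods, failures to pull images, or nodes in a NotReady state.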

Try to identify the root cause and resolve the issue.
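
When narrowing down the root cause on a large cluster, it can help to reduce the node listing to only the nodes that are not Ready. The following is a minimal sketch: the function name not_ready_nodes is hypothetical, and it assumes the default kubectl get nodes column layout, in which the second column is STATUS.

```shell
# Hypothetical helper: print the names of nodes whose STATUS column does
# not start with "Ready" (for example NotReady or
# NotReady,SchedulingDisabled). Cordoned but Ready nodes are not listed.
not_ready_nodes() {
    kubectl get nodes --no-headers | awk '$2 !~ /^Ready/ { print $1 }'
}
```

Empty output from not_ready_nodes suggests that node readiness is probably not the cause, so the deployment status conditions are the next place to look.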

If you cannot resolve the issue, see the following resources: