VirtOperatorDown

Meaning

This alert fires when no virt-operator pod in the Running state has been detected for 10 minutes.

The virt-operator is the first Operator to start in a cluster. It handles cluster-wide management tasks, such as certificate rotation, upgrades, and reconciliation of controllers.

By default, the virt-operator deployment runs 2 replica pods.

Impact

This alert indicates a cluster-level failure. Critical cluster-wide management functions, such as certificate rotation, upgrades, and reconciliation of controllers, might not be available.

The virt-operator is not directly responsible for virtual machines (VMs) in the cluster. Therefore, its temporary unavailability does not significantly affect VM workloads.

Diagnosis

  1. Set the NAMESPACE environment variable:

    $ export NAMESPACE="$(kubectl get kubevirt -A -o custom-columns="":.metadata.namespace)"
    
  2. Check the status of the virt-operator deployment:

    $ kubectl -n $NAMESPACE get deploy virt-operator -o yaml
    
  3. Obtain the details of the virt-operator deployment:

    $ kubectl -n $NAMESPACE describe deploy virt-operator
    
  4. Check the status of the virt-operator pods:

    $ kubectl get pods -n $NAMESPACE -l=kubevirt.io=virt-operator
    
  5. Check for node issues, such as a NotReady state:

    $ kubectl get nodes
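
The pod check in step 4 can be condensed into a single number. The following is a minimal sketch, assuming a POSIX-compatible shell with kubectl and awk available; the function name count_running_virt_operator_pods is hypothetical and is not part of KubeVirt or kubectl. It relies on the default column layout of "kubectl get pods" (NAME, READY, STATUS, RESTARTS, AGE) and counts rows whose STATUS column reads Running.

```shell
# Hypothetical helper (not part of KubeVirt): prints how many virt-operator
# pods report STATUS "Running" in the given namespace.
# Usage: count_running_virt_operator_pods <namespace>
count_running_virt_operator_pods() {
  # --no-headers drops the header row so every line is one pod.
  # Assumption: default output columns, where field 3 is STATUS.
  kubectl get pods -n "$1" -l kubevirt.io=virt-operator --no-headers 2>/dev/null |
    awk '$3 == "Running" { n++ } END { print n + 0 }'
}
```

For example, calling count_running_virt_operator_pods "$NAMESPACE" after step 1 prints 0 when no virt-operator pod shows the Running status, which is consistent with the condition described for this alert. A pod can show Running without being Ready, so treat the count as a quick indicator and rely on the describe output from step 3 for details.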
    

Mitigation

Based on the information obtained during the diagnosis procedure, try to identify the root cause and resolve the issue.

If you cannot resolve the issue, see the following resources: