VirtOperatorRESTErrorsHigh

Meaning

This alert fires when more than 5% of the REST calls in virt-operator pods failed in the last 60 minutes. This usually indicates that the virt-operator pods cannot connect to the API server.
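
As a rough illustration of the 5% threshold, the following sketch checks a failed-call count against a total-call count. The function name `rest_errors_high` and the sample numbers are hypothetical; the alert itself is evaluated by the monitoring stack, not by this script.

```shell
# Hypothetical helper: succeeds when failed/total is strictly above 5%.
# Uses integer arithmetic only: failed/total > 5/100  <=>  failed*100 > total*5.
rest_errors_high() {
  # $1 = failed REST calls, $2 = total REST calls in the window
  # With no calls at all there is no ratio to evaluate.
  [ "$2" -gt 0 ] || return 1
  [ $(($1 * 100)) -gt $(($2 * 5)) ]
}

# Made-up numbers: 30 failures out of 500 calls is 6%.
if rest_errors_high 30 500; then
  echo "above threshold"
else
  echo "below threshold"
fi
```

With these sample values the script prints `above threshold`; 25 failures out of 500 calls (exactly 5%) would not exceed the threshold.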

This error is frequently caused by one of the following problems:

Impact

Cluster-level actions, such as upgrading and controller reconciliation, might be delayed.

However, customer workloads, such as virtual machines (VMs) and VM instances (VMIs), are not likely to be affected.

Diagnosis

  1. Set the NAMESPACE environment variable:

    $ export NAMESPACE="$(kubectl get kubevirt -A -o custom-columns="":.metadata.namespace)"
    
  2. Check the status of the virt-operator pods:

    $ kubectl -n $NAMESPACE get pods -l kubevirt.io=virt-operator
    
  3. Check the virt-operator logs for error messages when connecting to the API server:

    $ kubectl -n $NAMESPACE logs <virt-operator>
    
  4. Obtain the details of the virt-operator pod:

    $ kubectl -n $NAMESPACE describe pod <virt-operator>
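
Steps 2 through 4 can be combined into one pass over all virt-operator pods. The following is a minimal sketch, not part of the official procedure: the function name `diagnose_virt_operator`, the 200-line log limit, and the grep patterns are assumptions chosen for illustration, and the patterns may not match the messages your pods actually log.

```shell
# Sketch: collect virt-operator diagnostics for every matching pod.
# Assumes kubectl is on PATH and can reach the cluster.
# Usage: diagnose_virt_operator $NAMESPACE
diagnose_virt_operator() {
  ns=$1

  # Step 2: status of the virt-operator pods.
  kubectl -n "$ns" get pods -l kubevirt.io=virt-operator

  # Steps 3 and 4: logs and details for each pod ("-o name" prints pod/<name>).
  for pod in $(kubectl -n "$ns" get pods -l kubevirt.io=virt-operator -o name); do
    echo "== $pod =="
    # Illustrative filter only; adjust the patterns to the errors you see.
    kubectl -n "$ns" logs "$pod" --tail=200 \
      | grep -iE 'error|timeout|refused' \
      || echo "(no matching log lines)"
    kubectl -n "$ns" describe "$pod"
  done
}
```

Filtering the logs keeps the output short, but it can hide relevant lines; drop the `grep` stage to see the full log tail.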
    

Mitigation

If you cannot resolve the issue, see the following resources: