This alert fires when more than 5% of the REST calls in virt-operator pods failed in the last 60 minutes. This usually indicates that the virt-operator pods cannot connect to the API server.
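The threshold can be illustrated with a small shell sketch. It only shows the "more than 5%" arithmetic; the function name `rest_error_state` and its output strings are hypothetical, and the real alert is evaluated by the monitoring stack, not by this script.

```shell
# Hypothetical helper: classify a failure count against the 5% threshold.
# Usage: rest_error_state <failed_calls> <total_calls>
rest_error_state() {
    failed=$1
    total=$2
    # Without any calls there is no ratio to evaluate.
    if [ "$total" -eq 0 ]; then
        echo "no-data"
        return 0
    fi
    # Integer form of failed/total > 5/100, which avoids floating point.
    if [ $((failed * 100)) -gt $((total * 5)) ]; then
        echo "firing"
    else
        echo "ok"
    fi
}
```

For example, `rest_error_state 6 100` reports a firing state, while exactly 5 failures out of 100 calls does not exceed the threshold.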
This error is frequently caused by one of the following problems:
The API server is overloaded, which causes timeouts. To verify whether this is the case, check the metrics of the API server, and view its response times and overall calls.
The virt-operator pod cannot reach the API server. This is commonly caused by DNS issues on the node or by network connectivity issues.
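One possible way to check whether the API server is overloaded is sketched below. It assumes your account is allowed to read the `/readyz` and `/metrics` endpoints of the API server; the function name `apiserver_health` is hypothetical, and the metric shown is only one of several that describe response times.

```shell
# Hypothetical helper: print API server readiness and request counts
# from its latency histogram.
apiserver_health() {
    # /readyz?verbose lists each readiness check and its result.
    kubectl get --raw='/readyz?verbose' || return 1
    # apiserver_request_duration_seconds is the request latency histogram;
    # the _count series shows how many requests were observed per label set.
    kubectl get --raw='/metrics' \
        | grep '^apiserver_request_duration_seconds_count' \
        | head -n 20
}
```

If the readiness call itself hangs or fails from your workstation, the problem is likely broader than the virt-operator pod.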
Cluster-level actions, such as upgrading and controller reconciliation, might be delayed.
However, customer workloads, such as virtual machines (VMs) and VM instances (VMIs), are not likely to be affected.
Set the NAMESPACE environment variable:
$ export NAMESPACE="$(kubectl get kubevirt -A -o custom-columns="":.metadata.namespace)"
Check the status of the virt-operator pods:
$ kubectl -n $NAMESPACE get pods -l kubevirt.io=virt-operator
Check the virt-operator logs for error messages when connecting to the API server:
$ kubectl -n $NAMESPACE logs <virt-operator>
Obtain the details of the virt-operator pod:
$ kubectl -n $NAMESPACE describe pod <virt-operator>
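When the logs are long, a filter can narrow them to lines that look like connectivity failures. The sketch below is an illustration only: the function name `filter_api_errors` is hypothetical, and the patterns are common Go network error strings that are assumed, not guaranteed, to appear in the virt-operator logs.

```shell
# Hypothetical helper: keep only log lines that look like API connectivity
# failures. Reads log text on stdin and writes matching lines to stdout.
filter_api_errors() {
    # Assumed patterns: typical Go network and timeout error messages.
    grep -i -E 'connection refused|i/o timeout|no such host|TLS handshake timeout|context deadline exceeded'
}
```

A possible usage is to pipe the log command from the step above into it, for example `kubectl -n $NAMESPACE logs <virt-operator> | filter_api_errors`. Lines containing "no such host" point toward DNS, while "connection refused" or timeouts point toward connectivity or an overloaded API server.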
If the virt-operator pod cannot connect to the API server, delete the pod to force a restart:
$ kubectl -n $NAMESPACE delete pod <virt-operator>
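After the pod is deleted, you may want to confirm that its replacement becomes ready. The following is a minimal sketch, assuming the pods carry the `kubevirt.io=virt-operator` label used earlier; the function name `wait_for_virt_operator` and the default timeout of 120 seconds are hypothetical choices.

```shell
# Hypothetical helper: wait until the virt-operator pods report Ready.
# Usage: wait_for_virt_operator <namespace> [timeout]
wait_for_virt_operator() {
    ns=$1
    # Assumed default; adjust to how long a restart takes in your cluster.
    wait_timeout=${2:-120s}
    kubectl -n "$ns" wait --for=condition=Ready pod \
        -l kubevirt.io=virt-operator --timeout="$wait_timeout"
}
```

For example, `wait_for_virt_operator $NAMESPACE` returns a non-zero exit status if the pods are not ready within the timeout, which suggests the connectivity problem persists after the restart.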
If you cannot resolve the issue, see the following resources: