monitoring

HighNodeCPUFrequency

Meaning

This alert fires when the frequency of any CPU on a node stays above 80% of that CPU's maximum frequency for more than 5 minutes.
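
As a rough illustration of the threshold, the comparison can be written with integer arithmetic. This is a minimal sketch, not the alert's actual rule: the function name `exceeds_threshold` and the sample frequencies are hypothetical, and the 5-minute duration is not modeled.

```shell
#!/bin/sh
# Sketch: evaluate the 80% threshold for a single CPU.
# Arguments: current frequency, maximum frequency (same unit, e.g. kHz),
# and an optional threshold percentage (defaults to 80).
exceeds_threshold() {
    cur=$1
    max=$2
    threshold_pct=${3:-80}
    # cur/max > threshold/100 is equivalent to cur*100 > max*threshold,
    # which avoids floating-point math in the shell.
    [ $((cur * 100)) -gt $((max * threshold_pct)) ]
}

# Hypothetical sample: 3.4 GHz current on a CPU with a 4.0 GHz maximum (85%).
if exceeds_threshold 3400000 4000000; then
    echo "above threshold"
else
    echo "within threshold"
fi
```

A CPU running at exactly 80% of its maximum does not satisfy the comparison, because the condition is strictly greater than the threshold.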

Impact

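High CPU frequency can indicate sustained high CPU demand on the node. It can also coincide with elevated temperatures that lead to thermal throttling.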

Diagnosis

  1. Identify the affected node and CPU:
    kubectl get nodes
    
  2. Check current CPU frequency on the node:
    kubectl debug node/<node-name> -it --image=registry.redhat.io/ubi8/ubi
    

    Then run inside the debug pod:

    grep -i "cpu mhz" /proc/cpuinfo
    
  3. Monitor CPU utilization and temperature:
    kubectl top nodes
    
    kubectl top pods --all-namespaces --sort-by=cpu
    

    Check system temperature (if the sensors utility from lm_sensors is installed):

    sensors
    
  4. Review node resource allocation:
    kubectl describe node <node-name>
    
  5. Check for CPU-intensive processes from inside the debug pod, which shares the node's PID namespace:
    ps aux --sort=-%cpu | head -20
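
The frequency check in step 2 can also be done per CPU against each CPU's maximum. The following is a minimal sketch that assumes the node exposes the Linux cpufreq sysfs files `scaling_cur_freq` and `cpuinfo_max_freq`; whether these are the same values the alert's metric is built from is an assumption. The function name `report_high_freq_cpus` and its arguments are hypothetical.

```shell
#!/bin/sh
# Sketch: list CPUs whose current frequency exceeds a percentage of their maximum.
# $1: directory containing the cpuN entries (defaults to the standard sysfs path)
# $2: threshold percentage (defaults to 80)
report_high_freq_cpus() {
    root=${1:-/sys/devices/system/cpu}
    threshold_pct=${2:-80}
    for dir in "$root"/cpu[0-9]*; do
        cur_file=$dir/cpufreq/scaling_cur_freq
        max_file=$dir/cpufreq/cpuinfo_max_freq
        # Skip CPUs without cpufreq data (common on virtual machines).
        [ -r "$cur_file" ] && [ -r "$max_file" ] || continue
        cur=$(cat "$cur_file")
        max=$(cat "$max_file")
        [ "$max" -gt 0 ] || continue
        if [ $((cur * 100)) -gt $((max * threshold_pct)) ]; then
            pct=$((cur * 100 / max))
            echo "${dir##*/}: ${pct}% of max (${cur} kHz / ${max} kHz)"
        fi
    done
}

report_high_freq_cpus "$@"
```

If the script prints nothing, either no CPU is above the threshold or the node does not expose cpufreq data, in which case the `/proc/cpuinfo` output from step 2 is the fallback.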
    

Mitigation

  1. Immediate actions:
    • Monitor the CPU temperature to ensure it’s within safe limits
    • Check if the high frequency is due to legitimate high CPU demand
    • Verify CPU frequency scaling settings (such as the governor) if needed
  2. If caused by high CPU utilization:
    • Identify and analyze CPU-intensive pods
    • Consider redistributing workloads across nodes
    • Evaluate pod resource limits and requests
    • Scale horizontally if needed
  3. If thermal throttling is suspected:
    • Check system cooling and ventilation
    • Monitor ambient temperature
    • Consider reducing workload temporarily
    • Verify proper thermal management settings
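
Where the `sensors` utility is not installed, temperatures can often be read from the kernel's thermal zones instead. The following is a minimal sketch that assumes the node exposes `/sys/class/thermal/thermal_zone*/temp` in millidegrees Celsius. The function name `report_thermal_zones` and the 85 °C warning threshold are hypothetical; check the hardware vendor's documentation for the real safe limit.

```shell
#!/bin/sh
# Sketch: print the temperature of each thermal zone and flag hot ones.
# $1: thermal sysfs directory (defaults to the standard path)
# $2: warning threshold in degrees Celsius (hypothetical default: 85)
report_thermal_zones() {
    root=${1:-/sys/class/thermal}
    warn_c=${2:-85}
    for zone in "$root"/thermal_zone*; do
        [ -r "$zone/temp" ] || continue
        # Some zones return an error on read; skip them.
        milli_c=$(cat "$zone/temp" 2>/dev/null) || continue
        [ -n "$milli_c" ] || continue
        zone_type=unknown
        if [ -r "$zone/type" ]; then
            zone_type=$(cat "$zone/type")
        fi
        temp_c=$((milli_c / 1000))
        status=ok
        if [ "$temp_c" -ge "$warn_c" ]; then
            status=HOT
        fi
        echo "${zone##*/} (${zone_type}): ${temp_c} C ${status}"
    done
}

report_thermal_zones "$@"
```

Running this a few times while the alert is firing gives a quick view of whether temperatures are rising alongside the CPU frequency.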

If you cannot resolve the issue, see the following resources: