  OpenShift API Server / API-1647 apiserver: scalability / API-1648

OCP 4.14 kube-apiserver: dramatic increase in rate of terminated node watchers


      Description of problem:

      In OCP 4.14, we are observing an excessive rate of terminated watchers on node resources only.
      
      We suspect this is related to the increased CPU usage we see on both the kube-apiserver and the client components that watch nodes; the most affected, in decreasing order of CPU usage, are machine-config-daemon, ovnkube-node, and the kube-apiserver pods.
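
      A quick way to eyeball the CPU usage of those components (assuming the default namespaces for each, and that cluster metrics are available):

      oc adm top pods -n openshift-machine-config-operator
      oc adm top pods -n openshift-ovn-kubernetes
      oc adm top pods -n openshift-kube-apiserver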
      
      

      Version-Release number of selected component (if applicable):

      4.14.0-0.nightly-2023-08-08-222204
      Note: this nightly was chosen specifically to isolate the 4.13-to-4.14 changes from the introduction of ovn-ic in more recent 4.14 nightlies.
      

      How reproducible:

      100% 
      I first observed this on 120- and 250-node clusters, but as of today I have seen it on a 24-node cluster as well.
      

      Steps to Reproduce:

      1. Create a 4.14 cluster.
      2. Scale to 120 workers (e.g. via MachineSets; see the sketch after this list).
      3. Run `kube-burner ocp node-density-cni --log-level=info --qps=35 --burst=35 --gc=true --es-index=ripsaw-kube-burner --pods-per-node=50`
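
      For step 2, a minimal sketch of scaling the workers through MachineSets (the MachineSet name and replica count are placeholders; spread the replicas across your MachineSets until the worker count reaches 120):

      oc -n openshift-machine-api get machinesets
      oc -n openshift-machine-api scale machineset <machineset-name> --replicas=40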
      

      Actual results:

      Observe that the kube-apiserver logs the following message at a high rate:
      cache_watcher.go:180] Forcing nodes watcher close due to unresponsiveness: key: "/minions", labels: "", fields: "". len(c.input) = 10, len(c.result) = 1
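
      One way to quantify this from the logs (the label selector and container name are my assumptions; adjust as needed):

      oc -n openshift-kube-apiserver logs -l app=openshift-kube-apiserver -c kube-apiserver --tail=-1 | grep -c 'Forcing nodes watcher close due to unresponsiveness'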
      
      Observe that ovnkube-node has a high rate of watches being closed:
      reflector.go:788] k8s.io/client-go/informers/factory.go:150: Watch close - *v1.Node total 746 items received
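
      A rough way to count these (reads one pod from the DaemonSet; the container name assumes the pre-ovn-ic pod layout):

      oc -n openshift-ovn-kubernetes logs ds/ovnkube-node -c ovnkube-node --tail=-1 | grep -cF 'Watch close - *v1.Node'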
      
      Observe similar rates of watch events and terminated watchers for node resources using the following Prometheus metrics:
      watch events: sum( irate( apiserver_watch_cache_events_received_total{resource=~"nodes"}[2m] ) ) by (resource)
      terminated watchers: sum( irate( apiserver_terminated_watchers_total{resource=~"nodes"}[2m] ) ) by (resource)
      
      We observe average(events_received) = 155 and average(terminated_watchers) = 99.7 (both per second).
      

      Expected results:

      Terminated watchers for node resources should be far less frequent, similar to what we see with pods, which have average(events_received) = 1300 and average(terminated_watchers) = 3.17.
      
      

      Additional info:

      
      After some collaboration with Lukasz, I understand there is a watch cache size that can be increased at the expense of memory.
      Since we have plenty of memory headroom (30 GiB used of 120 GiB) while we are running up against the 16 CPU cores, I think this is a good time to consider increasing that cache size.
      https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/storage/cacher/watch_cache.go#L657
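
      If we want to experiment with this on a test cluster, one sketch is to pass the upstream --watch-cache-sizes flag through the kube-apiserver operator. The value 1000 is arbitrary, unsupportedConfigOverrides is (as the name says) unsupported, and I have not verified that the flag is still honored for nodes in 4.14:

      oc patch kubeapiserver cluster --type=merge \
        -p '{"spec":{"unsupportedConfigOverrides":{"apiServerArguments":{"watch-cache-sizes":["nodes#1000"]}}}}'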
      
      There are also components where we see an increase in PATCH requests against nodes, which in turn increases the number of updates delivered to node WATCH requests; those will be addressed in separate bugs.
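
      For attribution, a rough sketch against the kube-apiserver audit log (the master node name is a placeholder, and the jq filter assumes the default audit event format):

      oc adm node-logs <master-node> --path=kube-apiserver/audit.log \
        | jq -r 'select(.verb=="patch" and .objectRef.resource=="nodes") | .user.username' \
        | sort | uniq -c | sort -rn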
      
