-
Story
-
Resolution: Unresolved
-
Normal
-
5
While working on OCPBUGS-5505, where we made the throttling period of the Upgradeable check deterministic, we also considered reducing the period to 1 minute or even lower. We were not sure whether doing so would be safe, though, because there is some evidence that CVO uses the cluster apiserver more intensively than necessary, making direct calls instead of going through informers (and hence their local cache):
Re: Lala's "why not squeeze harder?" https://github.com/openshift/cluster-version-operator/pull/882#discussion_r1069679453 , it is probably worth peeking at audit logs for one of the CI runs, to see if our ClusterOperator call rate is sustainable. I'd have expected our ClusterOperator access to flow through an informer, so higher nominal-access would be absorbed by our local cache and not make it out to the API server. But https://redhat-internal.slack.com/archives/C01CQA76KMX/p1672954955591829?thread_ts=1672946726.268369&cid=C01CQA76KMX suggests that at least some call sites are using direct calls, and not the informers, and we may not want to go too hard if these Upgradeable checks are actually direct calls
W. Trevor King
$ zgrep -h '"username":"system:serviceaccount:openshift-cluster-version:default"' kube-apiserver/*.log.gz | jq -r '.requestURI' | sort | uniq -c | sort -n | tail
32 /apis/apiextensions.k8s.io/v1/customresourcedefinitions/performanceprofiles.performance.openshift.io
33 /apis/config.openshift.io/v1/infrastructures/cluster
34 /apis/admissionregistration.k8s.io/v1/validatingwebhookconfigurations/controlplanemachineset.machine.openshift.io
34 /apis/admissionregistration.k8s.io/v1/validatingwebhookconfigurations/performance-addon-operator
34 /apis/batch/v1/namespaces/openshift-operator-lifecycle-manager/cronjobs/collect-profiles
36 /apis/image.openshift.io/v1/namespaces/openshift/imagestreams/driver-toolkit
135 /apis/config.openshift.io/v1/proxies/cluster
326 /api/v1/namespaces/openshift-cluster-version/configmaps/version
489 /apis/coordination.k8s.io/v1/namespaces/openshift-cluster-version/leases/version
530 /apis/config.openshift.io/v1/clusteroperators
Possibly we have a ClusterOperator consumer that needs to get wired up to our existing informer...
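The tally above counts requests per URI but does not show how they were made. A possible follow-up, sketched below, also groups by the audit event's verb: informer-backed access would be expected to appear mostly as an initial list followed by watch requests, while a high count of repeated get or list requests on the same URI would hint at direct calls. This is only a sketch under assumptions: the function name count_cvo_requests_by_verb is hypothetical, it relies on the standard verb, requestURI, and user.username fields of Kubernetes audit events, and it does not deduplicate events logged at several audit stages.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CVO or any existing tooling):
# tally audit-log requests made by the CVO service account,
# grouped by verb and request path.
count_cvo_requests_by_verb() {
  local user='system:serviceaccount:openshift-cluster-version:default'
  # gzip -cdf decompresses .gz files and passes plain files through unchanged.
  gzip -cdf -- "$@" |
    jq -r --arg user "$user" '
      # Keep only events from the CVO service account.
      select(.user.username == $user)
      # Drop the query string so watch requests with different
      # resourceVersion parameters collapse into one row.
      | "\(.verb) \(.requestURI | split("?")[0])"' |
    sort | uniq -c | sort -n
}
```

For example, `count_cvo_requests_by_verb kube-apiserver/*.log.gz | tail` would list the ten most frequent verb and path combinations, in the same shape as the output above.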
This story aims to optimize the API usage of the Upgradeable check and status synchronization in CVO, and to find the optimal (lower) throttling period for its status sync.
- relates to
-
OCPBUGS-7766 admin ack test sometimes fails because upgradable=false shows up too slowly after upgrade
- New