
Unnecessary churn with OLMv0 operatorgroup clusterrole management


    • Quality / Stability / Reliability
    • False
    • None
    • Important
    • None
    • None
    • Rejected
    • None
    • Done
    • Bug Fix
      * Before this update, when an Operator supplied more than one API in the Operator group namespace, {olmv0} made unnecessary update calls to the cluster roles that were created for the Operator group. As a result, these unnecessary calls caused churn for etcd and the API server. With this update, {olmv0} does not make unnecessary update calls to the cluster role objects in Operator groups. (link:https://issues.redhat.com/browse/OCPBUGS-57222[OCPBUGS-57222])

      Description of problem:

      When one or more operators are installed in a namespace whose OperatorGroup targets all namespaces, and those operators together provide at least two APIs, OLMv0 sends ClusterRole updates to the API server in which the only change is the order of the selectors in the aggregation rule. This happens every time the OperatorGroup is reconciled, which is triggered by the creation or deletion of other namespaces, among other events.
      
      This causes unnecessary churn with etcd writes and invalidation of auth caches in openshift-apiserver, which leads to yet more churn.    
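
      The following Go sketch illustrates one common way this kind of instability arises and how it can be avoided. It is not the OLMv0 implementation: the `LabelSelector` type is a simplified stand-in, `selectorsForAPIs` is a hypothetical helper, and the label key format is invented for illustration. It assumes the set of provided APIs is held in a Go map, whose iteration order is unspecified, and sorts the keys before building the selector list so that repeated reconciles produce the same order.

```go
package main

import (
	"fmt"
	"sort"
)

// LabelSelector is a simplified, hypothetical stand-in for a Kubernetes
// label selector; only the match-labels part is modeled here.
type LabelSelector struct {
	MatchLabels map[string]string
}

// selectorsForAPIs builds one selector per provided API.
// The APIs arrive as a set (map keys). Ranging over the map directly would
// give an unspecified order on each call, so the keys are sorted first to
// make the resulting slice deterministic.
func selectorsForAPIs(apis map[string]struct{}, suffix string) []LabelSelector {
	keys := make([]string, 0, len(apis))
	for api := range apis {
		keys = append(keys, api)
	}
	sort.Strings(keys)

	selectors := make([]LabelSelector, 0, len(keys))
	for _, api := range keys {
		// The label key format below is illustrative only.
		label := "example.com/aggregate-to-" + api + "-" + suffix
		selectors = append(selectors, LabelSelector{
			MatchLabels: map[string]string{label: "true"},
		})
	}
	return selectors
}

func main() {
	apis := map[string]struct{}{
		"widgets.example.com": {},
		"gadgets.example.com": {},
	}
	for _, s := range selectorsForAPIs(apis, "admin") {
		fmt.Println(s.MatchLabels)
	}
}
```

      With a stable order, a reconciler can compare the desired aggregation rule against the existing one and skip the update when nothing has changed.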

      Version-Release number of selected component (if applicable):

      4.19.0-rc.5    

      How reproducible:

      Always    

      Steps to Reproduce:

          1. Get a 4.19.0-rc.5 cluster from clusterbot
          2. Install several operators in the global-operators namespace
          3. Start a watch for the clusterrole with the name prefix "olm.og.global-operators.admin-" (e.g. oc get clusterrole olm.og.global-operators.admin-3gjDVezhGPF6RBtOOpjEpDpKqO39v3NK8r4hmc -w -o yaml)
          4. Create and delete namespaces multiple times
          5. Observe from the watch that there are changes to the clusterrole and that the only change is to the order of the selectors in the aggregation rule.
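
      To confirm the observation in step 5 without comparing YAML by eye, two revisions of the selector list can be compared while ignoring order. The Go sketch below is only an illustration: it assumes each selector has already been reduced to a string (for example, its label key), and `sameIgnoringOrder` and `onlyOrderChanged` are hypothetical helper names, not part of OLM or `oc`.

```go
package main

import (
	"fmt"
	"sort"
)

// sameIgnoringOrder reports whether a and b hold the same selector strings
// with the same multiplicity, regardless of position.
func sameIgnoringOrder(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	// Sort copies so the callers' slices are left untouched.
	as := append([]string(nil), a...)
	bs := append([]string(nil), b...)
	sort.Strings(as)
	sort.Strings(bs)
	for i := range as {
		if as[i] != bs[i] {
			return false
		}
	}
	return true
}

// onlyOrderChanged is true when the two revisions differ position by
// position but contain exactly the same selectors.
func onlyOrderChanged(before, after []string) bool {
	identical := len(before) == len(after)
	for i := 0; identical && i < len(before); i++ {
		identical = before[i] == after[i]
	}
	return !identical && sameIgnoringOrder(before, after)
}

func main() {
	// Placeholder selector names standing in for two watched revisions.
	before := []string{"sel-a", "sel-b", "sel-c"}
	after := []string{"sel-c", "sel-a", "sel-b"}
	fmt.Println(onlyOrderChanged(before, after)) // prints "true"
}
```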

      Actual results:

      Writes to the clusterrole occur because the order of the selectors changes between reconciles.

      Expected results:

      Writes to the clusterrole do not occur, because the order of selectors is deterministic. 

      Additional info:

          

              jlanford@redhat.com Joe Lanford
              Jian Zhang