OCPBUGS-41235

Failed to list secrets when a large number exist on the cluster [4.15]

    • Important
    • None
    • False
      * Previously, the Cloud Credential Operator (CCO) produced an error on startup or restart when the cluster contained a large number of secrets, because it fetched them all at once. With this release, the CCO fetches the secrets in batches of 100, and the issue is resolved. (link:https://issues.redhat.com/browse/OCPBUGS-41235[*OCPBUGS-41235*])
      __________________
      *Cause*: When the cluster contains a large number of secrets, fetching them in a single call causes the API request to time out.
      *Consequence*: The Cloud Credential Operator throws an error on startup and restarts.
      *Fix*: The Cloud Credential Operator now pulls the list in smaller batches of 100.
      *Result*: The Cloud Credential Operator no longer errors when the cluster contains a large number of secrets.
    • Bug Fix
    • Done
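
The batching described in the fix can be illustrated with a small sketch. This is not the CCO's actual code (the CCO is not written in Python, and the ticket does not show its implementation); it only demonstrates the general pattern of paging through a list with a per-call limit and a continuation token, which is how the Kubernetes list API exposes chunking (`limit` and `continue`). The names `list_in_batches`, `make_fake_server`, and `fetch_page` are hypothetical, and the in-memory "server" stands in for a real API client.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical page-fetching function: takes (limit, continue_token) and
# returns (items, next_token). It stands in for a real API client call.
PageFetcher = Callable[[int, Optional[str]], Tuple[List[str], Optional[str]]]


def list_in_batches(fetch_page: PageFetcher, batch_size: int = 100) -> Iterator[str]:
    """Yield every item while requesting at most batch_size items per call."""
    token: Optional[str] = None
    while True:
        items, token = fetch_page(batch_size, token)
        yield from items
        if not token:  # no continuation token means there are no more pages
            return


def make_fake_server(names: List[str]):
    """Build an in-memory stand-in for the API server, plus a call log."""
    calls: List[int] = []  # number of items returned by each call

    def fetch_page(limit: int, token: Optional[str]):
        start = int(token) if token else 0
        page = names[start:start + limit]
        calls.append(len(page))
        end = start + len(page)
        next_token = str(end) if end < len(names) else None
        return page, next_token

    return fetch_page, calls


if __name__ == "__main__":
    fetch_page, calls = make_fake_server([f"secret-{i}" for i in range(250)])
    secrets = list(list_in_batches(fetch_page))
    print(len(secrets), calls)  # 250 [100, 100, 50]
```

With 250 simulated secrets, the list is retrieved in three calls instead of one large call, so no single request has to return the whole set.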

      This is a clone of issue OCPBUGS-41234. The following is the description of the original issue:

      This is a clone of issue OCPBUGS-41233. The following is the description of the original issue:

      This is a clone of issue OCPBUGS-39531. The following is the description of the original issue:

      -> While upgrading the cluster from 4.13.38 to 4.14.18, the upgrade gets stuck on the CCO. The ClusterVersion reports:

      "Working towards 4.14.18: 690 of 860 done (80% complete), waiting on cloud-credential".

      On further inspection, the CCO Deployment has not yet rolled out.

      -> The ClusterOperator status.versions[name=operator] entry does not narrowly mean "the CCO Deployment is updated"; it means "the CCO asserts the whole CC component is updated", which requires (among other things) a functional CCO Deployment. It seems you don't have a functional CCO Deployment, because its logs show it stuck asking for a leader lease. Without Kube API audit logs, there is no way to say whether it is stuck generating the Lease request or waiting for a response from the Kube API server.
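
As a rough illustration of the status.versions[name=operator] lookup discussed above, the sketch below reads that entry from a ClusterOperator object that has already been loaded as a dictionary (for example from the JSON output of `oc get clusteroperator cloud-credential -o json`). The helper names and the sample object are hypothetical. The sample assumes the operator still reports the pre-upgrade version, which this ticket does not actually show.

```python
from typing import Any, Dict, Optional


def operator_version(cluster_operator: Dict[str, Any]) -> Optional[str]:
    """Return the version listed under status.versions[name=operator], if any."""
    for entry in cluster_operator.get("status", {}).get("versions", []):
        if entry.get("name") == "operator":
            return entry.get("version")
    return None


def component_reports_target(cluster_operator: Dict[str, Any], target: str) -> bool:
    """True only when the operator itself asserts it has reached the target version."""
    return operator_version(cluster_operator) == target


# Hypothetical sample object (illustrative only, not taken from the ticket).
sample = {
    "kind": "ClusterOperator",
    "metadata": {"name": "cloud-credential"},
    "status": {"versions": [{"name": "operator", "version": "4.13.38"}]},
}

if __name__ == "__main__":
    print(operator_version(sample))                    # 4.13.38
    print(component_reports_target(sample, "4.14.18"))  # False
```

The check depends on what the operator itself reports, not on whether a new Deployment exists, which matches the point that a non-functional CCO Deployment cannot advance this version.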

            jstuever@redhat.com Jeremiah Stuever
            openshift-crt-jira-prow OpenShift Prow Bot
            Jianping Shu Jianping Shu