  1. Red Hat Advanced Cluster Management
  2. ACM-2305

ocm-webhook OOM at scale of ~3100 managedclusters preventing any further clusters from being installed


    • Type: Bug
    • Resolution: Done
    • Priority: Undefined
    • Server Foundation

      Description of problem:

      While deploying 3000+ SNOs with ACM and ZTP, the ocm-webhook began OOMing at around 3100 clusters, which prevented the creation of clusterdeployment objects for the remaining clusters to be installed.

      Test:

      • Attempted to install 3591 SNOs
      • Successfully installed 3114
      • Only Managed 2419
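
To put the three counts above in proportion, here is a small sketch that derives the gaps and rates from them. Only the three input numbers come from this report; the helper name `pct` and the dictionary keys are illustrative.

```python
# Counts taken from the test summary in this report.
attempted = 3591
installed = 3114
managed = 2419


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


not_installed = attempted - installed      # attempted but never installed
installed_unmanaged = installed - managed  # installed but not managed

summary = {
    "not_installed": not_installed,
    "installed_unmanaged": installed_unmanaged,
    "installed_pct_of_attempted": pct(installed, attempted),
    "managed_pct_of_installed": pct(managed, installed),
}
print(summary)
```

By these numbers, 477 clusters were never installed and 695 installed clusters were not managed.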

      Version-Release number of selected component (if applicable):

      2.7.0-DOWNSTREAM-2022-11-25-10-53-02
      OCP 4.11.13 (Hub and managedclusters)

      How reproducible:

      Steps to Reproduce:

      1.  
      2.  
      3. ...

      Actual results:

      Expected results:

      Additional info:

      The original pod spec had a memory limit of 256Mi. After bumping the limit to 2Gi, deployment of the remaining clusters was attempted, and argocd could successfully sync the git repo to the expected objects via the ztp-site-generator.
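
The limit change described above can be sketched as a strategic merge patch body. This is a rough illustration, not the fix that was shipped: it assumes the webhook pod is created from a pod template (for example a Deployment) and that the container is named `ocm-webhook`. The report does not state either, and the helper names are hypothetical.

```python
import json


def parse_quantity(q: str) -> int:
    """Convert a binary-suffixed Kubernetes memory quantity (Ki, Mi, Gi) to bytes."""
    units = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
    for suffix, factor in units.items():
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * factor
    return int(q)  # no suffix: treat as a plain byte count


def build_memory_limit_patch(container: str, limit: str) -> dict:
    """Build a strategic merge patch that sets one container's memory limit."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {"name": container, "resources": {"limits": {"memory": limit}}}
                    ]
                }
            }
        }
    }


old_limit, new_limit = "256Mi", "2Gi"  # values from this report
growth = parse_quantity(new_limit) // parse_quantity(old_limit)

# The container name below is an assumption, not taken from the report.
patch = build_memory_limit_patch("ocm-webhook", new_limit)

print(f"memory limit grows {growth}x")
print(json.dumps(patch, indent=2))
```

The patch lists only the one container by name, so other containers and fields in the pod template are left untouched when it is merged.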

              zxue@redhat.com ZHAO XUE
              akrzos@redhat.com Alex Krzos
              Alex Krzos Alex Krzos