  1. Red Hat Advanced Cluster Management
  2. ACM-18029

Policy agents start progressively later on new clusters as more clusters are deployed


    • Bug
    • Resolution: Done
    • Critical
    • ACM 2.13.0
    • ACM 2.13.0
    • GRC
    • 2
    • False
    • False
    • GRC Sprint 2025-04
    • Important
    • None

      Description of problem:

      Code changes between builds 2.13.0-DOWNSTREAM-2025-02-10-06-00-59 and 2.13.0-DOWNSTREAM-2025-02-12-14-07-11 caused a performance regression. The attached charts compare test runs with these two builds. In the second chart, for the run with 2.13.0-DOWNSTREAM-2025-02-12-14-07-11, once around 1000 SNOs are deployed and the number of clusters grows from 1000 to 3500, the time between a cluster becoming managed and becoming compliant gets longer and longer.

      I checked the last cluster that was deployed, vm00399. The GRC addon log governance-policy-framework-5bd97cdfff-nksrn.log shows that the cluster did not take long to become compliant once the addon was created. The problem is that the addon was started much later than the other ACM addons.
      The commands below show the first log line of each addon on the last SNO that was deployed, vm00399, and on the first SNO that was deployed, vm00528. On vm00528 the policy addons started at almost the same time as the other addons, but on vm00399 they started about 25 minutes later than the other addons.

      The code changes between these two ACM builds can be found here: https://gitlab.cee.redhat.com/acm-cicd/acm-dsb/-/raw/acm-2.13/snapshots/2025-02-12-14-07-11/changelog-2025-02-10-06-00-59-2025-02-12-14-07-11.out

      oc --kubeconfig=/root/hv-vm/kc/vm00399/kubeconfig logs -n open-cluster-management-agent-addon                cluster-proxy-proxy-agent-79dd4cf99b-jhxpl   | head -1
      Defaulted container "proxy-agent" out of: proxy-agent, addon-agent, service-proxy
      I0219 19:20:30.696762       1 options.go:124] AgentCert set to "/etc/tls/tls.crt".
       oc --kubeconfig=/root/hv-vm/kc/vm00399/kubeconfig logs -n open-cluster-management-agent-addon                cert-policy-controller-fd968b76-pb9j6   | head -1
      2025-02-19T19:44:19.252Z    info    setup    app/main.go:107    Using    {"OperatorVersion": "3.6.0", "GoVersion": "go1.23.4 (Red Hat 1.23.4-1.el9) X:strictfipsruntime", "GOOS": "linux", "GOARCH": "amd64"}
      oc --kubeconfig=/root/hv-vm/kc/vm00399/kubeconfig logs -n open-cluster-management-agent-addon                config-policy-controller-74c485ccf8-pdvhl   | head -1 
      2025-02-19T19:44:15.239Z    info    setup    app/main.go:163    Using    {"OperatorVersion": "0.0.1", "GoVersion": "go1.23.4 (Red Hat 1.23.4-1.el9) X:strictfipsruntime", "GOOS": "linux", "GOARCH": "amd64"}
      oc --kubeconfig=/root/hv-vm/kc/vm00399/kubeconfig logs -n open-cluster-management-agent-addon                governance-policy-framework-5bd97cdfff-nksrn    | head -1
      2025-02-19T19:49:42.973Z    info    setup    app/main.go:86    Using    {"OperatorVersion": "0.0.1", "GoVersion": "go1.23.4 (Red Hat 1.23.4-1.el9) X:strictfipsruntime", "GOOS": "linux", "GOARCH": "amd64"}
      oc --kubeconfig=/root/hv-vm/kc/vm00399/kubeconfig logs -n open-cluster-management-agent-addon                klusterlet-addon-workmgr-6b8466b849-ng4hk     | head -1
      W0219 19:20:32.402088       1 client_config.go:667] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
      oc --kubeconfig=/root/hv-vm/kc/vm00399/kubeconfig logs -n open-cluster-management-agent-addon               managed-serviceaccount-addon-agent-dffdc47f6-sdm5q     | head -1
      I0219 19:20:32.119897       1 agent.go:271] heath probes server is running... 

       oc --kubeconfig=/root/hv-vm/kc/vm00399/kubeconfig get pod -n open-cluster-management-agent-addon
      NAME                                                 READY   STATUS    RESTARTS   AGE
      application-manager-75c46fb9fd-n5l4v                 1/1     Running   0          4h24m
      cert-policy-controller-fd968b76-pb9j6                1/1     Running   0          4h
      cluster-proxy-proxy-agent-79dd4cf99b-jhxpl           3/3     Running   0          4h24m
      config-policy-controller-74c485ccf8-pdvhl            1/1     Running   0          4h
      governance-policy-framework-5bd97cdfff-nksrn         1/1     Running   0          3h55m
      klusterlet-addon-workmgr-6b8466b849-ng4hk            1/1     Running   0          4h24m
      managed-serviceaccount-addon-agent-dffdc47f6-sdm5q   1/1     Running   0          4h24m
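
The start delay on vm00399 can be checked directly from the first-log-line timestamps above. The following is a minimal sketch, not part of the original report: `start_gap` is a hypothetical helper name, it assumes GNU `date` (for `-d`), and it assumes the klog-style timestamps (`I0219 19:20:30`) are in UTC and from 2025 like the ISO-formatted ones. The timestamps are copied from the output above with fractional seconds dropped.

```shell
# Print the gap between two timestamps as "<minutes>m<seconds>s".
# Assumes GNU date and that the first timestamp is the earlier one.
start_gap() {
  local earlier later delta
  earlier=$(date -u -d "$1" +%s) || return 1
  later=$(date -u -d "$2" +%s) || return 1
  delta=$(( later - earlier ))
  printf '%dm%02ds\n' $(( delta / 60 )) $(( delta % 60 ))
}

# Baseline: cluster-proxy-proxy-agent first log line on vm00399.
base="2025-02-19 19:20:30 UTC"

start_gap "$base" "2025-02-19 19:44:15 UTC"   # config-policy-controller    -> 23m45s
start_gap "$base" "2025-02-19 19:44:19 UTC"   # cert-policy-controller      -> 23m49s
start_gap "$base" "2025-02-19 19:49:42 UTC"   # governance-policy-framework -> 29m12s
```

Under these assumptions the policy addons on vm00399 start roughly 24 to 29 minutes after the other addons, which matches the pod ages in the listing above (4h24m for the other addons, 4h and 3h55m for the policy addons).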

      oc --kubeconfig=/root/hv-vm/kc/vm00528/kubeconfig logs -n open-cluster-management-agent-addon                application-manager-78899f66f6-vf6cc  | head -1
      I0219 15:32:03.580429       1 kubernetes.go:76] App Addon Pod NS = open-cluster-management-agent-addon
      oc --kubeconfig=/root/hv-vm/kc/vm00528/kubeconfig logs -n open-cluster-management-agent-addon                cert-policy-controller-6f6969976-rnc68  | head -1
      2025-02-19T15:31:46.960Z    info    setup    app/main.go:107    Using    {"OperatorVersion": "3.6.0", "GoVersion": "go1.23.4 (Red Hat 1.23.4-1.el9) X:strictfipsruntime", "GOOS": "linux", "GOARCH": "amd64"}
      oc --kubeconfig=/root/hv-vm/kc/vm00528/kubeconfig logs -n open-cluster-management-agent-addon                config-policy-controller-7db4865c5d-v4xw4  -p | head -1
      2025-02-19T15:31:40.123Z    info    setup    app/main.go:163    Using    {"OperatorVersion": "0.0.1", "GoVersion": "go1.23.4 (Red Hat 1.23.4-1.el9) X:strictfipsruntime", "GOOS": "linux", "GOARCH": "amd64"}
      oc --kubeconfig=/root/hv-vm/kc/vm00528/kubeconfig logs -n open-cluster-management-agent-addon                governance-policy-framework-f56649f57-qss8t -p | head -1
      2025-02-19T15:31:40.262Z    info    setup    app/main.go:86    Using    {"OperatorVersion": "0.0.1", "GoVersion": "go1.23.4 (Red Hat 1.23.4-1.el9) X:strictfipsruntime", "GOOS": "linux", "GOARCH": "amd64"}
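
The per-pod commands above can be collapsed into one loop when comparing many clusters. This is a sketch under stated assumptions rather than the procedure used in this report: `first_log_lines` is a hypothetical helper name, the kubeconfig path layout is taken from the commands above, and only the default container of each pod is read (as in the original commands).

```shell
# Print "<pod><TAB><first log line>" for every pod in the addon namespace.
# Argument: path to the managed cluster's kubeconfig.
first_log_lines() {
  local kubeconfig=$1
  local ns=open-cluster-management-agent-addon
  local pod
  for pod in $(oc --kubeconfig="$kubeconfig" get pods -n "$ns" -o name); do
    # stderr is discarded to hide the 'Defaulted container' notice.
    printf '%s\t%s\n' "$pod" \
      "$(oc --kubeconfig="$kubeconfig" logs -n "$ns" "$pod" 2>/dev/null | head -n 1)"
  done
}

# Usage (path as used in this report):
#   first_log_lines /root/hv-vm/kc/vm00399/kubeconfig
```

For pods that have already restarted, the first line of the current container's log no longer reflects the original start time, so `-p` would need to be added to the `logs` call, as was done for two of the vm00528 pods above.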


              jkulikau@redhat.com Justin Kulikauskas
              rhn-support-txue Ting Xue
              Derek Ho Derek Ho
              ACM QE Team