Uploaded image for project: 'Red Hat Advanced Cluster Management'
  1. Red Hat Advanced Cluster Management
  2. ACM-16231

[ACM 2.9] CrashLoopBackOff of multicluster-engine pod "cluster-proxy-" on ACM deployment

XMLWordPrintable

    • False
    • None
    • False
    • None

      Description of problem:

          ACM operator deployment fails for MCE failed pods "cluster-proxy-".
      We've seen it FAILS with ACM 2.9 only.
      I verified it PASSES with mce 2.4, and also ACM 2.10, 2.11, 2.12. 

      Version-Release number of selected component (if applicable):

          ACM: 2.9.6-DOWNSTREAM-2024-11-18-06-32-06 
          MCE: 2.4.7-DOWNANDBACK-2024-11-15-14-35-21

      How reproducible:

          100% (for ACM 2.9 only)

      Steps to Reproduce:

          1.deploy acm operator on hub cluster, I used this job : https://auto-jenkins-csb-kniqe.apps.ocp-c1.prod.psi.redhat.com/job/CI/job/job-runner/4218/parameters/
          
          

      Actual results:

      [kni@ocp-edge77 ~]$ export KUBECONFIG=~/clusterconfigs/auth/kubeconfig 
      [kni@ocp-edge77 ~]$ oc get pod -n multicluster-engine --no-headers|grep -v Running|grep -v Completed cluster-proxy-7cd6f886ff-dvc64                         0/1   CrashLoopBackOff   25 (4m41s ago)   107m cluster-proxy-7cd6f886ff-v7c52                         0/1   CrashLoopBackOff   25 (4m19s ago)   107m 
         
      [kni@ocp-edge77 ~]$ oc get pod -n multicluster-engine cluster-proxy-7cd6f886ff-dvc64 -o json |  jq '.status.containerStatuses'
      [
        {
          "containerID": "cri-o://d9d4c586b418f6b5b39332996394a5744f32ac43a7e38bd3efcfea6b9364c58b",
          "image": "registry.redhat.io/multicluster-engine/cluster-proxy-addon-rhel8@sha256:b0f5b9cd7828fdfd03d3cfeb28ce86b3cfd00ce78b85a3f016b0e6903aeb1373",
          "imageID": "registry.redhat.io/multicluster-engine/cluster-proxy-addon-rhel8@sha256:b0f5b9cd7828fdfd03d3cfeb28ce86b3cfd00ce78b85a3f016b0e6903aeb1373",
          "lastState": {
            "terminated": {
              "containerID": "cri-o://d9d4c586b418f6b5b39332996394a5744f32ac43a7e38bd3efcfea6b9364c58b",
              "exitCode": 1,
              "finishedAt": "2024-12-09T12:43:23Z",
              "reason": "Error",
              "startedAt": "2024-12-09T12:43:23Z"
            }
          },
          "name": "proxy-server",
          "ready": false,
          "restartCount": 27,
          "started": false,
          "state": {
            "waiting": {
              "message": "back-off 5m0s restarting failed container=proxy-server pod=cluster-proxy-7cd6f886ff-dvc64_multicluster-engine(2f8e0ffd-0556-4a0b-b0a9-7d932b11fcdd)",
              "reason": "CrashLoopBackOff"
            }
          }
        }
      ]
      [kni@ocp-edge77 ~]$ 
      

      Expected results:

          ACM deployment passes, no CrashLoopBackoff status pods 

      Additional info:

          Looks like the fix provided in https://issues.redhat.com/browse/ACM-15714 for MCE is still not included in ACM. MCE version 2.4.7-DOWNANDBACK-2024-11-21-19-42-59 includes the fix

              zxue@redhat.com ZHAO XUE
              rhn-support-gamado Gal Amado
              Gal Amado Gal Amado
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

                Created:
                Updated:
                Resolved: