  1. Red Hat Advanced Cluster Management
  2. ACM-28200

clusterdeployment status not updating


    • Bug
    • Resolution: Not a Bug
    • Major
    • None
    • MCE 2.10.0, ACM 2.15.0
    • Hive
    • None
    • Important
    • None

      Description of problem:

      After following https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.15/html-single/troubleshooting/index#troubleshooting-imported-clusters-offline-after-certificate-change to fix a TLS issue with a spoke cluster, the managedcluster updated and the klusterlet on the spoke updated as well. However, the clusterdeployment for that cluster on RHACM still shows an "Unreachable" status, and no other status updates have been added or changed on the clusterdeployment since the incident was first detected.

      Version-Release number of selected component (if applicable):

      RHACM 2.15

      How reproducible:

      customer environment

      Steps to Reproduce:

      1. Deploy clusters with RHACM 2.14 (using GitOps to deploy apps).
      2. Update to RHACM 2.15.
      3. Let the API certificates expire.
      4. Replace the certificate and follow https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.15/html-single/troubleshooting/index#troubleshooting-imported-clusters-offline-after-certificate-change

      Actual results:

      The managedcluster appears to reconnect and the klusterlet on the spoke cluster reconnects, but the clusterdeployment keeps the "Unreachable" status.

      Expected results:

      The clusterdeployment status is updated once the steps in https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.15/html-single/troubleshooting/index#troubleshooting-imported-clusters-offline-after-certificate-change have been completed.

      Additional info:

      The last status update seen dates from December, when the certificate expired:

        - lastProbeTime: "2025-12-15T09:45:02Z"
          lastTransitionTime: "2025-11-27T15:49:05Z"
          message: 'Get "https://api.spokecluster.example.com:6443/api?timeout=32s": tls:
            failed to verify certificate: x509: certificate signed by unknown authority'
          reason: ErrorConnectingToCluster
          status: "True"
          type: Unreachable
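
A minimal sketch of how the staleness of this condition could be checked, assuming the `status.conditions` list has been exported from the clusterdeployment as JSON. The helper names `parse_ts` and `stale_unreachable` are hypothetical, and `fixed_at` is a placeholder: the report only says the corrective steps were taken in January, so the first day of January 2026 is assumed here as the earliest possible time.

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Parse a Kubernetes-style UTC timestamp such as 2025-12-15T09:45:02Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def stale_unreachable(conditions, fixed_at):
    """Return the Unreachable condition if it is still True and was last
    probed before fixed_at; otherwise return None."""
    for cond in conditions:
        if cond.get("type") == "Unreachable" and cond.get("status") == "True":
            if parse_ts(cond["lastProbeTime"]) < fixed_at:
                return cond
    return None


# Condition values copied from the excerpt above.
conditions = [{
    "type": "Unreachable",
    "status": "True",
    "reason": "ErrorConnectingToCluster",
    "lastProbeTime": "2025-12-15T09:45:02Z",
    "lastTransitionTime": "2025-11-27T15:49:05Z",
}]

# Assumption: placeholder for "January"; the exact date is not in the report.
fixed_at = datetime(2026, 1, 1, tzinfo=timezone.utc)

stale = stale_unreachable(conditions, fixed_at)
if stale is not None:
    print(f"Unreachable was last probed at {stale['lastProbeTime']}, before the fix")
```

With the values from this report the function returns the condition, which matches the observation that the condition has not been probed again since the certificate was replaced.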
      

      The last updates to the clusterdeployment recorded in its managedFields are:

        - apiVersion: hive.openshift.io/v1
          fieldsType: FieldsV1
          fieldsV1:
            f:status:
              f:conditions: {}
          manager: hive1-unreachable
          operation: Update
          subresource: status
          time: "2025-12-15T09:45:02Z"
        - apiVersion: hive.openshift.io/v1
          fieldsType: FieldsV1
          fieldsV1:
            f:metadata:
              f:finalizers:
                v:"hive.openshift.io/deprovision": {}
              f:labels:
                f:hive.openshift.io/cluster-platform: {}
            f:spec:
              f:clusterMetadata:
                .: {}
                f:adminKubeconfigSecretRef: {}
                f:adminPasswordSecretRef: {}
                f:clusterID: {}
                f:infraID: {}
                f:metadataJSONSecretRef: {}
              f:installed: {}
          manager: hive1-clusterDeployment
          operation: Update
          time: "2025-12-16T12:36:46Z"
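
As an illustration of how these entries can be read, the sketch below picks the most recent write to the status subresource from a `metadata.managedFields` list. It assumes the list has been exported as JSON and trimmed to the keys used here; `last_status_write` is a hypothetical helper name, not part of Hive or Kubernetes.

```python
def last_status_write(managed_fields):
    """Return (manager, time) for the most recent update to the status
    subresource, or None if no entry touches status."""
    status_entries = [e for e in managed_fields if e.get("subresource") == "status"]
    if not status_entries:
        return None
    # Timestamps share one fixed-width UTC format, so string order is time order.
    latest = max(status_entries, key=lambda e: e["time"])
    return latest["manager"], latest["time"]


# Entries copied from the excerpt above, reduced to the relevant keys.
managed_fields = [
    {
        "manager": "hive1-unreachable",
        "operation": "Update",
        "subresource": "status",
        "time": "2025-12-15T09:45:02Z",
    },
    {
        "manager": "hive1-clusterDeployment",
        "operation": "Update",
        "time": "2025-12-16T12:36:46Z",
    },
]

result = last_status_write(managed_fields)
print(result)
```

The second entry is later but has no `subresource: status`, so it is ignored; the last status write is the one from `hive1-unreachable`.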
      

      The corrective steps were taken in January, after both of the updates shown above.

              efried.openshift Eric Fried
              rhn-support-fdewaley Felix Dewaleyne