Bug
Resolution: Done
ACM 2.7.10
Moderate
Description of problem:
When a restore that contains a ManagedCluster adopted through Hive's cluster adoption feature is applied to a new ACM hub, the ManagedCluster re-joins immediately as expected, but very quickly moves into an Unknown availability state. After approximately two hours, the ManagedCluster re-joins on its own without any manual intervention.
Version-Release number of selected component (if applicable):
2.7.10
How reproducible:
Always
Steps to Reproduce:
- Adopt a cluster into an ACM hub that has the cluster-backup operator installed and configured. Use the instructions here to adopt: https://github.com/openshift/hive/blob/master/docs/using-hive.md#cluster-adoption
- Ensure the cluster.open-cluster-management.io/backup=true label is applied to the adopted managedcluster's admin-kubeconfig secret.
- Have the cluster-backup operator take a backup of the ACM hub.
- Redeploy the ACM hub.
- Apply the restore to the newly deployed ACM hub.
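The labeling step above could be scripted roughly as follows. This is a minimal sketch, not taken from the original report: the function name `label_for_backup` and both of its arguments are hypothetical, and it assumes the admin-kubeconfig secret lives in a namespace you already know (typically the adopted cluster's namespace on the hub).

```shell
# Sketch only: label an adopted cluster's admin-kubeconfig secret so that the
# cluster-backup operator includes it in hub backups.
# Arguments (both are illustrative placeholders):
#   $1  namespace that holds the secret (assumed: the cluster's namespace)
#   $2  name of the admin-kubeconfig secret
label_for_backup() {
  ns="$1"
  secret="$2"
  # --overwrite lets the command succeed if the label is already set.
  oc label secret "$secret" -n "$ns" \
    cluster.open-cluster-management.io/backup=true --overwrite
}

# Example call (requires a logged-in oc session; names are placeholders):
#   label_for_backup <cluster-namespace> <admin-kubeconfig-secret-name>
```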
Actual results:
After initial restore, all ManagedClusters (deployed by the restored ACM hub as well as adopted clusters) re-join without issue:
[root@bastion.<redacted> ~]# oc get managedclusters
NAME         HUB ACCEPTED   MANAGED CLUSTER URLS          JOINED   AVAILABLE   AGE
<redacted>   true           https://api.<redacted>:6443   True     True        2m52s
<redacted>   true           https://api.<redacted>:6443   True     True        2m52s
Shortly after, the adopted cluster goes into an unknown state:
[root@bastion.<redacted> ~]# oc get managedclusters
NAME            HUB ACCEPTED   MANAGED CLUSTER URLS          JOINED   AVAILABLE   AGE
local-cluster   true           https://api.<redacted>:6443   True     True        4m9s
<redacted>      true           https://api.<redacted>:6443   True     True        25m
<redacted>      true           https://api.<redacted>:6443   True     Unknown     25m
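To catch the moment a cluster drops out of the available state, the table output can be filtered for rows whose AVAILABLE column is not True. The helper below is a hypothetical sketch, not part of the original report. It assumes the default `oc get managedclusters` column layout shown above, with AGE as the last column and AVAILABLE immediately before it.

```shell
# Hypothetical helper: print the names of ManagedClusters whose AVAILABLE
# column is anything other than "True". Reads the default table output of
# `oc get managedclusters` on stdin.
unavailable_clusters() {
  # Fields are counted from the right because header names contain spaces:
  # AGE is the last field and AVAILABLE is the second to last.
  # NR > 1 skips the header row.
  awk 'NR > 1 && NF >= 3 && $(NF-1) != "True" { print $1 }'
}

# Usage against a live hub (requires a logged-in oc session):
#   oc get managedclusters | unavailable_clusters
```

Run in a loop or under `watch`, this gives a rough timeline of when the adopted cluster leaves and re-enters the available state.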
After approximately two hours, the adopted cluster re-joins the ACM hub without any manual intervention:
[root@bastion.<redacted> ~]# oc get managedclusters
NAME            HUB ACCEPTED   MANAGED CLUSTER URLS          JOINED   AVAILABLE   AGE
local-cluster   true           https://api.<redacted>:6443   True     True        5h26m
<redacted>      true           https://api.<redacted>:6443   True     True        120m
<redacted>      true           https://api.<redacted>:6443   True     True        120m
Expected results:
Adopted clusters successfully re-join the ACM hub post-restore and stay joined.
Additional info:
This behavior appears to be similar to https://issues.redhat.com/browse/ACM-8746.