Type: Bug
Resolution: Unresolved
Priority: Critical
Affects Version: 4.20.0
Impact: Quality / Stability / Reliability
Severity: Important
Release Blocker: Rejected
Description of problem:
Following the steps from the Disaster recovery docs with OADP 1.5, the Nodes of the restored cluster do not become Ready. The Hypershift operator also logs errors related to security groups:
{"level":"error","ts":"2025-07-28T11:34:24Z","msg":"Failed to reconcile NodePool","controller":"nodepool","controllerGroup":"hypershift.openshift.io","controllerKind":"NodePool","NodePool":{"name":"hc1-us-east-1a","namespace":"clusters"},"namespace":"clusters","name":"hc1-us-east-1a","reconcileID":"329e3719-7d3f-4728-b94d-bf4cac23c7bc","error":"failed to create machine template: failed to generate AWSMachineTemplateSpec: the default security group for the HostedCluster has not been created","stacktrace":"github.com/openshift/hypershift/hypershift-operator/controllers/nodepool.(*NodePoolReconciler).Reconcile\n\t/hypershift/hypershift-operator/controllers/nodepool/nodepool_controller.go:236\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:116\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:303\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:263\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:224"}
Version-Release number of selected component (if applicable):
OCP 4.20 (4.20.0-0.nightly-multi-2025-07-23-044404)
OADP plugin quay.io/redhat-user-workloads/ocp-art-tenant/oadp-hypershift-oadp-plugin-main:main (from July 28)
How reproducible:
Always
Steps to Reproduce:
1. Create the necessary resources on the management cluster for OADP: the OADP operator subscription, a DataProtectionApplication, and a BackupStorageLocation (see the DPA sketch after these steps).
2. Create the Backup resource and see it complete successfully (see the Backup sketch after these steps):
phase: Completed
progress:
itemsBackedUp: 366
totalItems: 366
3. Break the hosted cluster (see the sketch after these steps):
- Pause the HC and NP.
- Delete the HCP namespace.
- Delete the hung resources; these are usually the CAPI objects. I use a Go app called termin8.
- Make sure the HCP namespace is not stuck in the Terminating state.
- Delete the HC and NP.
- Remove the finalizers of the HC and NP.
- Wait until the OCP resources are gone:
  - The HCP namespace should not exist.
  - The HC and NP should not exist.
4. Apply the Restore resource and see it complete (see the Restore sketch after these steps):
phase: Completed
progress:
itemsRestored: 367
totalItems: 367
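For reference, a minimal sketch of the step 1 DataProtectionApplication, with the BackupStorageLocation embedded. The bucket, region, prefix, and credential names are assumptions; the plugin image is the one from this report:
# Hypothetical sketch of step 1; bucket/region/credential names are assumed.
cat <<EOF | oc apply -f -
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: dpa-sample
  namespace: openshift-adp
spec:
  configuration:
    velero:
      defaultPlugins:
      - aws
      customPlugins:
      - name: hypershift-oadp-plugin
        image: quay.io/redhat-user-workloads/ocp-art-tenant/oadp-hypershift-oadp-plugin-main:main
  backupLocations:
  - velero:
      provider: aws
      default: true
      objectStorage:
        bucket: my-dr-bucket   # assumed bucket name
        prefix: hc1            # assumed prefix
      config:
        region: us-east-1
      credential:
        name: cloud-credentials
        key: cloud
EOF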
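A minimal Backup of the kind used in step 2 might look like the following sketch; the names and namespaces are assumptions, and the exact set of included namespaces should follow the OADP/HyperShift DR docs:
# Hypothetical sketch of step 2; "hc1" and "clusters"/"clusters-hc1" are assumed names.
cat <<EOF | oc apply -f -
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: hc1-backup
  namespace: openshift-adp
spec:
  includedNamespaces:
  - clusters
  - clusters-hc1
  storageLocation: dpa-sample-1
  ttl: 2h0m0s
EOF
# Wait for phase: Completed as shown in step 2.
oc get backup hc1-backup -n openshift-adp -o jsonpath='{.status.phase}{"\n"}'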
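The step 3 "break the hosted cluster" sequence, expressed as commands (a sketch under the same assumed names; termin8 is just the helper I use for stuck resources, any forced-deletion approach works):
# Hypothetical sketch of step 3; "hc1" and the "clusters" namespace are assumed names.

# Pause the HostedCluster and NodePool.
oc patch hostedcluster hc1 -n clusters --type merge -p '{"spec":{"pausedUntil":"true"}}'
oc patch nodepool hc1-us-east-1a -n clusters --type merge -p '{"spec":{"pausedUntil":"true"}}'

# Delete the HCP namespace, force-delete whatever hangs (usually the CAPI objects),
# then confirm the namespace is fully gone, not stuck Terminating.
oc delete namespace clusters-hc1 --wait=false
oc get namespace clusters-hc1

# Delete the HC and NP, then strip their finalizers so they disappear.
oc delete hostedcluster hc1 -n clusters --wait=false
oc delete nodepool hc1-us-east-1a -n clusters --wait=false
oc patch hostedcluster hc1 -n clusters --type merge -p '{"metadata":{"finalizers":null}}'
oc patch nodepool hc1-us-east-1a -n clusters --type merge -p '{"metadata":{"finalizers":null}}'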
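And the Restore from step 4, again as a hedged sketch; restorePVs and the existing-resource policy are generic choices, not verified against this exact run:
# Hypothetical sketch of step 4; names match the earlier Backup sketch.
cat <<EOF | oc apply -f -
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: hc1-restore
  namespace: openshift-adp
spec:
  backupName: hc1-backup
  restorePVs: true
  existingResourcePolicy: update
EOF
# Wait for phase: Completed as shown in step 4, then watch whether the
# NodePool machines actually come up (the failing part in this bug).
oc get restore hc1-restore -n openshift-adp -o jsonpath='{.status.phase}{"\n"}'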
Actual results:
All pods in the HCP namespace are running and the HostedCluster progress is reported as Completed, but the Hypershift operator keeps throwing the errors above and the Nodes in the hosted cluster are not Ready.
Expected results:
The Nodes in the hosted cluster become Ready.
Additional info:
Link to hypershift dump: https://drive.google.com/file/d/1iPgvy8m8uKVL9FlSdvpFtuh-SAdybc3o/view?usp=sharing
causes: OCPSTRAT-2547 DR integration into the Hypershift CLI (In Progress)
is related to: OCPBUGS-59880 AWS Public Hypershift clusters cannot be restored automatically (Closed)