Type: Bug
Resolution: Done
Priority: Normal
Fix Version/s: 4.19.z
Affects Version/s: 4.19
Component/s: HyperShift
Labels:
- hcp
- ocp-4.19
- rits-work
- triaged

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
Moderate
Regression:
None

Target Backport Versions:
None
Target Version:

4.19.z
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
In Progress
Release Note Type:
Release Note Not Required
Release Note Text:
Fixed an issue where Hosted Cluster control plane pods could get stuck after a Hosting Cluster reboot.

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

This is a clone of issue ~~OCPBUGS-63128~~. The following is the description of the original issue:
—
This is a clone of issue ~~OCPBUGS-61829~~. The following is the description of the original issue:
—
Description of problem:

When a Hosting Cluster (an OpenShift cluster running HyperShift control planes) is restarted, its Hosted Clusters experience issues with their Ingress functionality. After the Hosting Cluster is back online, the Hosted Clusters remain partially degraded until someone intervenes manually.

The issue specifically manifests as stuck or unhealthy pods in the Hosted Clusters' control plane namespaces, preventing proper recovery.

To restore functionality, the failing pods in each affected Hosted Cluster's control plane namespace must be deleted manually so that Kubernetes recreates them. This workaround is consistently required after a Hosting Cluster reboot.
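The manual workaround can be sketched in shell under a few assumptions. The helper name `list_unhealthy_pods` is hypothetical, and it assumes the default column order of `oc get pods --no-headers` (NAME, READY, STATUS, ...). The function only filters text on stdin and does not contact a cluster by itself.

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get pods --no-headers` output on stdin and
# prints the names of pods that are not Running with all containers ready.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
list_unhealthy_pods() {
  awk '
    NF < 3 { next }                   # skip blank or malformed lines
    $3 == "Completed" { next }        # finished pods are not failures
    {
      split($2, ready, "/")           # READY column, e.g. "0/2"
      if ($3 != "Running" || ready[1] != ready[2]) print $1
    }
  '
}

# Possible usage (not executed here), with the namespace from this report:
#   oc -n hc-cluster-ns get pods --no-headers | list_unhealthy_pods
#   oc -n hc-cluster-ns get pods --no-headers | list_unhealthy_pods \
#     | xargs -r oc -n hc-cluster-ns delete pod
```

Running the first usage line alone shows which pods would be deleted, which is worth checking before piping into `delete`. The `-r` flag of `xargs` (skip the command when the input is empty) is a GNU extension and may not exist on every platform.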

Version-Release number of selected component (if applicable):

How reproducible:

Always

Steps to Reproduce:

1. Deploy a Hosting Cluster running HyperShift.
2. Configure the hosting inventory for agent nodes (bare metal).
3. Deploy one or more Hosted Clusters using the Hosting Cluster.
4. Restart the Hosting Cluster nodes (full cluster reboot).
5. Observe the state of the Hosted Clusters after the Hosting Cluster becomes available again.
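For step 5, one way to observe whether the control plane has recovered is to look for deployments whose ready count is below the desired count. The sketch below is an illustration only: `list_unready_deployments` is a hypothetical helper name, and it assumes the default column order of `oc get deployments --no-headers` (NAME, READY, UP-TO-DATE, AVAILABLE, AGE).

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get deployments --no-headers` output on stdin
# and prints the names of deployments with fewer ready replicas than desired.
# Assumes the default columns: NAME READY UP-TO-DATE AVAILABLE AGE.
list_unready_deployments() {
  awk '
    NF < 2 { next }                   # skip blank or malformed lines
    {
      split($2, r, "/")               # READY column, e.g. "0/1"
      if (r[1] + 0 < r[2] + 0) print $1
    }
  '
}

# Possible usage (not executed here), with the namespace from this report:
#   oc -n hc-cluster-ns get deployments --no-headers | list_unready_deployments
```

An empty result would suggest that every deployment in the namespace has reached its desired ready count. Any names printed point at components that still need attention.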

Actual results:

- Hosted Cluster control plane namespaces contain pods in non-Running states (e.g., CrashLoopBackOff, Error, or Pending).
- The Ingress operator for the Hosted Clusters does not function until corrective action is taken.

Expected results:

- Hosted Clusters should fully recover after the Hosting Cluster is restarted, without requiring manual intervention.

Additional info:

# oc -n hc-cluster-ns get all | egrep -i "ingress-operator|openshift-apiserver"
Warning: apps.openshift.io/v1 DeploymentConfig is deprecated in v4.14+, unavailable in v4.10000+
Warning: kubevirt.io/v1 VirtualMachineInstancePresets is now deprecated and will be removed in v2.
pod/ingress-operator-69dd78565c-dss2t                     0/2     Init:0/1                1              19h
pod/openshift-apiserver-77c88f8cc-hddsp                   0/3     Init:CrashLoopBackOff   10 (49s ago)   19h
service/openshift-apiserver                  ClusterIP      172.30.xxx.xxx   <none>          443/TCP             8d
deployment.apps/ingress-operator                     0/1     1            0           8d
deployment.apps/openshift-apiserver                  0/1     1            0           8d
replicaset.apps/ingress-operator-69dd78565c                     1         1         0       8d
replicaset.apps/openshift-apiserver-77c88f8cc                   1         1         0       8d
replicaset.apps/openshift-apiserver-7c75bb4754                  0         0         0       8d

clones

OCPBUGS-63128 Pods in init and crashloop state for controlplane components if hosting cluster node is rebooted.

Closed

is blocked by

OCPBUGS-63128 Pods in init and crashloop state for controlplane components if hosting cluster node is rebooted.

Closed

links to

openshift/hypershift#7145: [release-4.19] OCPBUGS-63129: resolve initContainer permission issue after node reboot

Assignee: Liangquan Li

Reporter: Vedant Durgam

Contributors: Ke Wang

QA Contact: Wen Wang

Need Info From: None

Votes: 0

Watchers: 7

Created: 2025/10/15 10:30 AM

Updated: 2025/11/19 5:08 AM

Resolved: 2025/11/19 5:08 AM
