Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: None
Affects Version/s: 4.18.z
Component/s: Documentation / Hypershift
Labels:
- triaged

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
2
Severity:
Moderate
Regression:
None

Target Backport Versions:
None
Target Version:

4.18.z
Release Blocker:
None
Sprint:
OSDOCS Sprint 276
sprint_count:
1

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
In Progress
Release Note Type:
Bug Fix
Release Note Text:

*Cause*: What actions or circumstances cause this bug to present.
*Consequence*: What happens when the bug presents.
*Fix*: What was done to fix the bug.
*Result*: Bug doesn’t present anymore.

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

The etcd restore procedure in the following documentation appears to be incomplete.

https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html-single/hosted_control_planes/index#hcp-backup-restore-on-premise

The control plane pods do not roll out automatically after all four documented steps are followed. The following additional steps are required to get all the control plane pods for the hosted control plane (HCP) running.

Roll out the hosted cluster manually:

oc annotate hostedcluster -n <hostedcluster-namespace> <hostedcluster-name> hypershift.openshift.io/restart-date=$(date --iso-8601=seconds)

After this rollout, the multus admission controller and network node identity pods still do not start.

Delete the pods for the second and third etcd members, along with their PVCs:

oc delete -n $CONTROL_PLANE_NAMESPACE pvc/data-etcd-1 pod/etcd-1 --wait=false
oc delete -n $CONTROL_PLANE_NAMESPACE pvc/data-etcd-2 pod/etcd-2 --wait=false

Roll out the hosted cluster manually again:

oc annotate hostedcluster -n <hostedcluster-namespace> <hostedcluster-name> hypershift.openshift.io/restart-date=$(date --iso-8601=seconds) --overwrite

After some time, all the control plane pods start running.
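
For reference, the additional steps in this report can be collected into one small script. This is only a sketch, not a supported procedure. The function names, the `DRY_RUN` switch, and the `HC_NAMESPACE` and `HC_NAME` variables are assumptions made for illustration; only the `oc` commands and `CONTROL_PLANE_NAMESPACE` come from the report. The sketch passes `--overwrite` on both rollouts so that the annotation can be updated if it already exists.

```shell
#!/usr/bin/env bash
# Sketch of the additional post-restore steps described in this report.
# Assumed (hypothetical) inputs: HC_NAMESPACE, HC_NAME, CONTROL_PLANE_NAMESPACE.
# DRY_RUN is a hypothetical switch: it defaults to 1, which only prints the
# commands. Set DRY_RUN=0 to actually run them against the cluster.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"        # print the command instead of executing it
  else
    "$@"
  fi
}

restart_hosted_cluster() {
  # Trigger a control plane rollout by updating the restart-date annotation.
  # Note: --iso-8601 is a GNU date option, as used in the report.
  run oc annotate hostedcluster -n "$HC_NAMESPACE" "$HC_NAME" \
    "hypershift.openshift.io/restart-date=$(date --iso-8601=seconds)" --overwrite
}

delete_etcd_member() {
  # $1 is the etcd member index (1 and 2 in this report).
  run oc delete -n "$CONTROL_PLANE_NAMESPACE" \
    "pvc/data-etcd-$1" "pod/etcd-$1" --wait=false
}

post_restore_rollout() {
  restart_hosted_cluster   # first manual rollout
  delete_etcd_member 1     # second etcd member and its PVC
  delete_etcd_member 2     # third etcd member and its PVC
  restart_hosted_cluster   # second manual rollout
}
```

To use the sketch, set the three variables, source the script, and call `post_restore_rollout`. With the default `DRY_RUN=1`, the commands are printed so that they can be reviewed before anything is changed.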

Version-Release number of selected component (if applicable):

    4.18.19

How reproducible:

    100% in customer environment

Steps to Reproduce:

Follow the documentation on a bare-metal HCP cluster.

https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html-single/hosted_control_planes/index#hcp-backup-restore-on-premise

Actual results:

Some steps appear to be missing: the control plane pods do not start correctly when only the documented steps are followed.

Expected results:

The missing steps are added to the documentation.

Additional info:

These steps were tested in the customer environment. The additional steps must be run every time etcd is restored with the manual method.

Assignee: Laura Hinson

Reporter: Alok Singh

Need Info From: None

Contributors: None

QA Contact: Martin Gencur

Doc Contact: None

Votes: 1

Watchers: 6

Created: 2025/08/20 5:24 PM

Updated: 2025/09/12 7:27 PM

Resolved: 2025/08/27 1:04 PM
