OpenShift Bugs / OCPBUGS-43535

Two pods fail on the Hypershift Management cluster after restarting the control plane nodes


    • Bug
    • Resolution: Unresolved
    • Normal
    • 4.17.z
    • HyperShift
    • Quality / Stability / Reliability
    • Moderate
    • Hypershift Sprint 262, Hypershift Sprint 263

      Description of problem:

      After the restart of the management cluster nodes, two pods fail:

      # oc get po -A | grep -v "Completed\|Running"
      NAMESPACE                                          NAME                                                              READY   STATUS                  RESTARTS         AGE
      clusters-hypershift-001                            ingress-operator-5b954659b4-9s5ht                                 0/2     Init:0/1                1                5h35m
      clusters-hypershift-001                            openshift-apiserver-7bdfd9f969-hrwj2                              0/3     Init:CrashLoopBackOff   13 (4m57s ago)   3h13m 

       

      As a workaround, deleting the `openshift-apiserver` pod brings all pods back up. At times, the outage also leads to a pod failure on the hosted cluster:

       

      # oc get po -A | grep -v "Completed\|Running"
      NAMESPACE                                          NAME                                                                  READY   STATUS    RESTARTS         AGE
      openshift-image-registry                           image-pruner-28818720-qnkqr                                           0/1     Error     0                8h
      
      # oc logs image-pruner-28818720-qnkqr -n openshift-image-registry
      Error from server (ServiceUnavailable): the server is currently unable to handle the request (get buildconfigs.build.openshift.io)
      
      # oc get co
      image-registry                             4.17.1    True        False         True       23h     ImagePrunerDegraded: Job has reached the specified backoff limit 

       

      The pruner job succeeds in the next iteration of the cron job:

      # oc get po -A | grep image
      openshift-image-registry                           image-pruner-28820160-9k4hd                                           0/1     Completed   0               4h56m
      
      # oc logs image-pruner-28820160-9k4hd -n openshift-image-registry
      Warning: apps.openshift.io/v1 DeploymentConfig is deprecated in v4.14+, unavailable in v4.10000+
      I1018 00:00:06.440095       7 prune.go:348] Creating image pruner with keepYoungerThan=1h0m0s, keepTagRevisions=3, pruneOverSizeLimit=<nil>, allImages=true
      Summary: deleted 0 objects
      
      # oc get co
      image-registry                             4.17.1    True        False         False      44h 
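Rather than waiting for the next scheduled run, the pruner can be re-triggered manually from its CronJob; a minimal sketch (the job name `manual-prune` is arbitrary):

```shell
# Create a one-off Job from the image-pruner CronJob and follow its log.
oc create job manual-prune --from=cronjob/image-pruner -n openshift-image-registry
oc logs -f job/manual-prune -n openshift-image-registry
```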


      Version-Release number of selected component (if applicable):

      4.17.1

      How reproducible:

      Always

      Steps to Reproduce:

      Restart the control plane nodes of the management cluster    
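One way to perform the restart without SSH access, sketched with `oc debug` (the `node-role.kubernetes.io/master` selector is the usual control plane label; in practice the reboots should be serialized):

```shell
# Reboot each control plane node of the management cluster, one at a time.
for node in $(oc get nodes -l node-role.kubernetes.io/master -o name); do
  oc debug "$node" -- chroot /host systemctl reboot
  # Wait for the node to return to Ready before rebooting the next one:
  # oc wait --for=condition=Ready "$node" --timeout=15m
done
```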

       

      Actual results:

      After the restart of the management cluster nodes, two pods fail:

      # oc get po -A | grep -v "Completed\|Running"
      NAMESPACE                                          NAME                                                              READY   STATUS                  RESTARTS         AGE
      clusters-hypershift-001                            ingress-operator-5b954659b4-9s5ht                                 0/2     Init:0/1                1                5h35m
      clusters-hypershift-001                            openshift-apiserver-7bdfd9f969-hrwj2                              0/3     Init:CrashLoopBackOff   13 (4m57s ago)   3h13m

      Expected results:

      All pods should come up after the restart.

       

      Additional info:

      As a workaround, deleting the `openshift-apiserver` pod brings all pods back up. At times, the outage also leads to a pod failure on the hosted cluster:

      # oc get po -A | grep -v "Completed\|Running"
      NAMESPACE                                          NAME                                                                  READY   STATUS    RESTARTS         AGE
      openshift-image-registry                           image-pruner-28818720-qnkqr                                           0/1     Error     0                8h
      
      # oc logs image-pruner-28818720-qnkqr -n openshift-image-registry
      Error from server (ServiceUnavailable): the server is currently unable to handle the request (get buildconfigs.build.openshift.io)
      
      # oc get co
      image-registry                             4.17.1    True        False         True       23h     ImagePrunerDegraded: Job has reached the specified backoff limit

      The pruner job succeeds in the next iteration of the cron job:

      # oc get po -A | grep image
      openshift-image-registry                           image-pruner-28820160-9k4hd                                           0/1     Completed   0               4h56m
      
      # oc logs image-pruner-28820160-9k4hd -n openshift-image-registry
      Warning: apps.openshift.io/v1 DeploymentConfig is deprecated in v4.14+, unavailable in v4.10000+
      I1018 00:00:06.440095       7 prune.go:348] Creating image pruner with keepYoungerThan=1h0m0s, keepTagRevisions=3, pruneOverSizeLimit=<nil>, allImages=true
      Summary: deleted 0 objects
      
      # oc get co
      image-registry                             4.17.1    True        False         False      44h
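A minimal sketch of the workaround described above, assuming the control plane namespace `clusters-hypershift-001` from the listings and an `app=openshift-apiserver` pod label (the label selector is an assumption; adjust it to match the pod's actual labels):

```shell
# Delete the stuck openshift-apiserver pod; its Deployment recreates it.
oc delete pod -n clusters-hypershift-001 -l app=openshift-apiserver

# Watch the replacement pod (and the ingress-operator pod) come up.
oc get pods -n clusters-hypershift-001 -w
```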

              Assignee: Unassigned
              Reporter: aishwarya_kamat1 (Aishwarya Kamat, Inactive)
              Elsa Passaro
              Votes: 0
              Watchers: 2