Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: None
Affects Version/s: 4.11
Component/s: Networking / router
Labels:
- migrated_from_bz

Activity Type:
Quality / Stability / Reliability
Blocked:
None
Blocked Reason:
None
Story Points:
1
Severity:
Moderate
Regression:
None
Architecture:
Unspecified

Target Backport Versions:
None
Target Version:
None
Release Blocker:
Rejected
Sprint:
Sprint 235
sprint_count:
1

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
None
Release Note Type:
If docs needed, set a value
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem: In 4.11, set the timeouts of the liveness probe and readiness probe of the router deployment in the openshift-ingress namespace to 5s, then downgrade the cluster to 4.10. The timeouts are expected to revert to the default of 1s.
However, after more than 5 hours the downgrade is still stuck at "waiting on ingress".

OpenShift release version:

Cluster Platform:
cluster access info: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/96936/

How reproducible:
Configure the timeouts of the liveness probe and readiness probe, then downgrade the cluster.

Steps to Reproduce (in detail):
1. Configure the timeouts of the liveness probe and readiness probe
% oc -n openshift-ingress patch deploy/router-default --type=strategic --patch='{"spec":{"template":{"spec":{"containers":[{"name":"router","livenessProbe":{"timeoutSeconds":5},"readinessProbe":{"timeoutSeconds":5}}]}}}}'
Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "router" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "router" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "router" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "router" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
deployment.apps/router-default patched
%
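
The patch in step 1 can be parameterized so that the same command sets any timeout value. This is a minimal sketch: the helper name `probe_timeout_patch` is hypothetical, and only the JSON shape is taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc): print the strategic-merge patch
# from step 1 for a given probe timeout in seconds. Defaults to 5,
# the value used in this report.
probe_timeout_patch() {
  timeout="${1:-5}"
  printf '{"spec":{"template":{"spec":{"containers":[{"name":"router","livenessProbe":{"timeoutSeconds":%s},"readinessProbe":{"timeoutSeconds":%s}}]}}}}\n' \
    "$timeout" "$timeout"
}

# Usage (needs cluster access, so it is left commented out):
# oc -n openshift-ingress patch deploy/router-default --type=strategic \
#   --patch="$(probe_timeout_patch 5)"
```

Keeping the JSON on a single line avoids the line-wrapping problems that a long inline `--patch` argument is prone to.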

2. Check the configured timeouts of the liveness probe and readiness probe
% oc -n openshift-ingress get deploy/router-default -o yaml | grep -A8 nessProbe:
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 1936
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
--
readinessProbe:
failureThreshold: 3
httpGet:
path: /healthz/ready
port: 1936
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
%
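
As an alternative to grepping the YAML, the two timeout values can be read directly with a JSONPath query, which makes the check in steps 2 and 5 scriptable. This is a sketch only: the function names `probe_timeout` and `check_probe_timeouts` are hypothetical, and it assumes the container is named "router" as in the patch above.

```shell
#!/bin/sh
# Hypothetical helper: print timeoutSeconds for one probe of the router
# container. $1 is the probe field name: livenessProbe or readinessProbe.
probe_timeout() {
  oc -n openshift-ingress get deploy/router-default \
    -o jsonpath="{.spec.template.spec.containers[?(@.name==\"router\")].$1.timeoutSeconds}"
}

# Hypothetical helper: succeed only if both probes use the expected
# timeout (in seconds); report the first mismatch on stderr.
check_probe_timeouts() {
  expected="$1"
  for probe in livenessProbe readinessProbe; do
    actual="$(probe_timeout "$probe")"
    if [ "$actual" != "$expected" ]; then
      echo "$probe timeoutSeconds is '$actual', expected '$expected'" >&2
      return 1
    fi
  done
}

# Usage (needs cluster access, so it is left commented out):
# check_probe_timeouts 5    # after step 1
# check_probe_timeouts 1    # after the downgrade
```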

3. Downgrade the cluster to 4.10.0-0.nightly-2022-04-24-083512
% oc patch clusterversion/version --patch '{"spec":{"upstream":"https://amd64.ocp.releases.ci.openshift.org/graph"}}' --type=merge
clusterversion.config.openshift.io/version patched
%

% oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release:4.10.0-0.nightly-2022-04-24-083512 --allow-explicit-upgrade=true --force
warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Updating to release image registry.ci.openshift.org/ocp/release:4.10.0-0.nightly-2022-04-24-083512
%

4. Run oc get clusterversion periodically; the downgrade appears to be stuck at "waiting on ingress"
% oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-04-24-135651 True True 3m39s Working towards 4.10.0-0.nightly-2022-04-24-083512: 95 of 771 done (12% complete)
%

% oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-04-24-135651 True True 31m Unable to apply 4.10.0-0.nightly-2022-04-24-083512: an unknown error has occurred: MultipleErrors
%

% oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-04-24-135651 True True 36m Working towards 4.10.0-0.nightly-2022-04-24-083512: 610 of 771 done (79% complete)
%

% oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-04-24-135651 True True 53m Working towards 4.10.0-0.nightly-2022-04-24-083512: 611 of 771 done (79% complete), waiting on ingress
%

% oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-04-24-135651 True True 5h30m Working towards 4.10.0-0.nightly-2022-04-24-083512: 611 of 771 done (79% complete), waiting on ingress
%
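
The manual polling in step 4 can be scripted by reading the Progressing condition of the ClusterVersion and extracting the operator it is waiting on. This is a sketch under assumptions: the function names are hypothetical, the message is assumed to end in "waiting on <operator>" as in the output above, and the follow-up commands assume the usual ingress operator namespace and deployment names.

```shell
#!/bin/sh
# Hypothetical helper: print the message of the Progressing condition,
# which is where the STATUS text shown above is assumed to come from.
progressing_message() {
  oc get clusterversion version \
    -o jsonpath='{.status.conditions[?(@.type=="Progressing")].message}'
}

# Hypothetical helper: print whatever follows "waiting on " in a status
# message, or nothing if the message contains no such suffix.
waiting_on() {
  printf '%s\n' "$1" | sed -n 's/.*waiting on \(.*\)$/\1/p'
}

# Usage (needs cluster access, so it is left commented out):
# op="$(waiting_on "$(progressing_message)")"
# [ -n "$op" ] && oc get clusteroperator "$op" -o yaml
# For ingress, the operator logs may show why it is not progressing:
# oc -n openshift-ingress-operator logs deploy/ingress-operator -c ingress-operator
```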

5. Check the timeouts; they have changed to 1s
% oc -n openshift-ingress get deploy/router-default -o yaml | grep -A8 nessProbe:
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 1936
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
--
readinessProbe:
failureThreshold: 3
httpGet:
path: /healthz/ready
port: 1936
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
%

Actual results:
After more than 5 hours, the downgrade has not completed.

Expected results:
The downgrade completes successfully in about 1 hour.

Impact of the problem:

Additional info:


Assignee: Miciah Masters

Reporter: Shudi Li

Need Info From: None

Contributors: None

QA Contact: Shudi Li

Doc Contact: None

Contributing Groups: Red Hat Employee

Votes: 0

Watchers: 5

Created: 2022/04/25 1:56 PM

Updated: 2025/07/27 5:42 PM

Resolved: 2023/04/13 4:24 PM
