OpenShift Bugs / OCPBUGS-2432

authentication and console cluster operators are not available after migrating loaded worker nodes to larger instance types on AWS


Details

    • Sprint 228
    • Rejected

    Description

      Description of problem:

      After migrating a cluster's worker nodes to larger instance types, multiple cluster operators are degraded, and the authentication and console cluster operators are not available.
      
      oc get co | egrep -v 'True.*False.*False'
      NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      authentication                             4.12.0-0.nightly-2022-10-05-053337   False       False         True       3h46m   OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.sv-aws-412.qe.devcluster.openshift.com/healthz": EOF
      console                                    4.12.0-0.nightly-2022-10-05-053337   False       False         False      3h46m   RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.sv-aws-412.qe.devcluster.openshift.com): Get "https://console-openshift-console.apps.sv-aws-412.qe.devcluster.openshift.com": EOF
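The filter above works because a healthy cluster operator row reads True/False/False in the AVAILABLE/PROGRESSING/DEGRADED columns, so `egrep -v 'True.*False.*False'` hides healthy operators and leaves only the unhealthy ones (plus the header). A minimal self-contained sketch, using hypothetical sample output rather than data from this cluster:

```shell
# Hypothetical, trimmed 'oc get co' output (not taken from the cluster above).
sample='NAME            AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication  False       False         True       3h46m
dns             True        False         False      3h46m'

# Hide healthy rows (AVAILABLE=True, PROGRESSING=False, DEGRADED=False);
# the header survives because it never matches the pattern.
printf '%s\n' "$sample" | egrep -v 'True.*False.*False'
```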
      

      Version-Release number of selected component (if applicable):

      # oc version
      Client Version: 4.12.0-0.nightly-2022-10-05-053337
      Kustomize Version: v4.5.4
      Server Version: 4.12.0-0.nightly-2022-10-05-053337
      Kubernetes Version: v1.25.0+3ef6ef3

      How reproducible:

      Replace loaded worker nodes with larger instance types and perform cluster health check.

      Steps to Reproduce:

      1. Create a cluster with 3 master nodes ('m5.xlarge') and 30 worker nodes ('m5.xlarge') with OVN.
      2. Run the kube-burner cluster-density workload (https://github.com/cloud-bulldozer/e2e-benchmarking).
      3. Note the CPU and memory usage of the master nodes.
      4. Create a new machineset of 15 worker nodes using the 'm5.2xlarge' instance type.
      5. Scale down the existing machineset ('m5.xlarge') to 0, one node at a time.
      6. Verify that all cluster-density pods migrate successfully to the new machineset.
      7. Delete all cluster-density namespaces.
      8. Rerun the kube-burner cluster-density test on the cluster with only the new machineset.
      9. The test fails while trying to get the Route for Prometheus, and a cluster health check reports the authentication, console, and ingress cluster operators as degraded.
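Steps 4-5 above can be sketched with `oc scale` against the machinesets in the `openshift-machine-api` namespace. The machineset names below are hypothetical placeholders, not the names from this cluster, and `OC='echo oc'` keeps the sketch as a dry run that only prints the commands it would execute:

```shell
# Dry-run by default: OC prints the oc commands instead of executing them.
# On a real cluster, run with OC=oc.
OC="${OC:-echo oc}"

# Step 4: scale the new 'm5.2xlarge' machineset (hypothetical name) to 15 workers.
$OC scale machineset worker-m5-2xlarge --replicas=15 -n openshift-machine-api

# Step 5: drain the old 'm5.xlarge' machineset down to 0, one replica at a time.
for n in $(seq 29 -1 0); do
  $OC scale machineset worker-m5-xlarge --replicas="$n" -n openshift-machine-api
done
```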
      
      

      Actual results:

      oc get co | egrep -v 'True.*False.*False'
      NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      authentication                             4.12.0-0.nightly-2022-10-05-053337   False       False         True       3h46m   OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.sv-aws-412.qe.devcluster.openshift.com/healthz": EOF
      console                                    4.12.0-0.nightly-2022-10-05-053337   False       False         False      3h46m   RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.sv-aws-412.qe.devcluster.openshift.com): Get "https://console-openshift-console.apps.sv-aws-412.qe.devcluster.openshift.com": EOF
      
      

      Expected results:

      No cluster operators should be degraded.
      The kube-burner cluster-density workload should run successfully on the new machineset.

      Additional info:

       

          People

            gspence@redhat.com Grant Spence
            svetsa@redhat.com Sharada Vetsa
            Ke Wang
            Votes: 0
            Watchers: 3
