Loading...

XML

Word

Printable

Type: Bug
Resolution: Done-Errata
Priority: Undefined
Fix Version/s: 4.15.0
Affects Version/s: 4.14.0
Component/s: Networking / cluster-network-operator
Labels:
None

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

Hide

None

Show
None
Story Points:
None
Severity:
None
Regression:
No

Target Backport Versions:
None
Target Version:

4.15.0
Release Blocker:
Rejected
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
In Progress
Release Note Type:
Release Note Not Required
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

We are seeing flakes on CNO pod restarts flake in hypershift CI on the hypershift control plane

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_hypershift/2967/pull-ci-openshift-hypershift-main-e2e-kubevirt-aws-ovn/1699008879737704448/artifacts/e2e-kubevirt-aws-ovn/run-e2e-local/artifacts/TestCreateCluster/namespaces/e2e-clusters-pvhd5-example-s6skm/core/pods/logs/cluster-network-operator-78fd774c97-7w7dg-cluster-network-operator-previous.log

W0905 11:42:53.359515       1 builder.go:106] graceful termination failed, controllers failed with error: failed to get infrastructure name: infrastructureName not set in infrastructure 'cluster'

The current backoff is set to retry.DefaultBackoff which is appropriate for 409 conflicts and only retries for < 1s

var DefaultBackoff = wait.Backoff{
	Steps:    4,
	Duration: 10 * time.Millisecond,
	Factor:   5.0,
	Jitter:   0.1,
}

Elsewhere in the codebase, retry.DefaultBackoff is used with retry.RetryOnConflict() where it is appropriate, but we need to retry for much longer here and much less frequently.

blocks

OCPBUGS-22286 CNO pod restart in hypershift CI

Closed

is cloned by

OCPBUGS-22286 CNO pod restart in hypershift CI

Closed

links to

openshift/cluster-network-operator#1986: OCPBUGS-18569: hypershift: adjust backoff on infrastructure name retry

RHEA-2023:7198 rpm

Assignee:: Seth Jennings

Reporter:: Seth Jennings

QA Contact:: Jean Chen

Need Info From:: None

Votes:: 0 Vote for this issue

Watchers:: 8 Start watching this issue

Created:: 2023/09/05 10:48 PM

Updated:: 2025/07/25 5:35 PM

Resolved:: 2024/02/27 9:00 PM

Details

Description

Attachments

Issue Links

Easy Agile Planning Poker

Activity

People

Dates