Bug
Resolution: Done
Critical
4.20.0
Quality / Stability / Reliability
Proposed
TRT has found a regression in in-cluster disruption that shows a fairly well-defined pattern, one we do not see in 4.19.
During node upgrades, while one of the masters is upgrading (seemingly the second master to update), we get a brief disruption to almost all API endpoints, but interestingly also to a number of -to-pod internal networking backends, all targeting the master that is upgrading. We do not see any -to-host or -to-service backends failing at this time, suggesting that only the pod network is affected.
For reference, the in-cluster networking monitoring shuts itself off when it sees a host's IPs disappear from the EndpointSlice, indicating that host is being rebooted/upgraded (iirc).
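A minimal sketch of that shutdown behavior, for readers unfamiliar with it: a poller lists the EndpointSlices backing a target service and stops sampling any backend whose IP has disappeared, treating that as "the host is rebooting/upgrading". The namespace, service label value, and pollBackend helper below are illustrative assumptions, not the actual monitor implementation.

```go
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	ns := "e2e-disruption"                     // assumed namespace
	active := map[string]context.CancelFunc{}  // endpoint IP -> cancel func for its poller

	for {
		slices, err := client.DiscoveryV1().EndpointSlices(ns).List(context.TODO(), metav1.ListOptions{
			LabelSelector: "kubernetes.io/service-name=disruption-target", // assumed service name
		})
		if err != nil {
			fmt.Println("list failed:", err)
			time.Sleep(5 * time.Second)
			continue
		}

		// Collect the IPs currently present across the EndpointSlices.
		current := map[string]bool{}
		for _, s := range slices.Items {
			for _, ep := range s.Endpoints {
				for _, ip := range ep.Addresses {
					current[ip] = true
				}
			}
		}

		// Start pollers for newly appearing IPs; stop pollers whose IP vanished
		// (the "host is being rebooted/upgraded" signal described above).
		for ip := range current {
			if _, ok := active[ip]; !ok {
				ctx, cancel := context.WithCancel(context.Background())
				active[ip] = cancel
				go pollBackend(ctx, ip)
			}
		}
		for ip, cancel := range active {
			if !current[ip] {
				fmt.Printf("%s left the EndpointSlice; stopping its poller\n", ip)
				cancel()
				delete(active, ip)
			}
		}
		time.Sleep(2 * time.Second)
	}
}

// pollBackend stands in for the real per-backend disruption sampler.
func pollBackend(ctx context.Context, ip string) {
	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			fmt.Println("sampling", ip) // real code would issue a request and record availability
		}
	}
}
```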
Examples:
For more runs, see the job runs panel on this dashboard.
Most hits come from periodic-ci-openshift-release-master-nightly-4.20-upgrade-from-stable-4.19-e2e-metal-ipi-upgrade-ovn-ipv6, but one run from periodic-ci-openshift-release-master-nightly-4.20-upgrade-from-stable-4.19-e2e-metal-ipi-ovn-upgrade shows up with the same pattern. Note that no 4.19 jobs appear despite being included in the report.
The problem potentially started around June 5th-6th, but this job does not run often, so it may have started earlier.