Type: Task
Resolution: Done
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Component/s: Storage Ecosystem, Storage Platform
Labels:
- non-gating

Activity Type:
Quality / Stability / Reliability
Story Points:
0.42
Blocked:
False
Blocked Reason:

None
Ready:
False
Epic Link:
cnv-storage-technical-debt-4.21
Component Fix Version(s):
None
Market:

Sprint:
CNV Storage Sprint 281

Regression:
None

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Background
During QE test runs, we observed intermittent failures in tests using `pod.execute` / `kubectl exec` due to transient kubelet connectivity issues:

Handshake status 500 Internal Server Error
Error dialing backend: use of closed network connection (kubelet 10250)

These failures are not related to product bugs but are caused by cluster-side instability.

Objective
Investigate the feasibility of adding a retry/wrapper mechanism in the test framework for such transient failures to improve test reliability.
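
As a starting point for the investigation, the retry/wrapper idea could be sketched roughly as follows. This is a minimal illustration, not the implementation from the referenced pull request: the names `TRANSIENT_MARKERS`, `is_transient`, and `retry_on_transient` are hypothetical, and the sketch assumes the transient errors surface as exceptions whose message contains one of the strings quoted in the Background section.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

# Substrings copied from the errors quoted in the Background section.
# Assumption: the failing call raises an exception whose message contains them.
TRANSIENT_MARKERS: Sequence[str] = (
    "Handshake status 500 Internal Server Error",
    "use of closed network connection",
)


def is_transient(exc: Exception, markers: Sequence[str] = TRANSIENT_MARKERS) -> bool:
    """Return True if the exception message matches a known transient error."""
    message = str(exc)
    return any(marker in message for marker in markers)


def retry_on_transient(
    func: Callable[[], T],
    *,
    attempts: int = 3,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    markers: Sequence[str] = TRANSIENT_MARKERS,
) -> T:
    """Call func, retrying only when it fails with a known transient error.

    Any other exception, or a transient one on the last attempt, is re-raised
    unchanged so that real product failures are not hidden.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            if attempt == attempts or not is_transient(exc, markers):
                raise
            sleep(delay)  # fixed pause before the next attempt
    raise AssertionError("unreachable")  # the loop always returns or raises
```

A test could then wrap a single call, for example `retry_on_transient(lambda: pod.execute(command=["ls"]))`, assuming `pod.execute` accepts a `command` list. Wrapping individual call sites like this corresponds to the selective option; the global option would place the same logic inside the framework's own exec helper.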

References

- Example implementation: https://github.com/RedHatQE/openshift-python-wrapper/pull/2581
- Failed test run: https://reportportal-cnv.apps.dno.ocp-hub.prod.psi.redhat.com/ui/#cnv/launches/all/146659/?item0Params=filter.in.status%3DFAILED%26page.page%3D1

Acceptance Criteria

- A design or proposal for handling transient `pod.execute` failures
- Recommendation on whether to implement retries globally or selectively

is caused by

CNV-72482 [POST-UPGRADE][4.20.1][STAGE][TIER-2][test-pytest-cnv-4.20-post-upgrade-marker-storage][cnv-4.20.1]

Closed

Assignee: Ahmad Hafi

Reporter: Ahmad Hafi

QA Contact: Natalie Gavrielov

Created: 2025/11/24 1:42 PM

Updated: 2025/12/15 9:19 AM

Resolved: 2025/12/15 9:18 AM
