
Type: Bug
Resolution: Obsolete
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.12.z
Component/s: TALM Operator
Labels:
- perfscale-telco-5g
- telco-5g

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
None
Regression:
No
Latest Status Summary:
9/19: telco priority pending

Target Backport Versions:
None
Target Version:

4.15.0
Release Blocker:
None
Sprint:
CNF RAN Sprint 242, CNF RAN Sprint 243, CNF RAN Sprint 244
sprint_count:
3

RH Private Keywords:

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

While upgrading 3557 clusters from 4.12.27 to 4.12.29, with all clusters precached before the upgrade, one cluster failed early during precaching with the following in its logs:

# oc --kubeconfig /root/hv-vm/kc/vm00311/kubeconfig logs -n openshift-talo-pre-cache pre-cache-47nlx
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to kubernetes.default.svc port 443: Connection refused
highThresholdPercent:  diskSize:125293548 used:30470528
ERROR: not enough space for precaching

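It appears that the precaching script failed to complete a curl request because of an unexpected, intermittent API outage. With the request failing, highThresholdPercent was apparently left empty, so the space check reported "not enough space" even though only about 24% of the disk (30470528 of 125293548) was in use.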

https://github.com/openshift-kni/cluster-group-upgrades-operator/blob/release-4.13/pre-cache/check_space#L8

Shouldn't the script retry the request, and/or shouldn't the job pod be retried?
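
As an illustration of the kind of retry being asked about, here is a minimal sketch of a generic retry wrapper in bash. The retry function name, its arguments, and the attempt and delay values are hypothetical and are not taken from check_space; the actual fix may look different.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from check_space): run a command until it
# succeeds, up to a maximum number of attempts, sleeping between tries.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if (( attempt >= max_attempts )); then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed, retrying in ${delay}s: $*" >&2
    attempt=$(( attempt + 1 ))
    sleep "$delay"
  done
}
```

Wrapping the API request as, for example, retry 5 10 curl -sSf <url> would let the script ride out a short API outage instead of continuing with an empty threshold value. curl also has its own --retry and --retry-connrefused options, which may be simpler if only the curl call needs protecting.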

Version-Release number of selected component (if applicable):

ACM - 2.9.0-DOWNSTREAM-2023-08-28-21-42-15
Hub OCP 4.13.10
Deployed SNOs - 4.12.27
TALM - 4.13.0 (Threaded with 5 threads)

How reproducible:

Steps to Reproduce:

1.
2.
3.

Actual results:

Expected results:

Additional info:


vm03273.precache.fail.log (75 kB, 2023/09/14 7:16 PM)
vm03273-must-gather-precache-fail.tar.gz (33.81 MB, 2023/09/14 8:14 PM)

blocks

OCPBUGS-19520 Cluster failed to precache and did not retry because of "Failed to connect to kubernetes.default.svc port 443: Connection refused"

Closed

is cloned by

OCPBUGS-19520 Cluster failed to precache and did not retry because of "Failed to connect to kubernetes.default.svc port 443: Connection refused"

Closed

links to

openshift-kni/cluster-group-upgrades-operator#669: OCPBUGS-18905: Add retry for fetching GCHighThreshold in precaching check_space

RHEA-2023:112754 OpenShift Container Platform 4.14.0 CNF vRAN extras update

mentioned on

Merge request - Updated US source to: c894bed Add ovn-ic volumeMount (#676)

Assignee: Saeid Askari

Reporter: Alex Krzos

QA Contact: Dan Radez (Inactive)

Need Info From: None

Votes: 0

Watchers: 8

Created: 2023/09/12 8:36 PM

Updated: 2025/07/25 5:28 PM

Resolved: 2024/10/29 3:34 PM
