Type: Bug
Resolution: Done
Priority: Critical
Fix Version/s: 4.14
Affects Version/s: 4.10
Component/s: Node / CRI-O
Labels:

Activity Type:
Incidents & Support
Blocked:
None
Blocked Reason:
None
Story Points:
None
Severity:
Important
Regression:
None
Architecture:

Unspecified
Latest Status Summary:

9/21: just seeking to close on this being in 4.14 then move on
8/14: pending input/response from field re: 4.12 (DM/PP); KNIECO-7503


Target Backport Versions:
None
Target Version:
None
Release Blocker:
Rejected
Sprint:
OCPNODE Sprint 233 (Blue), OCPNODE Sprint 234 (Blue), OCPNODE Sprint 235 (Blue), OCPNODE Sprint 236 (Blue), OCPNODE Sprint 237 (Green)
sprint_count:
5

Customer Impact:

Customer Escalated
Internal Whiteboard:
RH Private Keywords:

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
PX Priority Data:
PX Impact Score:

Release Note Status:
None
Release Note Type:
If docs needed, set a value
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:
The OLM registry-server container fails to reach the "Ready" state.

oc get pod -n openshift-marketplace
NAME                                    READY   STATUS    RESTARTS      AGE
marketplace-operator-7749b7db8d-br5p8   1/1     Running   4 (27h ago)   28h
rh-du-operators-ml5pl                   0/1     Running   0             27h

Conditions on the pod show:
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-05-05T15:56:21Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-05-05T15:56:21Z"
    message: 'containers with unready status: [registry-server]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-05-05T15:56:21Z"
    message: 'containers with unready status: [registry-server]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-05-05T15:56:21Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: cri-o://bff4c347d3fc6a20064926fdfd1ea3c76e039c56205c6b282d3b6c8e2f13233c
    image: e24-h01-000-r640.rdu2.scalelab.redhat.com:5000/olm-mirror/redhat-operator-index:v4.9
    imageID: e24-h01-000-r640.rdu2.scalelab.redhat.com:5000/olm-mirror/redhat-operator-index@sha256:86efa7af19dfaa7afe0f3469250ad6101c4eed44c7366e3628e7e865834dc43e
    lastState: {}
    name: registry-server
    ready: false
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2022-05-05T15:56:36Z"
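
The signature in this status (container running and started, yet ready: false with no restarts) can be checked mechanically. A minimal sketch in Python, assuming the pod has been fetched as JSON (for example with oc get pod -o json) and parsed into a dict. The function name unready_running_containers and the trimmed sample dict are illustrative, not part of any OpenShift tooling:

```python
def unready_running_containers(pod):
    """Return names of containers that are running and started but not ready."""
    stuck = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        running = "running" in cs.get("state", {})
        if running and cs.get("started") and not cs.get("ready"):
            stuck.append(cs["name"])
    return stuck


# Trimmed copy of the containerStatuses entry from this report.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "registry-server",
                "ready": False,
                "restartCount": 0,
                "started": True,
                "state": {"running": {"startedAt": "2022-05-05T15:56:36Z"}},
            }
        ]
    }
}

print(unready_running_containers(sample_pod))
```

A pod in this state is not restarted by the kubelet's crash handling, because the container process itself never exits; only the probes fail.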

Journal logs on the node show that the readiness and liveness probe PIDs cannot be written to cgroup.procs for the container, because the container's cgroup scope no longer exists:
May 06 19:06:41 sno00251 bash[26314]: E0506 19:06:41.901014 26314 remote_runtime.go:704] "ExecSync cmd from runtime service failed" err="rpc error: code = Unknown desc = command error: time=\"2022-05-06T19:06:41Z\" level=error msg=\"exec failed: unable to start container process: error adding pid 3616061 to cgroups: failed to write 3616061: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pode1d577c2_09dc_4859_aada_0a157e0b07f0.slice/crio-bff4c347d3fc6a20064926fdfd1ea3c76e039c56205c6b282d3b6c8e2f13233c.scope/cgroup.procs: no such file or directory\"\n, stdout: , stderr: , exit code -1" containerID="bff4c347d3fc6a20064926fdfd1ea3c76e039c56205c6b282d3b6c8e2f13233c" cmd=[grpc_health_probe -addr=:50051]

May 06 19:06:41 sno00251 bash[26314]: E0506 19:06:41.901135 26314 prober.go:118] "Probe errored" err="rpc error: code = Unknown desc = command error: time=\"2022-05-06T19:06:41Z\" level=error msg=\"exec failed: unable to start container process: error adding pid 3616061 to cgroups: failed to write 3616061: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pode1d577c2_09dc_4859_aada_0a157e0b07f0.slice/crio-bff4c347d3fc6a20064926fdfd1ea3c76e039c56205c6b282d3b6c8e2f13233c.scope/cgroup.procs: no such file or directory\"\n, stdout: , stderr: , exit code -1" probeType="Liveness" pod="openshift-marketplace/rh-du-operators-ml5pl" podUID=e1d577c2-09dc-4859-aada-0a157e0b07f0 containerName="registry-server"

May 06 19:06:41 sno00251 bash[26314]: E0506 19:06:41.907801 26314 remote_runtime.go:704] "ExecSync cmd from runtime service failed" err="rpc error: code = Unknown desc = command error: time=\"2022-05-06T19:06:41Z\" level=error msg=\"exec failed: unable to start container process: error adding pid 3616065 to cgroups: failed to write 3616065: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pode1d577c2_09dc_4859_aada_0a157e0b07f0.slice/crio-bff4c347d3fc6a20064926fdfd1ea3c76e039c56205c6b282d3b6c8e2f13233c.scope/cgroup.procs: no such file or directory\"\n, stdout: , stderr: , exit code -1" containerID="bff4c347d3fc6a20064926fdfd1ea3c76e039c56205c6b282d3b6c8e2f13233c" cmd=[grpc_health_probe -addr=:50051]

May 06 19:06:41 sno00251 bash[26314]: E0506 19:06:41.907938 26314 prober.go:118] "Probe errored" err="rpc error: code = Unknown desc = command error: time=\"2022-05-06T19:06:41Z\" level=error msg=\"exec failed: unable to start container process: error adding pid 3616065 to cgroups: failed to write 3616065: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pode1d577c2_09dc_4859_aada_0a157e0b07f0.slice/crio-bff4c347d3fc6a20064926fdfd1ea3c76e039c56205c6b282d3b6c8e2f13233c.scope/cgroup.procs: no such file or directory\"\n, stdout: , stderr: , exit code -1" probeType="Readiness" pod="openshift-marketplace/rh-du-operators-ml5pl" podUID=e1d577c2-09dc-4859-aada-0a157e0b07f0 containerName="registry-server"
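
When scanning many nodes for this signature, the relevant fields can be pulled out of the kubelet journal lines with a regular expression. The following is a sketch only: the patterns are written against the exact message wording quoted in this report, and the helper name parse_probe_failure is hypothetical. Other runc or kubelet versions may word the error differently.

```python
import re

# Matches the runc error text as it appears in the journal lines above.
CGROUP_ERR = re.compile(
    r"error adding pid (?P<pid>\d+) to cgroups: "
    r"failed to write \d+: open (?P<path>\S+?/cgroup\.procs): "
    r"no such file or directory"
)
# Only present on the prober.go lines, not on the remote_runtime.go lines.
PROBE_TYPE = re.compile(r'probeType="(?P<probe>\w+)"')


def parse_probe_failure(line):
    """Return pid, cgroup.procs path and probe type, or None if no match."""
    m = CGROUP_ERR.search(line)
    if m is None:
        return None
    probe = PROBE_TYPE.search(line)
    return {
        "pid": int(m.group("pid")),
        "cgroup_procs": m.group("path"),
        "probe_type": probe.group("probe") if probe else None,
    }


# Shortened form of the "Probe errored" line from this report.
sample = (
    'E0506 19:06:41.901135 26314 prober.go:118] "Probe errored" '
    'err="error adding pid 3616061 to cgroups: failed to write 3616061: '
    'open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/'
    'kubepods-burstable-pode1d577c2_09dc_4859_aada_0a157e0b07f0.slice/'
    'crio-bff4c347d3fc6a20064926fdfd1ea3c76e039c56205c6b282d3b6c8e2f13233c.scope/'
    'cgroup.procs: no such file or directory" probeType="Liveness"'
)

result = parse_probe_failure(sample)
print(result["pid"], result["probe_type"])
```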

The crio-bff4c3... directory does not exist:
[root@sno00251 core]# ls -l /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pode1d577c2_09dc_4859_aada_0a157e0b07f0.slice/crio-bff4c347d3fc6a20064926fdfd1ea3c76e039c56205c6b282d3b6c8e2f13233c.scope/
ls: cannot access '/sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pode1d577c2_09dc_4859_aada_0a157e0b07f0.slice/crio-bff4c347d3fc6a20064926fdfd1ea3c76e039c56205c6b282d3b6c8e2f13233c.scope/': No such file or directory

[root@sno00251 core]# ls -l /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pode1d577c2_09dc_4859_aada_0a157e0b07f0.slice/
total 0
-rw-r--r--. 1 root root 0 May 6 19:07 cgroup.clone_children
-rw-r--r--. 1 root root 0 May 6 19:07 cgroup.procs
drwxr-xr-x. 2 root root 0 May 5 15:56 crio-conmon-bff4c347d3fc6a20064926fdfd1ea3c76e039c56205c6b282d3b6c8e2f13233c.scope
-rw-r--r--. 1 root root 0 May 6 19:07 notify_on_release
-rw-r--r--. 1 root root 0 May 6 19:07 tasks
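
The listing shows the crio-conmon-... scope but no crio-... scope for the same container ID. The expected scope path can be derived from the pod UID and container ID, which makes the check easy to repeat on other nodes. A minimal sketch, assuming the layout seen in the paths above (systemd cgroup driver, cgroup v1 systemd hierarchy, burstable QoS). Other QoS classes or cgroup v2 hosts may be laid out differently, and the function names are illustrative:

```python
import os
import posixpath

# Hierarchy root taken from the paths in this report.
CGROUP_ROOT = "/sys/fs/cgroup/systemd"


def crio_scope_path(pod_uid, container_id, qos="burstable"):
    """Build the container scope path as it appears in the error messages."""
    # Dashes in the pod UID appear as underscores in the slice name.
    uid = pod_uid.replace("-", "_")
    return posixpath.join(
        CGROUP_ROOT,
        "kubepods.slice",
        f"kubepods-{qos}.slice",
        f"kubepods-{qos}-pod{uid}.slice",
        f"crio-{container_id}.scope",
    )


def scope_missing(pod_uid, container_id, qos="burstable"):
    """True when the container scope directory is absent on this host."""
    return not os.path.isdir(crio_scope_path(pod_uid, container_id, qos))


path = crio_scope_path(
    "e1d577c2-09dc-4859-aada-0a157e0b07f0",
    "bff4c347d3fc6a20064926fdfd1ea3c76e039c56205c6b282d3b6c8e2f13233c",
)
print(path)
```

Run on the affected node, scope_missing would be expected to return True for the registry-server container, matching the ls output.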

Version-Release number of selected component (if applicable): 4.10.13

How reproducible: 6 out of ~2200 clusters (roughly 0.3%) deployed in scale testing have this signature.

Steps to Reproduce:
The OLM registry-server pod is created by automated (rapid) manipulation of the CatalogSources:
1. Disable default sources in OperatorHub CR
2. Create new CatalogSource pointing to disconnected registry
3. Create subscriptions making use of the new CatalogSource
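
The three steps above can be sketched as manifests generated in Python. This is an illustration, not the automation used in the scale test: the CatalogSource name rh-du-operators is inferred from the pod name in this report, the image is the one shown in the pod status, and the Subscription name, namespace, package, and channel are hypothetical placeholders.

```python
import json


def operatorhub_disable_defaults():
    """Step 1: disable the default sources in the cluster OperatorHub CR."""
    return {
        "apiVersion": "config.openshift.io/v1",
        "kind": "OperatorHub",
        "metadata": {"name": "cluster"},
        "spec": {"disableAllDefaultSources": True},
    }


def catalog_source(name, image, namespace="openshift-marketplace"):
    """Step 2: a gRPC CatalogSource backed by a mirrored index image."""
    return {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "CatalogSource",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"sourceType": "grpc", "image": image},
    }


def subscription(name, namespace, package, channel, source,
                 source_namespace="openshift-marketplace"):
    """Step 3: a Subscription that resolves against the new CatalogSource."""
    return {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "Subscription",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "name": package,
            "channel": channel,
            "source": source,
            "sourceNamespace": source_namespace,
        },
    }


manifests = [
    operatorhub_disable_defaults(),
    catalog_source(
        "rh-du-operators",  # assumed from the pod name rh-du-operators-ml5pl
        "e24-h01-000-r640.rdu2.scalelab.redhat.com:5000/olm-mirror/redhat-operator-index:v4.9",
    ),
    # Hypothetical operator; the report does not list the subscriptions used.
    subscription("example-operator", "example-namespace",
                 "example-operator", "stable", "rh-du-operators"),
]

for manifest in manifests:
    print(json.dumps(manifest, indent=2))
```

Applying such manifests in quick succession is what "rapid" refers to above; the report does not state the exact timing between the steps.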

Actual results: CatalogSource remains in "TRANSIENT_FAILURE" state.

Expected results: CatalogSource becomes ready.

Additional info:


Attachments:
must-gather-vm00156.tar.gz                    44.69 MB   2023/04/20 5:50 PM
must-gather-vm00210.tar.gz                    46.19 MB   2023/04/20 5:50 PM
must-gather-vm00240.tar.gz                    42.33 MB   2023/04/20 5:50 PM
must-gather-vm00382.tar.gz                    45.77 MB   2023/04/20 5:50 PM
sosreport-vm00156-2023-04-20-rzddneb.tar.xz   42.43 MB   2023/04/20 5:55 PM
sosreport-vm00210-2023-04-20-vftqvqz.tar.xz   43.42 MB   2023/04/20 5:55 PM
sosreport-vm00240-2023-04-20-nznmqtl.tar.xz   42.30 MB   2023/04/20 5:55 PM
sosreport-vm00382-2023-04-20-aanswkr.tar.xz   43.54 MB   2023/04/20 5:55 PM

links to

KCS : Network operator stuck in progressing status due to "DaemonSet "/openshift-ovn-kubernetes/ovnkube-node" is not available (awaiting 1 nodes)"
