Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: None
Affects Version/s: 4.12
Component/s: Node / Kubelet
Labels:
- triaged

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
Important
Regression:
No

Target Backport Versions:
None
Target Version:
None
Release Blocker:
Rejected
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem

These four tests are frequently failing:

[sig-node] Kubelet when scheduling a busybox command in a pod should print the output to logs
[sig-node] Container Runtime blackbox test on terminated container should report termination message from log output if TerminationMessagePolicy FallbackToLogsOnError is set
[sig-node] Pods should support retrieving logs from the container over websockets
[sig-node] Kubelet when scheduling a read only busybox container should not write to root filesystem

The test failures all appear to be caused by an unexpected "skipping device ... for systemd" warning that shows up in the pod logs ahead of the expected output:

{  fail [k8s.io/kubernetes@v1.25.0/test/e2e/common/node/kubelet.go:79]: Timed out after 60.003s.
Expected
    <string>: time=\"2023-04-24T17:03:56Z\" level=warning msg=\"skipping device /dev/char/10:200 for systemd: stat /sys/dev/char/10:200: no such file or directory\"\nHello World\n
to equal
    <string>: Hello World\n
Ginkgo exit error 1: exit with code 1}

{  fail [k8s.io/kubernetes@v1.25.0/test/e2e/common/node/runtime.go:167]: Expected     <string>: time=\"2023-04-24T17:07:35Z\" level=warning msg=\"skipping device /dev/char/10:200 for systemd: stat /sys/dev/char/10:200: no such file or directory\"\nDONE to equal     <string>: DONE Ginkgo exit error 1: exit with code 1}

{  fail [github.com/onsi/ginkgo/v2@v2.1.5-0.20220909190140-b488ab12695a/internal/suite.go:612]: Apr 24 17:12:57.845: Unexpected websocket logs:
time="2023-04-24T17:12:55Z" level=warning msg="skipping device /dev/char/10:200 for systemd: stat /sys/dev/char/10:200: no such file or directory"
container is alive

Ginkgo exit error 1: exit with code 1}

{  fail [k8s.io/kubernetes@v1.25.0/test/e2e/common/node/kubelet.go:214]: Timed out after 60.002s.
Expected
    <string>: "time="..."
to equal       |
    <string>: "/bin/s..."
Ginkgo exit error 1: exit with code 1}

The example failures above come from https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/910/pull-ci-openshift-cluster-ingress-operator-release-4.12-e2e-aws-ovn-single-node/1650534308475047936.
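To make the failure mode concrete, here is a minimal Python sketch of why an exact-equality check on the log text fails once the warning line is present. The log strings are copied from the first failure above (with the escaped quotes unescaped); the helper names matches_exactly and drop_warning_lines are hypothetical and exist only for illustration. This is not the e2e test code and not the fix for this bug.

```python
# Warning line as reported in the first failure above.
WARNING = (
    'time="2023-04-24T17:03:56Z" level=warning '
    'msg="skipping device /dev/char/10:200 for systemd: '
    'stat /sys/dev/char/10:200: no such file or directory"'
)

actual_logs = WARNING + "\nHello World\n"  # what the pod logs contained
expected_logs = "Hello World\n"            # what the test expected


def matches_exactly(actual: str, expected: str) -> bool:
    """Mimic an assertion that compares the whole log text for equality."""
    return actual == expected


def drop_warning_lines(logs: str) -> str:
    """Hypothetical helper: drop every line that carries level=warning."""
    kept = [
        line
        for line in logs.splitlines(keepends=True)
        if "level=warning" not in line
    ]
    return "".join(kept)


print(matches_exactly(actual_logs, expected_logs))                      # False
print(matches_exactly(drop_warning_lines(actual_logs), expected_logs))  # True
```

The second comparison only shows that the extra line is the sole difference between the actual and expected logs in this example; it says nothing about where the warning originates.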

The test failures all appear to be limited to 4.12.

A search.ci search over all jobs for the past 2 days with the pattern skipping device /dev/char/\d+:\d+ for systemd returns only 4.12 jobs.
A search.ci search over 4.12 jobs in the past 7 days sometimes times out and sometimes shows numerous failures going back several days.
A search.ci search over all jobs for the past 7 days times out.
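As a rough illustration, the search pattern quoted above can be checked locally against the log lines from the example failures. The sketch below assumes the pattern behaves as an ordinary regular expression and uses Python's re module; the function name has_warning is hypothetical, and search.ci's own matching engine may differ in details.

```python
import re

# Pattern as quoted in this report, treated here as a Python regular expression.
PATTERN = re.compile(r"skipping device /dev/char/\d+:\d+ for systemd")

# Log line copied from the websocket failure above.
sample = (
    'time="2023-04-24T17:12:55Z" level=warning '
    'msg="skipping device /dev/char/10:200 for systemd: '
    'stat /sys/dev/char/10:200: no such file or directory"'
)


def has_warning(text: str) -> bool:
    """Return True if the text contains the systemd device warning."""
    return PATTERN.search(text) is not None


print(has_warning(sample))                # True
print(has_warning("container is alive"))  # False
```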

This issue appears to affect various platforms, including AWS, Azure, GCP, IBM Cloud, metal, oVirt, and vSphere.

Version-Release number of selected component (if applicable)

4.12.

How reproducible

Presently, a search.ci search for skipping device /dev/char/\d+:\d+ for systemd over 4.12 jobs in the past 7 days reports, "Found in 9.60% of runs (27.31% of failures) across 6519 total runs and 463 jobs (35.16% failed)".
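For a sense of scale, the percentages in that summary can be turned into approximate run counts. The sketch below assumes that "35.16% failed" refers to the share of the 6519 runs that failed, which the report does not state explicitly; the variable names are illustrative only.

```python
# Figures quoted from the search.ci summary above.
total_runs = 6519
pct_runs_with_match = 9.60      # "Found in 9.60% of runs"
pct_failures_with_match = 27.31 # "(27.31% of failures)"
pct_runs_failed = 35.16         # "(35.16% failed)", assumed to be a share of runs

runs_with_match = total_runs * pct_runs_with_match / 100
failed_runs = total_runs * pct_runs_failed / 100
failed_runs_with_match = failed_runs * pct_failures_with_match / 100

print(round(runs_with_match))         # 626
print(round(failed_runs))             # 2292
print(round(failed_runs_with_match))  # 626
```

Under that reading, both estimates land at roughly 626 runs, which would mean that nearly every run containing the warning was also a failed run.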

Steps to Reproduce

1. Post a PR and have bad luck.
2. Check search.ci using one of the aforementioned links.

Actual results

CI fails with the four aforementioned test failures.

Expected results

CI passes, or fails on some other test failure.

Assignee: Kirill Kolyshkin

Reporter: Miciah Masters

Need Info From: None

Contributors: None

QA Contact: Sunil Choudhary

Doc Contact: None

Votes: 0

Watchers: 3

Created: 2023/04/25 11:16 PM

Updated: 2025/07/27 5:33 AM

Resolved: 2023/10/04 6:39 PM
