  1. OpenShift Bugs
  2. OCPBUGS-12746

CI fails because of "skipping device /dev/char/10:200 for systemd: stat /sys/dev/char/10:200: no such file or directory"

Details

    • Bug
    • Resolution: Done
    • Major
    • None
    • 4.12
    • Node / Kubelet
    • Important
    • No
    • Rejected
    • False

    Description

      Description of problem

      These four tests are frequently failing:

      • [sig-node] Kubelet when scheduling a busybox command in a pod should print the output to logs
      • [sig-node] Container Runtime blackbox test on terminated container should report termination message from log output if TerminationMessagePolicy FallbackToLogsOnError is set
      • [sig-node] Pods should support retrieving logs from the container over websockets
      • [sig-node] Kubelet when scheduling a read only busybox container should not write to root filesystem

      The test failures all appear to be caused by an unexpected systemd-related warning (level=warning) appearing at the start of the pod logs:

      {  fail [k8s.io/kubernetes@v1.25.0/test/e2e/common/node/kubelet.go:79]: Timed out after 60.003s.
      Expected
          <string>: time=\"2023-04-24T17:03:56Z\" level=warning msg=\"skipping device /dev/char/10:200 for systemd: stat /sys/dev/char/10:200: no such file or directory\"\nHello World\n
      to equal
          <string>: Hello World\n
      Ginkgo exit error 1: exit with code 1}
      
      {  fail [k8s.io/kubernetes@v1.25.0/test/e2e/common/node/runtime.go:167]: Expected     <string>: time=\"2023-04-24T17:07:35Z\" level=warning msg=\"skipping device /dev/char/10:200 for systemd: stat /sys/dev/char/10:200: no such file or directory\"\nDONE to equal     <string>: DONE Ginkgo exit error 1: exit with code 1}
      
      {  fail [github.com/onsi/ginkgo/v2@v2.1.5-0.20220909190140-b488ab12695a/internal/suite.go:612]: Apr 24 17:12:57.845: Unexpected websocket logs:
      time="2023-04-24T17:12:55Z" level=warning msg="skipping device /dev/char/10:200 for systemd: stat /sys/dev/char/10:200: no such file or directory"
      container is alive
      
      Ginkgo exit error 1: exit with code 1}
      
      {  fail [k8s.io/kubernetes@v1.25.0/test/e2e/common/node/kubelet.go:214]: Timed out after 60.002s.
      Expected
          <string>: "time="..."
      to equal       |
          <string>: "/bin/s..."
      Ginkgo exit error 1: exit with code 1}
      

      The example failures above come from https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/910/pull-ci-openshift-cluster-ingress-operator-release-4.12-e2e-aws-ovn-single-node/1650534308475047936.
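
      The first failure can be illustrated with a minimal Python sketch. It assumes the test does an exact string comparison of the container log against the expected output, as the "Expected ... to equal ..." message suggests. The helper name without_warnings is hypothetical and is only used to show that the warning line is the sole difference; it is not the fix applied for this bug.

```python
# Log text copied from the first failure above, with the JSON escapes removed.
WARNING = (
    'time="2023-04-24T17:03:56Z" level=warning msg="skipping device '
    "/dev/char/10:200 for systemd: stat /sys/dev/char/10:200: "
    'no such file or directory"'
)
actual = WARNING + "\nHello World\n"
expected = "Hello World\n"


def exact_match(actual_log: str, expected_log: str) -> bool:
    """Mimic an exact-equality assertion on the container log."""
    return actual_log == expected_log


def without_warnings(log: str) -> str:
    """Hypothetical helper: drop lines that carry a level=warning marker."""
    kept = [
        line
        for line in log.splitlines(keepends=True)
        if "level=warning" not in line
    ]
    return "".join(kept)


if __name__ == "__main__":
    # The extra warning line makes the exact comparison fail.
    print(exact_match(actual, expected))
    # With the warning line removed, the remaining output matches.
    print(exact_match(without_warnings(actual), expected))
```

      The same reasoning applies to the "DONE" and "container is alive" failures: the container's own output is intact, but the prepended warning breaks the comparison.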

      The test failures all appear to be for 4.12.

      This issue appears to affect various platforms, including AWS, Azure, GCP, IBM Cloud, metal, oVirt, and vSphere.

      Version-Release number of selected component (if applicable)

      4.12.

      How reproducible

      Presently, a search.ci search for the pattern "skipping device /dev/char/\d+:\d+ for systemd" over 4.12 jobs in the past 7 days reports: "Found in 9.60% of runs (27.31% of failures) across 6519 total runs and 463 jobs (35.16% failed)".
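
      For local triage of a downloaded log, the same pattern can be applied with a short Python sketch. The function name is_affected is an assumption made for illustration; only the regular expression itself comes from the search above.

```python
import re

# Pattern taken from the search.ci query quoted above.
PATTERN = re.compile(r"skipping device /dev/char/\d+:\d+ for systemd")


def is_affected(log_text: str) -> bool:
    """Return True if the log contains the device-skipping warning."""
    return PATTERN.search(log_text) is not None


if __name__ == "__main__":
    sample = (
        'time="2023-04-24T17:12:55Z" level=warning msg="skipping device '
        "/dev/char/10:200 for systemd: stat /sys/dev/char/10:200: "
        'no such file or directory"\ncontainer is alive\n'
    )
    print(is_affected(sample))
    print(is_affected("container is alive\n"))
```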

      Steps to Reproduce

      1. Post a PR and have bad luck.
      2. Check search.ci using the aforementioned search.

      Actual results

      CI fails with the four aforementioned test failures.

      Expected results

      CI passes, or fails on some other test failure.

          People

            kolyshkin Kirill Kolyshkin
            mmasters1@redhat.com Miciah Masters
            Sunil Choudhary Sunil Choudhary