  1. OpenShift Bugs
  2. OCPBUGS-267

Container creation errors caused by CRI-O goroutines stuck in semaphore wait (not dbus related)

    • Bug
    • Resolution: Done
    • Critical
    • 4.9.z
    • 4.9.z
    • Node / CRI-O
    • None
    • Important
    • None
    • False
    • Customer Escalated

      Description of problem:
      On a node configured for 500 pods per node, pods fail to start properly or report other errors.

      egrep 'CreateContainerError|ImageInspectError' pod_list.txt | wc -l
      334
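
The count above can be broken down per status. The following is a minimal sketch, assuming that pod_list.txt holds one pod per line with the status string somewhere on the line (for example, saved pod listing output); the function name count_error_pods is hypothetical and not part of any OpenShift tooling.

```python
import re
from collections import Counter

# Same two statuses as the egrep pattern in the report.
ERROR_STATUSES = re.compile(r"CreateContainerError|ImageInspectError")


def count_error_pods(lines):
    """Return (number of matching lines, per-status tally).

    `lines` is any iterable of text lines. The one-pod-per-line layout
    is an assumption; the report does not show the file's contents.
    """
    per_status = Counter()
    matching_lines = 0
    for line in lines:
        found = ERROR_STATUSES.findall(line)
        if found:
            # Count the line once, like `egrep ... | wc -l` does.
            matching_lines += 1
            # Tally each distinct status seen on this line.
            per_status.update(set(found))
    return matching_lines, per_status
```

Called on an open file handle for pod_list.txt, the first returned value should match the `wc -l` figure, and the tally shows how the failures split between the two statuses.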

      Version-Release number of selected component (if applicable):
      OpenShift 4.9.45
      CRI-O: cri-o://1.22.5-7.rhaos4.9.git3dbcd3c.el8

      How reproducible: Frequently

      Steps to Reproduce:
      1. Schedule more than 250 pods on a node in a single operation, for instance after rebooting a node on a busy cluster.

      Actual results: Pods with container errors

      Expected results: Pods running

      Additional info:
      Looking at the CRI-O stack trace for this issue, it appears that the fix in https://bugzilla.redhat.com/show_bug.cgi?id=2082344 did not completely resolve the problem.
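
One way to check a stack trace for this pattern is to tally goroutines by wait state. The sketch below assumes the trace is a standard Go runtime goroutine dump, where each goroutine starts with a header such as `goroutine 42 [semacquire, 12 minutes]:`. The actual trace is not included in this report, and the function name count_goroutine_states is hypothetical.

```python
import re
from collections import Counter

# Matches Go goroutine dump headers, e.g.
#   goroutine 1 [running]:
#   goroutine 42 [semacquire, 12 minutes]:
# and captures only the state (the text before the first comma).
GOROUTINE_HEADER = re.compile(
    r"^goroutine \d+ \[([^\],]+)[^\]]*\]:", re.MULTILINE
)


def count_goroutine_states(dump_text):
    """Return a Counter mapping goroutine state to number of goroutines."""
    return Counter(
        match.group(1) for match in GOROUTINE_HEADER.finditer(dump_text)
    )
```

A large count under a semaphore-related state such as `semacquire`, especially with long wait times in the headers, would be consistent with the behavior described in the title.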

              pehunt@redhat.com Peter Hunt
              rhn-support-ekasprzy Emmanuel Kasprzyk
              Weinan Liu