OpenShift Bugs / OCPBUGS-37219

ERRO[0000] exec failed: unable to start container process: read init-p: connection reset by peer command terminated with exit code 255


    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • 4.13.z
    • 4.13
    • Node / CRI-O
    • Critical
    • None
    • False

      Description of problem:
      After upgrading their ROSA cluster to version 4.13.43, the customer is unable to access pod terminals via oc rsh on some of their pods.
      Below are the error messages they see:

      ERRO[0000] exec failed: unable to start container process: read init-p: connection reset by peer 
      command terminated with exit code 255
      

      We have ruled out tight pod limits as the cause. We have also ruled out Twistlock: it was disabled and removed completely, but the customer is still facing the same issue. Reference KCS articles: https://access.redhat.com/solutions/7062219 and https://access.redhat.com/solutions/3335421.

      Because the "read init-p: connection reset by peer" error message is associated with an exhausted PID limit, we then set --pod-pids-limit=16384 by following these docs: https://docs.openshift.com/rosa/rosa_cluster_admin/rosa-configuring-pid-limits.html. That did not fix the issue either.
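
      To confirm whether a given container is actually close to its PID limit, one option is to compare the pids.current and pids.max files exposed by the kernel's cgroup pids controller. The following is only a minimal sketch and was not part of the original investigation: the function names are hypothetical, and the cgroup directory for a container has to be located separately on the node, since its path depends on the cgroup version and runtime layout.

```python
from pathlib import Path
from typing import Optional, Tuple


def read_pids_usage(cgroup_dir: str) -> Tuple[int, Optional[int]]:
    """Return (current, limit) for one cgroup directory.

    Reads the pids controller files `pids.current` and `pids.max`.
    A limit of None means the file contained "max" (no limit set).
    The directory path is an assumption supplied by the caller.
    """
    base = Path(cgroup_dir)
    current = int((base / "pids.current").read_text().strip())
    raw_max = (base / "pids.max").read_text().strip()
    limit = None if raw_max == "max" else int(raw_max)
    return current, limit


def is_near_pid_limit(current: int, limit: Optional[int],
                      threshold: float = 0.9) -> bool:
    """True when at least `threshold` of the PID budget is in use.

    The 0.9 default is an arbitrary illustrative value, not a
    documented recommendation.
    """
    if limit is None:
        return False  # unlimited cgroup cannot be exhausted by this limit
    return current >= threshold * limit
```

      If the reported usage stays well below the limit while oc rsh still fails, that would be consistent with the observation above that raising the PID limit did not help.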

      It would also be worth reviewing the release notes for version 4.13.43, to which the customer upgraded, to see what changed at the CoreOS level.

      Note that the 4.12-to-4.13 upgrade includes a RHEL 8 to RHEL 9 bump: https://docs.openshift.com/container-platform/4.13/release_notes/ocp-4-13-release-notes.html#ocp-4-13-rhcos-rhel-9-2-packages

      Version-Release number of selected component (if applicable):
      4.13.43

      Cluster ID: fc39e80e-d2a5-40d5-8d7d-d91a31e24106

      Related OHSS ticket: https://issues.redhat.com/browse/OHSS-35807
      Related slack thread on #sd-sre-platform: https://redhat-internal.slack.com/archives/CCX9DB894/p1721063734654469

       

              rh-ee-kehannon Kevin Hannon
              rhn-support-dmohapat Digvijay Mohapatra
              David Darrah (Inactive)
              Votes: 1
              Watchers: 8