OpenShift Request For Enhancement, RFE-2306

Make image pull with crio as resilient as with docker on unstable network


    • Type: Feature Request
    • Resolution: Done
    • Priority: Major

      1. Proposed title of this feature request

      When the network is unstable, crio is unable to pull images from docker.io while the docker CLI can. This makes OCP4 less tolerant of network outages than OCP3.

      2. What is the nature and description of the request?

      The docker CLI retries each blob individually, which is very efficient on an unreliable connection. The crio stack has no equivalent per-blob retry, so a single dropped connection fails the whole pull.

      There is an upstream RFE for this: https://github.com/containers/image/issues/1145
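
      The per-blob retry requested here can be sketched roughly as follows. This is a minimal illustration in Python, not the code used by docker or by containers/image. The names pull_blob, open_stream and make_flaky_source are hypothetical, and the backoff policy is an assumption.

```python
import time
from typing import Callable, Iterable


def pull_blob(open_stream: Callable[[int], Iterable[bytes]],
              max_attempts: int = 5,
              base_delay: float = 0.0) -> bytes:
    """Download one blob, resuming at the last received byte after a failure.

    open_stream(offset) is a hypothetical callable that yields the blob's
    bytes starting at `offset` and may raise ConnectionError mid-transfer.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    received = bytearray()
    for attempt in range(1, max_attempts + 1):
        try:
            # Start at the current offset so that bytes already received
            # are not downloaded again.
            for chunk in open_stream(len(received)):
                received.extend(chunk)
            return bytes(received)
        except ConnectionError:
            if attempt == max_attempts:
                raise
            # Exponential backoff between attempts (assumed policy).
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("retry loop exited unexpectedly")


def make_flaky_source(blob: bytes, fail_at: int, chunk_size: int = 4):
    """Hypothetical test source that resets the connection once at fail_at."""
    state = {"failed": False}
    offsets = []

    def open_stream(offset: int):
        offsets.append(offset)
        pos = offset
        while pos < len(blob):
            if not state["failed"] and pos >= fail_at:
                state["failed"] = True
                raise ConnectionResetError("connection reset by peer")
            yield blob[pos:pos + chunk_size]
            pos += chunk_size

    return open_stream, offsets


if __name__ == "__main__":
    source, offsets = make_flaky_source(b"0123456789abcdef", fail_at=8)
    data = pull_blob(source)
    print(len(data), offsets)
```

      The point of the sketch is that a connection reset only costs the remaining bytes of one blob, instead of failing the whole image pull.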

      Example on the customer's infrastructure:

      • Pull with buildah --> KO
      # https_proxy=<obfuscated> buildah pull docker://docker.io/image:tag
        Getting image source signatures
        Copying blob 04f220ee9266 done
        Copying blob 034655750c88 done
        Copying blob f0b757a2a0f0 done
        Copying blob 89c8a77f7842 done
        Copying blob 4bbcce26bc5e done
        Copying blob 7b1a6ab2e44d done
        Copying blob d1de5652303b done
        Copying blob ef669123e59e done
        Copying blob b14b1ba1d651 done
        Copying blob e5cec468d3a6 done
        while pulling "docker://docker.io/image:tag" as "docker.io/image:tag": Error writing blob: error storing blob to file "/var/tmp/storage487953815/6": read tcp x.x.x.x:port->x.x.x.x:3128: read: connection reset by peer
      • Pull with docker --> OK
      # https_proxy=<obfuscated> docker pull docker.io/image:tag
        Trying to pull repository docker.io/library/image ...
        10.6.5: Pulling from docker.io/library/image
        7b1a6ab2e44d: Pull complete
        034655750c88: Pull complete
        f0b757a2a0f0: Pull complete
        4bbcce26bc5e: Pull complete
        04f220ee9266: Pull complete
        89c8a77f7842: Pull complete
        d1de5652303b: Pull complete
        ef669123e59e: Pull complete
        e5cec468d3a6: Pull complete
        b14b1ba1d651: Pull complete
        Status: Downloaded newer image for docker.io/image:tag

      The docker logs show that it had to resume the download of one blob while pulling the image:

      # journalctl -u docker | grep resume
        dockerd-current[pid]: time="" level=debug msg="attempting to resume download of \"sha256:7b1a6ab2e44d...\" from 653653 bytes"
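
      A resume "from 653653 bytes" like the one logged above corresponds to an HTTP range request. The following is a minimal sketch of building such a request in Python with the standard library. It is an illustration only, not docker's implementation. The function name resume_request, the URL and the partial-file path are hypothetical.

```python
import os
import urllib.request


def resume_request(url: str, partial_path: str) -> urllib.request.Request:
    """Build a GET request that asks only for the bytes not yet on disk."""
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    req = urllib.request.Request(url)
    if offset > 0:
        # HTTP byte range "bytes=<first>-" means from <first> to the end.
        req.add_header("Range", f"bytes={offset}-")
    return req


if __name__ == "__main__":
    # Hypothetical blob URL and partial file, for illustration only.
    req = resume_request("https://registry.example/blob", "/tmp/blob.partial")
    print(req.get_header("Range"))
```

      A server that honours the range replies with 206 Partial Content. A 200 reply means the range was ignored and the full blob is being sent again, so the client has to discard the partial file before writing.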

      3. Why does the customer need this? (List the business requirements here)

      The customer is planning to migrate from OCP 3.11 to 4.x, but this issue makes them worried about cluster and application availability when the network becomes unstable or the company proxy drops a connection.

      4. List any affected packages or components.

      Crio stack on CoreOS 4.9/RHEL 8.4

              gausingh@redhat.com Gaurav Singh
              rhn-support-fgleizes Florian Gleizes (Inactive)