- Feature Request
- Resolution: Done
- Major
1. Proposed title of this feature request
When the network is unstable, CRI-O is unable to pull images from docker.io while the docker CLI can. This makes OCP 4 less tolerant of network outages than OCP 3.
2. What is the nature and description of the request?
The docker CLI has an efficient per-blob retry mechanism: when a connection drops mid-download, it resumes the affected blob instead of failing the whole pull. The CRI-O stack has no equivalent.
There is an upstream RFE: https://github.com/containers/image/issues/1145
Example on the customer's infrastructure:
- Pull with buildah --> KO
- https_proxy=<obfuscated> buildah pull docker://docker.io/image:tag
Getting image source signatures
Copying blob 04f220ee9266 done
Copying blob 034655750c88 done
Copying blob f0b757a2a0f0 done
Copying blob 89c8a77f7842 done
Copying blob 4bbcce26bc5e done
Copying blob 7b1a6ab2e44d done
Copying blob d1de5652303b done
Copying blob ef669123e59e done
Copying blob b14b1ba1d651 done
Copying blob e5cec468d3a6 done
while pulling "docker://docker.io/image:tag" as "docker.io/image:tag": Error writing blob: error storing blob to file "/var/tmp/storage487953815/6": read tcp x.x.x.x:port->x.x.x.x:3128: read: connection reset by peer
- Pull with docker --> OK
- https_proxy=<obfuscated> docker pull docker.io/image:tag
Trying to pull repository docker.io/library/image ...
10.6.5: Pulling from docker.io/library/image
7b1a6ab2e44d: Pull complete
034655750c88: Pull complete
f0b757a2a0f0: Pull complete
4bbcce26bc5e: Pull complete
04f220ee9266: Pull complete
89c8a77f7842: Pull complete
d1de5652303b: Pull complete
ef669123e59e: Pull complete
e5cec468d3a6: Pull complete
b14b1ba1d651: Pull complete
Status: Downloaded newer image for docker.io/image:tag
The Docker logs show that it had to resume the download of one blob while pulling the image:
- journalctl -u docker | grep resume
dockerd-current[pid]: time="" level=debug msg="attempting to resume download of \"sha256:7b1a6ab2e44d...\" from 653653 bytes"
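The resume behaviour visible in the log line above can be sketched as follows. This is a minimal illustration only, not the actual implementation in docker or containers/image. The names `download_with_resume`, `fetch`, and `max_retries` are hypothetical; `fetch(offset)` stands in for an HTTP GET with a `Range: bytes=<offset>-` header, and it assumes the registry or proxy honours range requests.

```python
from typing import Callable, Iterator


def download_with_resume(
    fetch: Callable[[int], Iterator[bytes]],
    total_size: int,
    max_retries: int = 5,
) -> bytes:
    """Download one blob, resuming from the last received byte after a reset.

    `fetch(offset)` is a hypothetical callable that yields chunks starting at
    byte `offset`. Bytes already received are kept across retries.
    """
    buf = bytearray()
    retries = 0
    while len(buf) < total_size:
        try:
            # Ask only for the part of the blob that is still missing.
            for chunk in fetch(len(buf)):
                buf.extend(chunk)
            if len(buf) < total_size:
                # The stream ended early without an explicit error.
                raise ConnectionError("short read")
        except ConnectionError:
            # ConnectionResetError is a subclass of ConnectionError.
            retries += 1
            if retries > max_retries:
                raise
    return bytes(buf)
```

With a retry loop of this kind, a single "connection reset by peer" costs only the bytes in flight for one blob, whereas the buildah pull shown earlier aborts the whole image pull on the first reset.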
3. Why does the customer need this? (List the business requirements here)
The customer is planning a migration from OCP 3.11 to 4.x, but this issue makes them worried about cluster and application availability when the network becomes unstable or the company proxy drops a connection.
4. List any affected packages or components.
CRI-O stack on CoreOS 4.9/RHEL 8.4
- relates to: RUN-1558 Make image pulls more resilient (Closed)