OpenShift Bugs / OCPBUGS-11366

[release-4.11] Uploading large layers fails with "blob upload invalid"


    • Type: Bug
    • Resolution: Done
    • Priority: Normal
    • None
    • 4.11.z
    • Image Registry
    • Moderate
    • No
    • False

      This is a clone of issue OCPBUGS-10496. The following is the description of the original issue:

      Description of problem:

      The customer runs machine learning (ML) workloads on OpenShift Container Platform that require large models to be embedded in the container image. Building a container image with large layers (>=10GB) and pushing it to the internal image registry fails with the following error message:

      error: build error: Failed to push image: writing blob: uploading layer to https://image-registry.openshift-image-registry.svc:5000/v2/example/example-image/blobs/uploads/b305b374-af79-4dce-afe0-afe6893b0ada?_state=[..]: blob upload invalid

      The image registry Pod logs the following errors:

      time="2023-01-30T14:12:22.315726147Z" level=error msg="upload resumed at wrong offest: 10485760000 != 10738341637" [..]
      time="2023-01-30T14:12:22.338264863Z" level=error msg="response completed with error" err.code="blob upload invalid" err.message="blob upload invalid" [..]

      Backend storage is AWS S3. We suspect that this could be the following upstream bug: https://github.com/distribution/distribution/issues/1698
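
      To make the two offsets in the registry log easier to compare, here is a minimal Python sketch that extracts them from the log line and relates the first one to a chunk size. The 10 MiB chunk size is an assumption (it is the documented default for the registry's S3 driver; the configured value is not stated in this report), and `parse_offsets` is a hypothetical helper written for this illustration. Under that assumption, the first offset is a whole number of chunks, which would be consistent with the upload size being derived from a single page of S3 parts (S3 `ListParts` returns at most 1000 parts per response). This is an observation, not a confirmed diagnosis.

```python
import re
from typing import Tuple

# Log line copied from the registry Pod output in this report
# (the trailing fields were already elided there).
LOG_LINE = (
    'time="2023-01-30T14:12:22.315726147Z" level=error '
    'msg="upload resumed at wrong offest: 10485760000 != 10738341637"'
)

# Assumption: 10 MiB chunk size; the report does not state the configured value.
ASSUMED_CHUNK_SIZE = 10 * 1024 * 1024


def parse_offsets(line: str) -> Tuple[int, int]:
    """Return the two byte offsets from an offset-mismatch log message.

    The pattern keeps the registry's own spelling ("offest") on purpose.
    """
    match = re.search(r"wrong offest: (\d+) != (\d+)", line)
    if match is None:
        raise ValueError("line does not contain an offset mismatch")
    return int(match.group(1)), int(match.group(2))


first, second = parse_offsets(LOG_LINE)
whole_chunks, remainder = divmod(first, ASSUMED_CHUNK_SIZE)

print(f"first offset  = {first} bytes = {whole_chunks} chunks, remainder {remainder}")
print(f"second offset = {second} bytes")
print(f"difference    = {second - first} bytes")
```

      With the assumed chunk size, the first offset divides into exactly 1000 chunks with no remainder, while the second offset is larger by roughly 250 MB.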

      Version-Release number of selected component (if applicable):

      Customer encountered the issue on OCP 4.11.20. We reproduced the issue on OCP 4.11.21:
      
      $ oc version
      Client Version: 4.12.0
      Kustomize Version: v4.5.7
      Server Version: 4.11.21
      Kubernetes Version: v1.24.6+5658434

      How reproducible:

      Always

      Steps to Reproduce:

      1. Install OpenShift Container Platform cluster 4.11.21 on AWS
      2. Confirm registry storage is on AWS S3
      3. Create a new build including a 10GB file using the following command: `printf "FROM registry.fedoraproject.org/fedora:37\nRUN dd if=/dev/urandom of=/bigfile bs=1M count=10240" | oc new-build -D -`
      4. Wait for the build to run and attempt to push the resulting image
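
      As a rough sanity check on the reproducer, the following Python sketch computes the size of the file that the `dd` command in step 3 writes and how many upload chunks it would need. The chunk size (10 MiB) and the 1000-part threshold are assumptions for illustration only; neither value is taken from this cluster's configuration. The pushed layer will differ slightly from the raw file size because of archive overhead and compression.

```python
# dd parameters from step 3: bs=1M count=10240
BS_BYTES = 1024 * 1024  # GNU dd interprets bs=1M as 1048576 bytes
COUNT = 10240

# Assumptions for illustration, not values read from the cluster:
ASSUMED_CHUNK_SIZE = 10 * 1024 * 1024  # 10 MiB per uploaded part
ASSUMED_PART_THRESHOLD = 1000          # parts returned per S3 ListParts page

file_bytes = BS_BYTES * COUNT

# Integer ceiling division avoids floating-point rounding on large sizes.
chunks_needed = -(-file_bytes // ASSUMED_CHUNK_SIZE)
exceeds_threshold = chunks_needed > ASSUMED_PART_THRESHOLD

print(f"file size      = {file_bytes} bytes")
print(f"chunks needed  = {chunks_needed}")
print(f"over threshold = {exceeds_threshold}")
```

      Under these assumptions the 10 GiB file needs more than 1000 chunks, so a smaller file (for example `count=5120`) would be a useful control when reproducing.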

      Actual results:

      Pushing the new build fails with the following error message:
      
      error: build error: Failed to push image: writing blob: uploading layer to https://image-registry.openshift-image-registry.svc:5000/v2/example/example-image/blobs/uploads/b305b374-af79-4dce-afe0-afe6893b0ada?_state=[..]: blob upload invalid

      Expected results:

      Push of large container image layers succeeds

      Additional info:

              fmissi Flavian Missi
              openshift-crt-jira-prow OpenShift Prow Bot
              XiuJuan Wang XiuJuan Wang