OCPBUGS-17359

CI fails because it pulls "openshift/origin-node" from Docker Hub and gets rate-limited

Details

    • Important
    • No
    • Sprint 240, Sprint 241
    • 2
    • Rejected
    • False
    • N/A
    • Release Note Not Required

    Description

      Description of problem

      CI is flaky because tests pull the "openshift/origin-node" image from Docker Hub and get rate-limited:

      E0803 20:44:32.429877    2066 kuberuntime_image.go:53] "Failed to pull image" err="rpc error: code = Unknown desc = reading manifest latest in docker.io/openshift/origin-node: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit" image="openshift/origin-node:latest"
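
      To spot this failure in a saved kubelet log, a minimal sketch along these lines may help. It assumes the log keeps the err="..." and image="..." fields in the shape shown above; the function name find_rate_limited_pulls is hypothetical.

```python
import re

# Matches the kubelet "Failed to pull image" line quoted in this report.
# Assumption: err="..." is followed directly by image="..." as in that line.
PULL_FAILURE = re.compile(
    r'"Failed to pull image" err="(?P<err>.*)" image="(?P<image>[^"]+)"'
)


def find_rate_limited_pulls(lines):
    """Return image references whose pull failed with a toomanyrequests error."""
    images = []
    for line in lines:
        match = PULL_FAILURE.search(line)
        # Only keep failures caused by the registry's pull rate limit.
        if match and "toomanyrequests" in match.group("err"):
            images.append(match.group("image"))
    return images
```

      Passing the lines of a kubelet journal to find_rate_limited_pulls returns the list of affected image references, which makes it easier to see whether images other than "openshift/origin-node" are involved.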
      

      This particular failure comes from https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/929/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator/1687189166267305984. I don't know how to search for this failure using search.ci. I discovered the rate-limiting through Loki: https://grafana-loki.ci.openshift.org/explore?orgId=1&left=%7B%22datasource%22:%22PCEB727DF2F34084E%22,%22queries%22:%5B%7B%22expr%22:%22%7Binvoker%3D%5C%22openshift-internal-ci%2Fpull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator%2F1687189166267305984%5C%22%7D%20%7C%20unpack%20%7C~%20%5C%22pull%20rate%20limit%5C%22%22,%22refId%22:%22A%22,%22editorMode%22:%22code%22,%22queryType%22:%22range%22%7D%5D,%22range%22:%7B%22from%22:%221691086303449%22,%22to%22:%221691122303451%22%7D%7D.

      Version-Release number of selected component (if applicable)

      This happened on a 4.14 CI job.

      How reproducible

      I have observed this once so far, but it is quite obscure.

      Steps to Reproduce

      1. Post a PR and have bad luck.
      2. Check Loki using the following query:

      {...} {invoker=~"openshift-internal-ci/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator/.*"} | unpack | systemd_unit="kubelet.service" |~ "pull rate limit"
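
      The same query can be generated for other jobs. The sketch below is only an illustration: the helper name build_rate_limit_query is hypothetical, and it assumes the invoker label always has the form "openshift-internal-ci/<job name>/<build ID>" as in the examples in this report. In LogQL, = is an exact label match and =~ is a regular-expression match, so the helper uses =~ when no build ID is given.

```python
def build_rate_limit_query(job_name, build_id=None):
    """Build a LogQL query for kubelet pull-rate-limit messages in a CI job.

    With build_id, match that single run exactly. Without it, match every
    run of the job through a regular-expression label matcher.
    Assumption: job_name contains only letters, digits, and hyphens, so it
    needs no regex escaping.
    """
    prefix = "openshift-internal-ci/" + job_name + "/"
    if build_id is not None:
        selector = '{invoker="' + prefix + build_id + '"}'
    else:
        selector = '{invoker=~"' + prefix + '.*"}'
    return (
        selector
        + ' | unpack | systemd_unit="kubelet.service" |~ "pull rate limit"'
    )
```

      For example, build_rate_limit_query("pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator") yields a query covering every run of the job in which this failure was seen.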
      

      Actual results

      CI pulls from Docker Hub and fails.

      Expected results

      CI passes, or fails on some other test failure. CI should never pull from Docker Hub.
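
      One way to catch such references before they reach CI is to check whether an image name is fully qualified. The sketch below applies Docker's reference convention, where the first path component counts as a registry host only if it contains "." or ":" or equals "localhost". This is an assumption about the relevant rule: on the CI nodes, short-name resolution depends on the container runtime's registries configuration, and the log above only shows that "openshift/origin-node:latest" ended up at docker.io. The function name resolves_to_docker_hub is hypothetical.

```python
DOCKER_HUB_HOSTS = ("docker.io", "index.docker.io", "registry-1.docker.io")


def resolves_to_docker_hub(image):
    """Return True if the reference names Docker Hub explicitly or by default."""
    first, slash, _rest = image.partition("/")
    # The first component is a registry host only if it looks like one.
    has_registry = bool(slash) and (
        "." in first or ":" in first or first == "localhost"
    )
    if not has_registry:
        # Unqualified name: assumed to fall back to Docker Hub.
        return True
    return first in DOCKER_HUB_HOSTS
```

      With this rule, "openshift/origin-node:latest" is flagged, while a reference that names another registry host, such as "quay.io/openshift/origin-node", is not.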

      Additional info

      We have been using the "openshift/origin-node" image in multiple tests for years. I have no idea why it is suddenly pulling from Docker Hub, or how we failed to notice that it was pulling from Docker Hub if that's what it was doing all along.

            People

              mmasters1@redhat.com Miciah Masters
              mmasters1@redhat.com Miciah Masters
              Melvin Joseph Melvin Joseph