- Bug
- Resolution: Done-Errata
- Critical
- 4.13, 4.12, 4.11, 4.14
- Important
- No
- Sprint 244
- 1
- Rejected
- False
-
- N/A
- Release Note Not Required
Description of problem
CI is flaky because tests pull the "openshift/origin-node" image from Docker Hub and get rate-limited:
E0803 20:44:32.429877 2066 kuberuntime_image.go:53] "Failed to pull image" err="rpc error: code = Unknown desc = reading manifest latest in docker.io/openshift/origin-node: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit" image="openshift/origin-node:latest"
This particular failure comes from https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/929/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator/1687189166267305984. I don't know how to search for this failure using search.ci. I discovered the rate-limiting through Loki: https://grafana-loki.ci.openshift.org/explore?orgId=1&left=%7B%22datasource%22:%22PCEB727DF2F34084E%22,%22queries%22:%5B%7B%22expr%22:%22%7Binvoker%3D%5C%22openshift-internal-ci%2Fpull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator%2F1687189166267305984%5C%22%7D%20%7C%20unpack%20%7C~%20%5C%22pull%20rate%20limit%5C%22%22,%22refId%22:%22A%22,%22editorMode%22:%22code%22,%22queryType%22:%22range%22%7D%5D,%22range%22:%7B%22from%22:%221691086303449%22,%22to%22:%221691122303451%22%7D%7D.
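The kubelet error quoted above has a recognizable shape, so a small script can pick such failures out of collected kubelet logs. The following is a minimal sketch, not an existing CI tool: the function name `rate_limited_images` and the regular expression are illustrative assumptions based only on the single log line shown in this report.

```python
import re

# Assumption: rate-limited pulls always appear as a kubelet "Failed to pull image"
# line containing "toomanyrequests" and a trailing image="..." field, as in the
# log line quoted in this report.
RATE_LIMIT_RE = re.compile(
    r'"Failed to pull image".*toomanyrequests.*image="(?P<image>[^"]+)"'
)


def rate_limited_images(log_lines):
    """Return the images whose pulls failed with a rate-limit error, in order seen."""
    images = []
    for line in log_lines:
        match = RATE_LIMIT_RE.search(line)
        if match:
            images.append(match.group("image"))
    return images
```

Feeding it the lines of a kubelet journal would yield one entry per rate-limited pull, which makes it easy to see which image references are reaching Docker Hub.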
Version-Release number of selected component (if applicable)
This happened on a 4.14 CI job.
How reproducible
I have observed this once so far, but the failure is quite obscure.
Steps to Reproduce
1. Post a PR and have bad luck.
2. Check Loki using the following query:
{...} {invoker=~"openshift-internal-ci/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator/.*"} | unpack | systemd_unit="kubelet.service" |~ "pull rate limit"
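The query in step 2 can also be generated for any job, either for one run or for all runs. This is a rough sketch: the helper name `rate_limit_query` is hypothetical, the `openshift-internal-ci/` prefix is taken from the query above, and it assumes job names contain no regular-expression metacharacters. An equality matcher does not expand `*`, so the all-runs case uses a regex matcher (`=~`).

```python
def rate_limit_query(job, run_id=None):
    """Build a LogQL query for kubelet rate-limit errors in a CI job's logs.

    job    -- Prow job name (assumed to contain only letters, digits, "-", "_")
    run_id -- a specific run ID, or None to match every run of the job
    """
    prefix = f"openshift-internal-ci/{job}/"
    if run_id is None:
        # Regex matcher: the literal prefix followed by any run ID.
        selector = f'invoker=~"{prefix}.*"'
    else:
        # Exact matcher for a single run.
        selector = f'invoker="{prefix}{run_id}"'
    return (
        "{" + selector + "}"
        ' | unpack | systemd_unit="kubelet.service" |~ "pull rate limit"'
    )
```

Calling `rate_limit_query("pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator", "1687189166267305984")` produces a query scoped to the run referenced in this report.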
Actual results
CI pulls from Docker Hub and fails.
Expected results
CI passes, or fails on some other test failure. CI should never pull from Docker Hub.
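One way to enforce this expectation is to flag image references that could resolve to Docker Hub. In the log above, the unqualified reference "openshift/origin-node:latest" was resolved to docker.io/openshift/origin-node. The sketch below assumes that any reference without an explicit registry host may end up on Docker Hub (whether it does depends on the container runtime's registry configuration); the function names and the `registry.example.com` host in the usage note are hypothetical.

```python
# Host names that refer to Docker Hub.
DOCKER_HUB_HOSTS = {"docker.io", "index.docker.io", "registry-1.docker.io"}


def registry_of(image):
    """Return the registry host of an image reference, or None if unqualified."""
    first, sep, _ = image.partition("/")
    # Common convention: the first path component is a registry host only if it
    # contains "." or ":" or is exactly "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return None


def may_pull_from_docker_hub(image):
    """True if the reference names Docker Hub or leaves the registry unspecified."""
    host = registry_of(image)
    return host is None or host in DOCKER_HUB_HOSTS
```

With this check, `may_pull_from_docker_hub("openshift/origin-node:latest")` is true, while a fully qualified reference such as `registry.example.com/openshift/origin-node:latest` is not flagged.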
Additional info
We have been using the "openshift/origin-node" image in multiple tests for years. I have no idea why it is suddenly pulling from Docker Hub, or how we failed to notice that it was pulling from Docker Hub if that's what it was doing all along.
- blocks: OCPBUGS-22433 CI fails because it pulls "openshift/origin-node" from Docker Hub and gets rate-limited (Closed)
- clones: OCPBUGS-17359 CI fails because it pulls "openshift/origin-node" from Docker Hub and gets rate-limited (Closed)
- is blocked by: OCPBUGS-22402 CI fails because it pulls "openshift/origin-node" from Docker Hub and gets rate-limited (Closed)
- links to: RHBA-2023:6276 OpenShift Container Platform 4.12.z bug fix update