-
Bug
-
Resolution: Done
-
Undefined
-
4.12, 4.11.z, 4.10.z
-
None
-
Quality / Stability / Reliability
-
False
-
-
None
-
Critical
Description of problem:
Image registry pods crash (CrashLoopBackOff) and fail with ProbeError when a custom S3 endpoint is configured on an STS cluster.
Version-Release number of selected component (if applicable):
4.12.0-0.nightly-2022-09-28-204419
How reproducible:
always
Steps to Reproduce:
1. Add custom endpoint
apiVersion: config.openshift.io/v1
kind: Infrastructure
metadata:
  creationTimestamp: "2022-09-29T05:59:43Z"
  generation: 2
  name: cluster
  resourceVersion: "78860"
  uid: c2c145cc-0242-4044-ab65-1f13cff06407
spec:
  cloudConfig:
    name: ""
  platformSpec:
    aws:
      serviceEndpoints:
      - name: ec2
        url: https://ec2.us-east-2.amazonaws.com
      - name: s3
        url: https://s3.dualstack.us-east-2.amazonaws.com
      - name: sts
        url: https://sts.us-east-2.amazonaws.com
    type: AWS
status:
  apiServerInternalURI: https://api-int.wxjsts29a.qe.devcluster.openshift.com:6443
  apiServerURL: https://api.wxjsts29a.qe.devcluster.openshift.com:6443
  controlPlaneTopology: HighlyAvailable
  etcdDiscoveryDomain: ""
  infrastructureName: wxjsts29a-nrjs5
  infrastructureTopology: HighlyAvailable
  platform: AWS
  platformStatus:
    aws:
      region: us-east-2
      serviceEndpoints:
      - name: ec2
        url: https://ec2.us-east-2.amazonaws.com
      - name: s3
        url: https://s3.dualstack.us-east-2.amazonaws.com
      - name: sts
        url: https://sts.us-east-2.amazonaws.com
    type: AWS
2. oc get pods -n openshift-image-registry -l docker-registry=default
NAME                              READY   STATUS             RESTARTS         AGE
image-registry-688788c744-8vzqq   0/1     CrashLoopBackOff   19 (3m10s ago)   58m
image-registry-688788c744-f6rjn   0/1     CrashLoopBackOff   19 (2m28s ago)   58m
3.
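For anyone trying to reproduce this, the service endpoints listed in step 1 could be expressed as a JSON merge patch rather than edited by hand. The following is only a sketch: the helper name build_endpoint_patch is hypothetical, and the report does not say how the endpoints were actually applied (manual edit, patch, or install-time configuration).

```python
import json

# Endpoints copied verbatim from the Infrastructure resource in this report.
SERVICE_ENDPOINTS = [
    {"name": "ec2", "url": "https://ec2.us-east-2.amazonaws.com"},
    {"name": "s3", "url": "https://s3.dualstack.us-east-2.amazonaws.com"},
    {"name": "sts", "url": "https://sts.us-east-2.amazonaws.com"},
]


def build_endpoint_patch(endpoints):
    """Return a JSON merge patch that sets spec.platformSpec.aws.serviceEndpoints.

    Hypothetical helper for illustration; not part of any OpenShift tooling.
    """
    patch = {"spec": {"platformSpec": {"aws": {"serviceEndpoints": endpoints}}}}
    return json.dumps(patch)


if __name__ == "__main__":
    # Print the patch so it can be reviewed before it is applied to a cluster.
    print(build_endpoint_patch(SERVICE_ENDPOINTS))
```

Assuming the usual merge-patch workflow, the printed JSON could then be supplied to something like `oc patch infrastructure cluster --type=merge -p '<json>'`; that command is an assumption here and was not taken from the report.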
Actual results:
$ oc describe pods image-registry-688788c744-8vzqq
Normal Scheduled 58m default-scheduler Successfully assigned openshift-image-registry/image-registry-688788c744-8vzqq to ip-10-0-220-225.us-east-2.compute.internal by ip-10-0-156-136
Normal AddedInterface 58m multus Add eth0 [10.128.2.13/23] from openshift-sdn
Normal Killing 57m kubelet Container registry failed liveness probe, will be restarted
Warning ProbeError 57m (x3 over 58m) kubelet Liveness probe error: HTTP probe failed with statuscode: 503
body: {"errors":[{"code":"UNAVAILABLE","message":"service unavailable","detail":"health check failed: please see /debug/health"}]}
Warning Unhealthy 57m (x3 over 58m) kubelet Liveness probe failed: HTTP probe failed with statuscode: 503
Warning Unhealthy 57m (x6 over 58m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 503
Normal Started 57m (x2 over 58m) kubelet Started container registry
Normal Created 57m (x2 over 58m) kubelet Created container registry
Normal Pulled 53m (x6 over 58m) kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66f0732e3f5494e22620ed4b5df564889ae3251a8f5ece9359c7ba8d6772146c" already present on machine
Warning ProbeError 18m (x79 over 58m) kubelet Readiness probe error: HTTP probe failed with statuscode: 503
body: {"errors":[{"code":"UNAVAILABLE","message":"service unavailable","detail":"health check failed: please see /debug/health"}]}
Warning BackOff 3m24s (x173 over 52m) kubelet Back-off restarting failed container
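The probe events above carry a JSON body that only says the health check failed and points at /debug/health. As a minimal sketch for triaging many such events, the body can be decoded to pull out the error code and detail. The function name summarize_probe_body is hypothetical; the sample string is the body copied from the events in this report.

```python
import json

# Body copied from the ProbeError events in this report.
PROBE_BODY = (
    '{"errors":[{"code":"UNAVAILABLE","message":"service unavailable",'
    '"detail":"health check failed: please see /debug/health"}]}'
)


def summarize_probe_body(raw):
    """Return a list of (code, detail) pairs from a registry error body.

    Hypothetical helper; missing keys are tolerated and come back as None.
    """
    payload = json.loads(raw)
    return [(err.get("code"), err.get("detail")) for err in payload.get("errors", [])]


if __name__ == "__main__":
    for code, detail in summarize_probe_body(PROBE_BODY):
        print(f"{code}: {detail}")
```

The body itself does not identify which health check failed, so the registry's /debug/health output or the container logs would still be needed to confirm whether the S3 endpoint is the cause.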
Expected results:
The image registry should run normally with this endpoint override.
Additional info: