- Bug
- Resolution: Done
- Critical
- 1.34.0
- None
- False
- None
- False
Tracking upstream: https://github.com/knative/serving/issues/15466
On SO 1.34 CI builds, we see a Revision that initially fails to resolve a digest due to:
Unable to fetch image "image-registry.openshift-image-registry.svc:5000/ocf-qe-images/receiverhttp":
failed to resolve image to digest: GET https://image-registry.openshift-image-registry.svc:5000/openshift/token?scope=repository%3Aocf-qe-images%2Freceiverhttp%3Apull&service=:
unexpected status code 401 Unauthorized
Failure to resolve a digest is a recoverable error, so eventually the digest was resolved, and the Deployment was created and became available.
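As a side note on why this is recoverable: tag-to-digest resolution is just a registry lookup that can be retried. The Go sketch below (using github.com/google/go-containerregistry, with illustrative retry parameters; it is not the actual Knative resolver code) shows the idea: once the registry's token endpoint stops returning 401, the same lookup succeeds.

package main

import (
	"fmt"
	"time"

	"github.com/google/go-containerregistry/pkg/authn"
	"github.com/google/go-containerregistry/pkg/name"
	"github.com/google/go-containerregistry/pkg/v1/remote"
)

func main() {
	// Image reference taken from the report.
	ref, err := name.ParseReference("image-registry.openshift-image-registry.svc:5000/ocf-qe-images/receiverhttp")
	if err != nil {
		panic(err)
	}

	// Retry the tag-to-digest lookup with a fixed backoff; a transient 401 from the
	// registry token endpoint should clear on a later attempt. The attempt count and
	// sleep are illustrative assumptions, not Knative's actual backoff.
	for attempt := 1; attempt <= 10; attempt++ {
		desc, err := remote.Head(ref, remote.WithAuthFromKeychain(authn.DefaultKeychain))
		if err == nil {
			fmt.Printf("resolved digest: %s@%s\n", ref.Context().Name(), desc.Digest)
			return
		}
		fmt.Printf("attempt %d failed: %v\n", attempt, err)
		time.Sleep(10 * time.Second)
	}
	fmt.Println("digest resolution did not recover")
}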
The Revision, however, stayed in this state:
apiVersion: serving.knative.dev/v1
kind: Revision
metadata:
  annotations:
    autoscaling.knative.dev/max-scale: "1"
    autoscaling.knative.dev/min-scale: "1"
    autoscaling.knative.dev/target-burst-capacity: "0"
    serving.knative.dev/creator: system:admin
    serving.knative.dev/routes: receiver30
    serving.knative.dev/routingStateModified: "2024-08-12T22:28:04Z"
  creationTimestamp: "2024-08-12T22:28:04Z"
  generation: 1
  labels:
    qe.ocf.redhat.com/role: receiver
    serving.knative.dev/configuration: receiver30
    serving.knative.dev/configurationGeneration: "1"
    serving.knative.dev/configurationUID: 3a809bb8-8ba7-4a93-9d94-66a565f1612c
    serving.knative.dev/routingState: active
    serving.knative.dev/service: receiver30
    serving.knative.dev/serviceUID: c8827240-822f-489b-b7d6-48848da0593f
  name: receiver30-00001
  namespace: ksnk-dn-tls-0
  ownerReferences:
  - apiVersion: serving.knative.dev/v1
    blockOwnerDeletion: true
    controller: true
    kind: Configuration
    name: receiver30
    uid: 3a809bb8-8ba7-4a93-9d94-66a565f1612c
  resourceVersion: "860881"
  uid: d3f218f8-4fd4-4111-b225-7e084eeb8f3d
spec:
  containerConcurrency: 0
  containers:
  - args:
    - --salt
    - "30"
    - --rejectIndexModulo
    - "0"
    - --rejectEvery
    - "0"
    - --rejectEachIndexNTimes
    - "0"
    - --durationBufferSize
    - "1"
    - --delay
    - 0s
    - --randomDelay
    - 0s
    - --idempotent
    - "true"
    - --code
    - "500"
    command:
    - /receiver
    image: image-registry.openshift-image-registry.svc:5000/ocf-qe-images/receiverhttp
    imagePullPolicy: IfNotPresent
    name: user-container
    readinessProbe:
      httpGet:
        path: /health
        port: 0
      successThreshold: 1
    resources: {}
  enableServiceLinks: false
  timeoutSeconds: 300
status:
  actualReplicas: 1
  conditions:
  - lastTransitionTime: "2024-08-12T22:30:16Z"
    severity: Info
    status: "True"
    type: Active
  - lastTransitionTime: "2024-08-12T22:28:04Z"
    message: 'Unable to fetch image "image-registry.openshift-image-registry.svc:5000/ocf-qe-images/receiverhttp": failed to resolve image to digest: GET https://image-registry.openshift-image-registry.svc:5000/openshift/token?scope=repository%3Aocf-qe-images%2Freceiverhttp%3Apull&service=: unexpected status code 401 Unauthorized'
    reason: ContainerMissing
    status: "False"
    type: ContainerHealthy
  - lastTransitionTime: "2024-08-12T22:28:04Z"
    message: 'Unable to fetch image "image-registry.openshift-image-registry.svc:5000/ocf-qe-images/receiverhttp": failed to resolve image to digest: GET https://image-registry.openshift-image-registry.svc:5000/openshift/token?scope=repository%3Aocf-qe-images%2Freceiverhttp%3Apull&service=: unexpected status code 401 Unauthorized'
    reason: ContainerMissing
    status: "False"
    type: Ready
  - lastTransitionTime: "2024-08-12T22:30:12Z"
    status: "True"
    type: ResourcesAvailable
  containerStatuses:
  - imageDigest: image-registry.openshift-image-registry.svc:5000/ocf-qe-images/receiverhttp@sha256:e915478407c5c882346c4fc72078007fd2511d9e1796345db1873facafddf836
    name: user-container
  desiredReplicas: 1
  observedGeneration: 1
Notice that containerStatuses is populated (with the resolved image digest) and the ResourcesAvailable condition is True. The ContainerHealthy condition, however, stays False with reason ContainerMissing, even though digest resolution succeeded. This keeps the overall Ready condition False, despite the Deployment being available.
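To make the aggregation explicit, here is a small, self-contained Go sketch (the types and helper are illustrative, not the knative.dev/pkg ConditionSet implementation): Ready follows the Error-severity dependent conditions, so the Info-severity Active condition is ignored, and a single dependent condition left at False keeps Ready at False no matter what containerStatuses says.

package main

import "fmt"

type condition struct {
	Type     string
	Status   string // "True", "False", or "Unknown"
	Severity string // "" means Error severity, i.e. it gates Ready
	Reason   string
}

// readyStatus mimics the aggregation rule: Ready is False if any Error-severity
// dependent condition is False, Unknown if any is Unknown, True otherwise.
func readyStatus(conds []condition) string {
	ready := "True"
	for _, c := range conds {
		if c.Severity != "" || c.Type == "Ready" {
			continue // Info-severity conditions and Ready itself do not gate Ready
		}
		switch c.Status {
		case "False":
			return "False" // a single failing dependent condition pins Ready to False
		case "Unknown":
			ready = "Unknown"
		}
	}
	return ready
}

func main() {
	// The state from the dump above: digest resolution succeeded (ResourcesAvailable=True,
	// containerStatuses populated), but ContainerHealthy was never flipped back.
	conds := []condition{
		{Type: "Active", Status: "True", Severity: "Info"},
		{Type: "ContainerHealthy", Status: "False", Reason: "ContainerMissing"},
		{Type: "ResourcesAvailable", Status: "True"},
	}
	fmt.Println("Ready:", readyStatus(conds)) // prints "Ready: False"
}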
Due to other changes in the resource lifecycle (perhaps https://github.com/knative/serving/pull/14744/files#diff-831a9383e7db7880978acf31f7dfec777beb08b900b1d0e1c55a5aed42e602cb), this causes a regression since 1.33 in how this particular issue propagates to the ksvc status.
When the same problem occurs on 1.33, the ksvc itself eventually turns Ready. On 1.34, the overall ksvc never becomes Ready.
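One way to observe the difference is to poll the ksvc's Ready condition after the transient 401 has cleared. The sketch below is a hedged example using the Kubernetes dynamic client with the namespace (ksnk-dn-tls-0) and Service name (receiver30) from this report; the out-of-cluster kubeconfig path is an assumption.

package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumes an out-of-cluster kubeconfig at the default location.
	kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	ksvcGVR := schema.GroupVersionResource{
		Group: "serving.knative.dev", Version: "v1", Resource: "services",
	}
	ksvc, err := client.Resource(ksvcGVR).Namespace("ksnk-dn-tls-0").
		Get(context.Background(), "receiver30", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	conditions, _, _ := unstructured.NestedSlice(ksvc.Object, "status", "conditions")
	for _, c := range conditions {
		cond, ok := c.(map[string]interface{})
		if !ok {
			continue
		}
		if cond["type"] == "Ready" {
			// Observed: "True" on 1.33 once the digest resolves, "False" on 1.34.
			fmt.Printf("Ready=%v reason=%v message=%v\n",
				cond["status"], cond["reason"], cond["message"])
		}
	}
}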