Bug
Resolution: Done-Errata
Critical
4.15.z
Description of problem:
The default catalog source pod never gets updated; users have to recreate it manually to pick up new catalog content. A must-gather log is available for debugging: https://drive.google.com/file/d/16_tFq5QuJyc_n8xkDFyK83TdTkrsVFQe/view?usp=drive_link
I went through the code and found that the `updateStrategy` check depends on the `ImageID`; see:
// imageID returns the ImageID of the primary catalog source container or an empty string if the image ID isn't available yet.
// Note: the pod must be running and the container in a ready status to return a valid ImageID.
func imageID(pod *corev1.Pod) string {
	if len(pod.Status.ContainerStatuses) < 1 {
		logrus.WithField("CatalogSource", pod.GetName()).Warn("pod status unknown")
		return ""
	}
	return pod.Status.ContainerStatuses[0].ImageID
}
However, for the default catalog source pods, `pod.Status.ContainerStatuses[0].ImageID` never changes, because that container runs the `opm` image, not the index image.
jiazha-mac:~ jiazha$ oc get pods redhat-operators-mpvzm -o=jsonpath={.status.containerStatuses} |jq
[
  {
    "containerID": "cri-o://115bd207312c7c8c36b63bfd251c085a701c58df2a48a1232711e15d7595675d",
    "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:965fe452763fd402ca8d8b4a3fdb13587673c8037f215c0ffcd76b6c4c24635e",
    "imageID": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:965fe452763fd402ca8d8b4a3fdb13587673c8037f215c0ffcd76b6c4c24635e",
    "lastState": {},
    "name": "registry-server",
    "ready": true,
    "restartCount": 1,
    "started": true,
    "state": {
      "running": {
        "startedAt": "2024-03-26T04:21:41Z"
      }
    }
  }
]
For the default catalog sources, `imageID()` should instead return the index image ID, which is reported by the `extract-content` init container:
jiazha-mac:~ jiazha$ oc get pods redhat-operators-mpvzm -o=jsonpath={.status.initContainerStatuses[1]} |jq
{
  "containerID": "cri-o://4cd6e1f45e23aadc27b8152126eb2761a37da61c4845017a06bb6f2203659f5c",
  "image": "registry.redhat.io/redhat/redhat-operator-index:v4.15",
  "imageID": "registry.redhat.io/redhat/redhat-operator-index@sha256:19010760d38e1a898867262698e22674d99687139ab47173e2b4665e588635e1",
  "lastState": {},
  "name": "extract-content",
  "ready": true,
  "restartCount": 1,
  "started": false,
  "state": {
    "terminated": {
      "containerID": "cri-o://4cd6e1f45e23aadc27b8152126eb2761a37da61c4845017a06bb6f2203659f5c",
      "exitCode": 0,
      "finishedAt": "2024-03-26T04:21:39Z",
      "reason": "Completed",
      "startedAt": "2024-03-26T04:21:27Z"
    }
  }
}
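The lookup suggested above could be sketched roughly as follows. This is only an illustration, not the actual fix: `Pod`, `PodStatus` and `ContainerStatus` are simplified local stand-ins for the `corev1` types so the snippet compiles on its own, and `indexImageID` is a hypothetical helper name. It assumes the index image is always reported by an init container named `extract-content`, as in the output above, and falls back to the existing behavior otherwise.

```go
package main

import "fmt"

// ContainerStatus, PodStatus and Pod are simplified stand-ins for the
// corev1 types; they carry only the fields this sketch needs.
type ContainerStatus struct {
	Name    string
	ImageID string
}

type PodStatus struct {
	InitContainerStatuses []ContainerStatus
	ContainerStatuses     []ContainerStatus
}

type Pod struct {
	Name   string
	Status PodStatus
}

// extractContentName is the init container name seen in the report's output.
const extractContentName = "extract-content"

// indexImageID is a hypothetical replacement for imageID(): it prefers the
// image ID of the extract-content init container (the index image) and falls
// back to the first regular container, which is what imageID() returns today.
func indexImageID(pod *Pod) string {
	for _, s := range pod.Status.InitContainerStatuses {
		if s.Name == extractContentName && s.ImageID != "" {
			return s.ImageID
		}
	}
	if len(pod.Status.ContainerStatuses) < 1 {
		return ""
	}
	return pod.Status.ContainerStatuses[0].ImageID
}

func main() {
	// Values copied from the pod status shown in this report.
	pod := &Pod{
		Name: "redhat-operators-mpvzm",
		Status: PodStatus{
			InitContainerStatuses: []ContainerStatus{{
				Name:    "extract-content",
				ImageID: "registry.redhat.io/redhat/redhat-operator-index@sha256:19010760d38e1a898867262698e22674d99687139ab47173e2b4665e588635e1",
			}},
			ContainerStatuses: []ContainerStatus{{
				Name:    "registry-server",
				ImageID: "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:965fe452763fd402ca8d8b4a3fdb13587673c8037f215c0ffcd76b6c4c24635e",
			}},
		},
	}
	fmt.Println("current lookup:", pod.Status.ContainerStatuses[0].ImageID)
	fmt.Println("index lookup:  ", indexImageID(pod))
}
```

With a lookup like this, a newly polled pod that pulled a newer index digest would report a different ID than the serving pod, whereas the `registry-server` container's `opm` digest stays the same across polls.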
Version-Release number of selected component (if applicable):
4.15.2
How reproducible:
always
Steps to Reproduce:
1. Install OCP 4.16.0.
2. Wait for the redhat-operators catalog source to be updated.
Actual results:
The redhat-operators catalog source never gets updated.
Expected results:
The default catalog sources should be updated according to their `updateStrategy`.
jiazha-mac:~ jiazha$ oc get catalogsource redhat-operators -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  annotations:
    operatorframework.io/managed-by: marketplace-operator
    target.workload.openshift.io/management: '{"effect": "PreferredDuringScheduling"}'
  creationTimestamp: "2024-03-20T15:48:59Z"
  generation: 1
  name: redhat-operators
  namespace: openshift-marketplace
  resourceVersion: "12217605"
  uid: cc0fc420-c9d8-4c7d-997e-f0893b4c497f
spec:
  displayName: Red Hat Operators
  grpcPodConfig:
    extractContent:
      cacheDir: /tmp/cache
      catalogDir: /configs
    memoryTarget: 30Mi
    nodeSelector:
      kubernetes.io/os: linux
      node-role.kubernetes.io/master: ""
    priorityClassName: system-cluster-critical
    securityContextConfig: restricted
    tolerations:
    - effect: NoSchedule
      key: node-role.kubernetes.io/master
      operator: Exists
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 120
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 120
  icon:
    base64data: ""
    mediatype: ""
  image: registry.redhat.io/redhat/redhat-operator-index:v4.15
  priority: -100
  publisher: Red Hat
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 10m
status:
  connectionState:
    address: redhat-operators.openshift-marketplace.svc:50051
    lastConnect: "2024-03-27T06:35:36Z"
    lastObservedState: READY
  latestImageRegistryPoll: "2024-03-27T10:23:16Z"
  registryService:
    createdAt: "2024-03-20T15:56:03Z"
    port: "50051"
    protocol: grpc
    serviceName: redhat-operators
    serviceNamespace: openshift-marketplace
Additional info:
I also checked `currentPodsWithCorrectImageAndSpec`, but the hash never changes because the pod spec is always the same.
time="2024-03-26T03:22:01Z" level=info msg="of 1 pods matching label selector, 1 have the correct images and matching hash" correctHash=true correctImages=true current-pod.name=redhat-operators-mpvzm current-pod.namespace=openshift-marketplace
time="2024-03-26T03:27:01Z" level=info msg="of 1 pods matching label selector, 1 have the correct images and matching hash" catalogsource.name=redhat-operators catalogsource.namespace=openshift-marketplace correctHash=true correctImages=true current-pod.name=redhat-operators-mpvzm current-pod.namespace=openshift-marketplace id=xW0cW
time="2024-03-26T03:27:01Z" level=info msg="of 1 pods matching label selector, 1 have the correct images and matching hash" catalogsource.name=redhat-operators catalogsource.namespace=openshift-marketplace correctHash=true correctImages=true current-pod.name=redhat-operators-mpvzm current-pod.namespace=openshift-marketplace id=xW0cW
time="2024-03-26T03:27:02Z" level=info msg="of 1 pods matching label selector, 1 have the correct images and matching hash" catalogsource.name=redhat-operators catalogsource.namespace=openshift-marketplace correctHash=true correctImages=true current-pod.name=redhat-operators-mpvzm current-pod.namespace=openshift-marketplace id=vq5VA
time="2024-03-26T03:27:03Z" level=info msg="of 1 pods matching label selector, 1 have the correct images and matching hash" catalogsource.name=redhat-operators catalogsource.namespace=openshift-marketplace correctHash=true correctImages=true current-pod.name=redhat-operators-mpvzm current-pod.namespace=openshift-marketplace id=vq5VA
- clones: OCPBUGS-31438 Default catalog source pod never get updates (Closed)
- is blocked by: OCPBUGS-31438 Default catalog source pod never get updates (Closed)
- links to: RHBA-2024:1887 OpenShift Container Platform 4.15.z bug fix update