Bug
Resolution: Done
4.11.z
Quality / Stability / Reliability
Critical
Proposed
CNF RAN Sprint 231
Description of problem:
I am running TALM 4.11.3 on a 4.11 hub cluster. I have two spoke clusters, deployed using GitOps ZTP, running 4.10.45. I created upgrade policies for both OCP and the operators to move them to 4.11. The target OCP image in the policies is 4.11.21.
I created a CGU for the upgrade with enable set to false:
apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: update-cgu
  namespace: default
spec:
  clusters:
  - cnfdf19
  - cnfdf29
  enable: false
  managedPolicies:
  - du-upgrade-platform-upgrade-prep
  - du-upgrade-platform-upgrade
  - common-config-policy
  - common-subscriptions-policy
  preCaching: false
  remediationStrategy:
    maxConcurrency: 1
    timeout: 360
When I enabled preCaching, the pre-cache pod on the spoke clusters failed:
containerStatuses:
- containerID: cri-o://7e1d7a1912440bb105d8365f8ec548bfddca072c7d372e80c8deacaac0e8d3e9
  image: registry.redhat.io/openshift4/topology-aware-lifecycle-manager-precache-rhel8@sha256:40249617608848518f9cd2db99f73a6f72642b28b273b1d8f34616ff1f16983b
  imageID: registry.redhat.io/openshift4/topology-aware-lifecycle-manager-precache-rhel8@sha256:40249617608848518f9cd2db99f73a6f72642b28b273b1d8f34616ff1f16983b
  lastState: {}
  name: pre-cache-container
  ready: false
  restartCount: 0
  started: false
  state:
    terminated:
      containerID: cri-o://7e1d7a1912440bb105d8365f8ec548bfddca072c7d372e80c8deacaac0e8d3e9
      exitCode: 139
      finishedAt: "2023-01-24T22:01:31Z"
      reason: Error
      startedAt: "2023-01-24T22:01:31Z"
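A note on the exitCode above: by the common shell and container-runtime convention, an exit code greater than 128 means the process was terminated by a signal whose number is the code minus 128, so 139 would correspond to signal 11 (SIGSEGV on Linux). The report does not confirm the cause of the crash; the sketch below only illustrates that decoding, and the function name decode_exit_code is hypothetical, not part of TALM or Kubernetes.

```python
import signal


def decode_exit_code(exit_code: int) -> str:
    """Describe an exit code using the 128 + signal-number convention.

    Hypothetical helper for illustration only.
    """
    if exit_code > 128:
        signum = exit_code - 128
        try:
            # signal.Signals maps a signal number to its symbolic name.
            return signal.Signals(signum).name
        except ValueError:
            # The number does not correspond to a signal on this platform.
            return f"unknown signal {signum}"
    return f"exited normally with status {exit_code}"


# On Linux, signal 11 is SIGSEGV, so 139 decodes to "SIGSEGV".
print(decode_exit_code(139))
```

If the decoding holds here, it suggests the pre-cache container crashed at startup rather than failing during the pre-cache workload itself, which is consistent with startedAt and finishedAt carrying the same timestamp.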
Version-Release number of selected component (if applicable):
4.11.3
Precache container version: registry.redhat.io/openshift4/topology-aware-lifecycle-manager-precache-rhel8@sha256:40249617608848518f9cd2db99f73a6f72642b28b273b1d8f34616ff1f16983b
How reproducible:
100%
Steps to Reproduce:
1. See the description above.
Actual results:
The pre-cache pod moves to Error. The TALM status moves to UnrecoverableError for the clusters (cnfdf19 also went to UnrecoverableError shortly after this status was captured):
status:
  cnfdf19: Active
  cnfdf29: UnrecoverableError
Expected results:
Precaching succeeds
Additional info:
I used a ConfigMap on the hub cluster, in the same namespace as the CGU, to test with newer 4.11 and 4.12 precache images. Both failed. When I used the ConfigMap to run the latest 4.10 precache container image, the precaching pods ran as expected.
ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-group-upgrade-overrides
  namespace: default
data:
  ## 4.10 image pushed to my quay
  precache.image: quay.io/imiller/testrepo1:0.10
The valid image: https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=2294306
registry-proxy.engineering.redhat.com/rh-osbs/openshift-topology-aware-lifecycle-operator-precache-rhel8@sha256:a8cb52e5c15c8a530e175ab09c75853c921509ecf3eed707dc7d307ebdaf73cd
clones: OCPBUGS-6623 TALM 4.11 pre-cache fails on 4.10 cluster (Closed)
is cloned by: OCPBUGS-6769 TALM 4.11 pre-cache fails on 4.10 cluster (Closed)