-
Bug
-
Resolution: Not a Bug
-
Minor
-
None
-
4.11.z
-
Quality / Stability / Reliability
-
False
Rejected
Description of problem:
While deploying 3510 SNOs with ACM and ZTP, 3 of the 3510 SNOs failed to complete installation. On two of them the ClusterVersion reports "the cluster operator operator-lifecycle-manager has not yet successfully rolled out"; on the third it reports "some cluster operators have not yet rolled out".
# oc --kubeconfig=/root/hv-vm/sno/manifests/sno00514/kubeconfig get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          2d20h   Unable to apply 4.11.19: the cluster operator operator-lifecycle-manager has not yet successfully rolled out
# oc --kubeconfig=/root/hv-vm/sno/manifests/sno02391/kubeconfig get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          2d16h   Unable to apply 4.11.19: the cluster operator operator-lifecycle-manager has not yet successfully rolled out
# oc --kubeconfig=/root/hv-vm/sno/manifests/sno03377/kubeconfig get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          2d15h   Unable to apply 4.11.19: some cluster operators have not yet rolled out
Version-Release number of selected component (if applicable):
SNO OCP: 4.11.19
ACM: 2.7.0-DOWNSTREAM-2022-12-15-04-55-07
How reproducible:
3 out of 3510 installs, but these account for 3 of the 6 OCP install failures.
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
Additional info:
# oc --kubeconfig=/root/hv-vm/sno/manifests/sno00514/kubeconfig get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.11.19   True        False         False      34h
baremetal                                  4.11.19   True        False         False      2d19h
cloud-controller-manager                   4.11.19   True        False         False      2d19h
cloud-credential                           4.11.19   True        False         False      2d20h
cluster-autoscaler                         4.11.19   True        False         False      2d19h
config-operator                            4.11.19   True        False         False      2d19h
console                                    4.11.19   True        False         False      2d19h
csi-snapshot-controller                    4.11.19   True        False         False      2d19h
dns                                        4.11.19   True        False         False      2d19h
etcd                                       4.11.19   True        False         False      2d19h
image-registry                             4.11.19   True        False         False      2d19h
ingress                                    4.11.19   True        False         False      2d19h
insights                                   4.11.19   True        False         False      46s
kube-apiserver                             4.11.19   True        False         False      2d19h
kube-controller-manager                    4.11.19   True        False         False      2d19h
kube-scheduler                             4.11.19   True        False         False      2d19h
kube-storage-version-migrator              4.11.19   True        False         False      2d19h
machine-api                                4.11.19   True        False         False      2d19h
machine-approver                           4.11.19   True        False         False      2d19h
machine-config                             4.11.19   True        False         False      2d19h
marketplace                                4.11.19   True        False         False      2d19h
monitoring                                 4.11.19   True        False         False      2d19h
network                                    4.11.19   True        False         False      2d19h
node-tuning                                4.11.19   True        False         False      2d19h
openshift-apiserver                        4.11.19   True        False         False      45h
openshift-controller-manager               4.11.19   True        False         False      44h
openshift-samples                          4.11.19   True        False         False      2d19h
operator-lifecycle-manager                           False       True          True       2d19h
operator-lifecycle-manager-catalog         4.11.19   True        False         False      2d19h
operator-lifecycle-manager-packageserver             False       True          False      2d19h
service-ca                                 4.11.19   True        False         False      2d19h
storage                                    4.11.19   True        False         False      2d19h
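When triaging many SNOs, the table above can be filtered mechanically. The following is a minimal Python sketch, assuming the text of `oc get co` has already been captured into a string; the function name `unavailable_operators` is hypothetical. Because the VERSION column is blank for operators that have not rolled out, the sketch does not rely on column position and instead treats the first True/False/Unknown token in each row as the AVAILABLE value.

```python
def unavailable_operators(table_text):
    """Return names of cluster operators whose AVAILABLE column is not "True".

    Assumes `table_text` is the plain-text output of `oc get co`.
    """
    flagged = []
    for line in table_text.strip().splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and echoed shell prompts.
        if not fields or fields[0] == "NAME" or fields[0].startswith("#"):
            continue
        # VERSION may be empty, so find AVAILABLE as the first status token.
        states = [f for f in fields[1:] if f in ("True", "False", "Unknown")]
        if states and states[0] != "True":
            flagged.append(fields[0])
    return flagged


# Rows copied from the sno00514 output above.
sample = """\
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
marketplace 4.11.19 True False False 2d19h
operator-lifecycle-manager False True True 2d19h
operator-lifecycle-manager-catalog 4.11.19 True False False 2d19h
operator-lifecycle-manager-packageserver False True False 2d19h
"""

print(unavailable_operators(sample))
```

For this cluster only the two OLM-related operators are flagged, which matches the ClusterVersion message.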
# oc --kubeconfig=/root/hv-vm/sno/manifests/sno00514/kubeconfig get co operator-lifecycle-manager -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
  creationTimestamp: "2022-12-17T00:46:21Z"
  generation: 1
  name: operator-lifecycle-manager
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 536eb0eb-2345-4b01-8be2-6c2afff83139
  resourceVersion: "12532"
  uid: 7d5efc98-940c-468b-b3d6-74eb51519f24
spec: {}
status:
  conditions:
  - lastTransitionTime: "2022-12-17T01:17:46Z"
    message: Waiting to see update 0.19.0 succeed
    status: "True"
    type: Progressing
  - lastTransitionTime: "2022-12-17T01:17:46Z"
    status: "False"
    type: Available
  - lastTransitionTime: "2022-12-17T01:17:46Z"
    message: Waiting for updates to take effect
    status: "True"
    type: Degraded
  - lastTransitionTime: "2022-12-17T01:17:46Z"
    message: Waiting for updates to take effect
    status: "False"
    type: Upgradeable
  extension: null
  relatedObjects:
  - group: operators.coreos.com
    name: packageserver
    namespace: openshift-operator-lifecycle-manager
    resource: clusterserviceversions
# oc --kubeconfig=/root/hv-vm/sno/manifests/sno00514/kubeconfig get co operator-lifecycle-manager-packageserver -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
  creationTimestamp: "2022-12-17T00:46:22Z"
  generation: 1
  name: operator-lifecycle-manager-packageserver
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 536eb0eb-2345-4b01-8be2-6c2afff83139
  resourceVersion: "10802"
  uid: 1287defc-02a6-4279-887c-fff32f238858
spec: {}
status:
  conditions:
  - lastTransitionTime: "2022-12-17T01:17:38Z"
    status: "False"
    type: Degraded
  - lastTransitionTime: "2022-12-17T01:17:38Z"
    status: "False"
    type: Available
  - lastTransitionTime: "2022-12-17T01:17:38Z"
    message: waiting for events - source=operator-lifecycle-manager-packageserver
    status: "True"
    type: Progressing
  extension: null
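The status conditions of the two ClusterOperator objects above can be compared side by side with a small helper. This is only a sketch, assuming the object is fetched with `-o json` instead of `-o yaml` so that the Python standard library can parse it; `condition_summary` and `is_healthy` are hypothetical names, and the sample is a trimmed copy of the packageserver status shown above.

```python
import json


def condition_summary(cluster_operator_json):
    """Map each condition type to (status, message) for one ClusterOperator."""
    obj = json.loads(cluster_operator_json)
    conditions = obj.get("status", {}).get("conditions", [])
    return {c["type"]: (c["status"], c.get("message", "")) for c in conditions}


def is_healthy(summary):
    """Treat an operator as healthy only if Available=True and Degraded=False."""
    available = summary.get("Available", ("Unknown", ""))[0]
    degraded = summary.get("Degraded", ("Unknown", ""))[0]
    return available == "True" and degraded == "False"


# Trimmed from the operator-lifecycle-manager-packageserver status above.
sample = json.dumps({
    "kind": "ClusterOperator",
    "metadata": {"name": "operator-lifecycle-manager-packageserver"},
    "status": {"conditions": [
        {"type": "Degraded", "status": "False"},
        {"type": "Available", "status": "False"},
        {"type": "Progressing", "status": "True",
         "message": "waiting for events - source=operator-lifecycle-manager-packageserver"},
    ]},
})

summary = condition_summary(sample)
print(is_healthy(summary))
print(summary["Progressing"][1])
```

Here the packageserver operator is neither Available nor Degraded; its Progressing message shows it is still waiting for events.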
# oc --kubeconfig=/root/hv-vm/sno/manifests/sno00514/kubeconfig get po -n openshift-operator-lifecycle-manager
NAME                                      READY   STATUS      RESTARTS         AGE
catalog-operator-6d48dd9c68-p2v9n         1/1     Running     0                2d19h
collect-profiles-27858030-rr9xs           0/1     Completed   0                37m
collect-profiles-27858045-hmvxx           0/1     Completed   0                22m
collect-profiles-27858060-jpzs5           0/1     Completed   0                7m49s
olm-operator-65b5787545-2zfjm             1/1     Running     1 (2d19h ago)    2d19h
package-server-manager-84d5f4f6c9-gxqrq   1/1     Running     2 (2d19h ago)    2d19h
# oc --kubeconfig=/root/hv-vm/sno/manifests/sno00514/kubeconfig logs -n openshift-operator-lifecycle-manager olm-operator-65b5787545-2zfjm
...
{"level":"error","ts":1671480045.2537725,"msg":"Reconciler error","controller":"clusteroperator","controllerGroup":"config.openshift.io","controllerKind":"ClusterOperator","operatorCondition":{"name":"packageserver","namespace":"openshift-operator-lifecycle-manager"},"namespace":"openshift-operator-lifecycle-manager","name":"packageserver","reconcileID":"40bb68eb-f1e5-4321-8190-1f864518d090","error":"Deployment.apps \"packageserver\" not found","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234"}
{"level":"error","ts":1671481045.2545419,"msg":"Reconciler error","controller":"clusteroperator","controllerGroup":"config.openshift.io","controllerKind":"ClusterOperator","operatorCondition":{"name":"packageserver","namespace":"openshift-operator-lifecycle-manager"},"namespace":"openshift-operator-lifecycle-manager","name":"packageserver","reconcileID":"e30a729d-3fda-4463-ba6c-87a0888bae2a","error":"Deployment.apps \"packageserver\" not found","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234"}
{"level":"error","ts":1671482045.2550378,"msg":"Reconciler error","controller":"clusteroperator","controllerGroup":"config.openshift.io","controllerKind":"ClusterOperator","operatorCondition":{"name":"packageserver","namespace":"openshift-operator-lifecycle-manager"},"namespace":"openshift-operator-lifecycle-manager","name":"packageserver","reconcileID":"6f033261-1944-4343-acf9-905881c2108f","error":"Deployment.apps \"packageserver\" not found","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234"}
{"level":"error","ts":1671483045.261434,"msg":"Reconciler error","controller":"clusteroperator","controllerGroup":"config.openshift.io","controllerKind":"ClusterOperator","operatorCondition":{"name":"packageserver","namespace":"openshift-operator-lifecycle-manager"},"namespace":"openshift-operator-lifecycle-manager","name":"packageserver","reconcileID":"38d5168a-32a6-4537-9bcd-37566950d98a","error":"Deployment.apps \"packageserver\" not found","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234"}
{"level":"error","ts":1671484045.2621686,"msg":"Reconciler error","controller":"clusteroperator","controllerGroup":"config.openshift.io","controllerKind":"ClusterOperator","operatorCondition":{"name":"packageserver","namespace":"openshift-operator-lifecycle-manager"},"namespace":"openshift-operator-lifecycle-manager","name":"packageserver","reconcileID":"8739d7e8-8e97-4c51-88f7-24047528532b","error":"Deployment.apps \"packageserver\" not found","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234"}