-
Bug
-
Resolution: Duplicate
-
Normal
-
None
-
4.13.0
-
No
-
False
-
-
Description of problem:
4.13 MNO: RC4 Cluster operator kube-controller-manager is degraded: GuardControllerDegraded: Missing operand on node master-0...
Version-Release number of selected component (if applicable):
How reproducible:
Intermittent. Tried twice: the first install hit the issue, the second installed successfully.
Steps to Reproduce:
1. Install 4.13.0-rc.4 on MNO (3-node compact cluster), bare metal, using the Agent-based Installer.
2. Create the ISO image with the following directory tree (kni-qe-31-dualstack.bak/):

kni-qe-31-dualstack.bak/
├── agent-config.yaml
├── install-config.yaml
└── openshift
    ├── 00-clean-spare-disk.yaml
    ├── 00-disable-operatorhub.yaml
    ├── 00-kni-lso-catsrc.yaml
    ├── 00-kni-sriov-catsrc.yaml
    ├── 00-redhat-operators-catsrc.yaml
    ├── 98-master-etc-block-connectivity-service.yaml
    ├── 98-master-etc-chrony-conf.yaml
    ├── 98-worker-etc-chrony-conf.yaml
    ├── 99-masters-disable-crio-wipe.yaml
    ├── 99-workers-disable-crio-wipe.yaml
    ├── admin-user-oauth.yaml
    ├── admin-user-secret.yaml
    ├── elasticsearch-namespace.yaml
    ├── elasticsearch-operatorgroup.yaml
    ├── elasticsearch-subscription.yaml
    ├── load-kernel-modules-master.yaml
    ├── load-kernel-modules-worker.yaml
    ├── localstorage-namespace.yaml
    ├── localstorage-operatorgroup.yaml
    ├── localstorage-subscription.yaml
    ├── logging-namespace.yaml
    ├── logging-operatorgroup.yaml
    ├── logging-subscription.yaml
    ├── odf-namespace.yaml
    ├── odf-operatorgroup.yaml
    ├── odf-subscription.yaml
    ├── sriov-namespace.yaml
    ├── sriov-operatorgroup.yaml
    └── sriov-subscription.yaml

INFO cluster bootstrap is complete
DEBUG Still waiting for the cluster to initialize: Multiple errors are preventing progress:
DEBUG * Cluster operators authentication, image-registry, ingress, insights, kube-apiserver, machine-api, monitoring, openshift-apiserver, openshift-samples, operator-lifecycle-manager-packageserver are not available
DEBUG * Could not update imagestream "openshift/driver-toolkit" (582 of 841): the server is down or not responding
DEBUG * Could not update oauthclient "console" (525 of 841): the server does not recognize this resource, check extension API servers
DEBUG * Could not update role "openshift-console-operator/prometheus-k8s" (758 of 841): resource may have been deleted
DEBUG * Could not update role "openshift-console/prometheus-k8s" (761 of 841): resource may have been deleted
DEBUG Still waiting for the cluster to initialize: Working towards 4.13.0-rc.4
DEBUG Still waiting for the cluster to initialize: Working towards 4.13.0-rc.4: 576 of 841 done (68% complete)
DEBUG Route found in openshift-console namespace: console
DEBUG OpenShift console route is admitted
DEBUG Still waiting for the cluster to initialize: Cluster operators authentication, console, monitoring are not available
DEBUG Still waiting for the cluster to initialize: Cluster operators authentication, monitoring are not available
DEBUG Still waiting for the cluster to initialize: Cluster operator authentication is not available
DEBUG Still waiting for the cluster to initialize: Cluster operator kube-controller-manager is degraded
DEBUG Still waiting for the cluster to initialize: Cluster operator kube-controller-manager is degraded
Actual results:
Must-Gather: http://10.1.101.1/4.13/must-gather/kube-controller-manager-degraded.tar.gz

Reprinting Cluster State:
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
ClusterID: 99c1a3fd-fc05-49fc-9d98-630f028c79ba
ClusterVersion: Installing "4.13.0-rc.4" for About an hour: Error while reconciling 4.13.0-rc.4: the cluster operator kube-controller-manager is degraded
ClusterOperators:
    clusteroperator/kube-controller-manager is degraded because GuardControllerDegraded: Missing operand on node master-0
    MissingStaticPodControllerDegraded: static pod lifecycle failure - static pod: "kube-controller-manager" in namespace: "openshift-kube-controller-manager" for revision: 7 on node: "master-1" didn't show up, waited: 3m0s
    StaticPodsDegraded: pod/kube-controller-manager-master-1 container "cluster-policy-controller" is terminated: Completed:
    StaticPodsDegraded: pod/kube-controller-manager-master-1 container "kube-controller-manager" is terminated: Completed:
    StaticPodsDegraded: pod/kube-controller-manager-master-1 container "kube-controller-manager-cert-syncer" is terminated: Error: st *v1.ConfigMap: Get "https://localhost:6443/api/v1/namespaces/openshift-kube-controller-manager/configmaps?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
    StaticPodsDegraded: W0426 17:42:12.106089 1 reflector.go:424] k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: failed to list *v1.Secret: Get "https://localhost:6443/api/v1/namespaces/openshift-kube-controller-manager/secrets?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
    StaticPodsDegraded: E0426 17:42:12.106113 1 reflector.go:140] k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: Failed to watch *v1.Secret: failed to list *v1.Secret: Get "https://localhost:6443/api/v1/namespaces/openshift-kube-controller-manager/secrets?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
    StaticPodsDegraded: W0426 17:42:44.012955 1 reflector.go:424] k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: failed to list *v1.ConfigMap: Get "https://localhost:6443/api/v1/namespaces/openshift-kube-controller-manager/configmaps?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
    StaticPodsDegraded: E0426 17:42:44.012985 1 reflector.go:140] k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Get "https://localhost:6443/api/v1/namespaces/openshift-kube-controller-manager/configmaps?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
    StaticPodsDegraded: W0426 17:43:09.168250 1 reflector.go:424] k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: failed to list *v1.Secret: Get "https://localhost:6443/api/v1/namespaces/openshift-kube-controller-manager/secrets?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
    StaticPodsDegraded: E0426 17:43:09.168282 1 reflector.go:140] k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: Failed to watch *v1.Secret: failed to list *v1.Secret: Get "https://localhost:6443/api/v1/namespaces/openshift-kube-controller-manager/secrets?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
    StaticPodsDegraded:
    StaticPodsDegraded: pod/kube-controller-manager-master-1 container "kube-controller-manager-recovery-controller" is terminated: Completed:
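For triage, it can help to see which degraded conditions dominate the operator message above. The following is a minimal sketch, not part of the original report: it assumes the message has been saved as text with one condition per line, and the function name `count_degraded_reasons` is hypothetical.

```python
import re
from collections import Counter

# First "<Something>Degraded:" token on a line, e.g. "StaticPodsDegraded:".
REASON = re.compile(r"\b(\w+Degraded):")

def count_degraded_reasons(message: str) -> Counter:
    """Tally lines of an operator status message by their condition prefix.

    Assumption: one condition per line, as in the cluster state summary.
    Lines without a "...Degraded:" token are ignored.
    """
    counts = Counter()
    for line in message.splitlines():
        match = REASON.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Applied to the summary in this report, such a tally would show that most lines come from StaticPodsDegraded, which points at the static pod on master-1 and its x509 errors rather than at the guard pod on master-0.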
Expected results:
Successful install
Additional info:
oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       False         18m     Error while reconciling 4.13.0-rc.4: the cluster operator kube-controller-manager is degraded

[kni@registry.kni-qe-31 ocp-edge-qe-venv]$ oc get co
NAME                                       VERSION       AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.13.0-rc.4   True        False         False      23m
baremetal                                  4.13.0-rc.4   True        False         False      43m
cloud-controller-manager                   4.13.0-rc.4   True        False         False      58m
cloud-credential                           4.13.0-rc.4   True        False         False      68m
cluster-autoscaler                         4.13.0-rc.4   True        False         False      43m
config-operator                            4.13.0-rc.4   True        False         False      44m
console                                    4.13.0-rc.4   True        False         False      33m
control-plane-machine-set                  4.13.0-rc.4   True        False         False      44m
csi-snapshot-controller                    4.13.0-rc.4   True        False         False      44m
dns                                        4.13.0-rc.4   True        False         False      43m
etcd                                       4.13.0-rc.4   True        False         False      42m
image-registry                             4.13.0-rc.4   True        False         False      34m
ingress                                    4.13.0-rc.4   True        False         False      37m
insights                                   4.13.0-rc.4   True        False         False      37m
kube-apiserver                             4.13.0-rc.4   True        False         False      39m
kube-controller-manager                    4.13.0-rc.4   True        True          True       41m     GuardControllerDegraded: Missing operand on node master-0...
kube-scheduler                             4.13.0-rc.4   True        False         False      40m
kube-storage-version-migrator              4.13.0-rc.4   True        False         False      44m
machine-api                                4.13.0-rc.4   True        False         False      38m
machine-approver                           4.13.0-rc.4   True        False         False      43m
machine-config                             4.13.0-rc.4   True        False         False      43m
marketplace                                4.13.0-rc.4   True        False         False      43m
monitoring                                 4.13.0-rc.4   True        False         False      32m
network                                    4.13.0-rc.4   True        False         False      43m
node-tuning                                4.13.0-rc.4   True        False         False      43m
openshift-apiserver                        4.13.0-rc.4   True        False         False      37m
openshift-controller-manager               4.13.0-rc.4   True        False         False      40m
openshift-samples                          4.13.0-rc.4   True        False         False      36m
operator-lifecycle-manager                 4.13.0-rc.4   True        False         False      43m
operator-lifecycle-manager-catalog         4.13.0-rc.4   True        False         False      43m
operator-lifecycle-manager-packageserver   4.13.0-rc.4   True        False         False      36m
service-ca                                 4.13.0-rc.4   True        False         False      44m
storage                                    4.13.0-rc.4   True        False         False      44m
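With more than thirty operators listed, the single degraded one is easy to miss. The sketch below shows one way to pull the degraded rows out of saved `oc get co` text. It is illustrative only: the function name `degraded_operators` is hypothetical, and it assumes every row has a non-empty VERSION column and one line per operator, as in the listing above.

```python
from typing import List, Tuple

def degraded_operators(table: str) -> List[Tuple[str, str]]:
    """Return (name, message) for each row whose DEGRADED column is "True".

    Assumption: whitespace-separated columns in the order
    NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE,
    with MESSAGE optional and possibly containing spaces.
    """
    lines = [ln for ln in table.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    degraded_idx = header.index("DEGRADED")
    found = []
    for line in lines[1:]:
        # Split into at most 7 fields so the free-text MESSAGE stays intact.
        cols = line.split(None, 6)
        if len(cols) > degraded_idx and cols[degraded_idx] == "True":
            message = cols[6] if len(cols) > 6 else ""
            found.append((cols[0], message))
    return found
```

For the listing in this report, the only row with DEGRADED set to True is kube-controller-manager, whose message begins with GuardControllerDegraded.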