-
Bug
-
Resolution: Obsolete
-
Undefined
-
4.13, 4.12
-
Quality / Stability / Reliability
-
False
Description of problem:
While debugging the ocp-42855 test failure, the hostedcluster Degraded condition was found to be True.
Version-Release number of selected component (if applicable):
quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
How reproducible:
Follow the ocp-42855 test steps.
Steps to Reproduce:
1. Create a basic hosted cluster using the hypershift tool.
2. Check the hostedcluster conditions.
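Step 2 can be scripted. The following is a minimal sketch, assuming the hostedcluster object has been exported as JSON (for example with `oc get hostedcluster mihuanghy -n clusters -o json`) and that its conditions live under `status.conditions`, as is conventional for Kubernetes resources. The helper name `find_condition` is hypothetical, and the sample document is trimmed to the single condition quoted in this report.

```python
import json
from typing import Optional


def find_condition(hosted_cluster: dict, cond_type: str) -> Optional[dict]:
    """Return the first status condition of the given type, or None if absent."""
    conditions = hosted_cluster.get("status", {}).get("conditions", [])
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond
    return None


# Sample trimmed to the one condition quoted in this report; a real
# hostedcluster object carries many more fields and conditions.
sample = json.loads("""
{
  "status": {
    "conditions": [
      {
        "lastTransitionTime": "2022-12-31T18:45:28Z",
        "message": "[certified-operators-catalog deployment has 1 unavailable replicas, redhat-marketplace-catalog deployment has 1 unavailable replicas]",
        "observedGeneration": 3,
        "reason": "UnavailableReplicas",
        "status": "True",
        "type": "Degraded"
      }
    ]
  }
}
""")

degraded = find_condition(sample, "Degraded")
if degraded is not None:
    print(degraded["status"], degraded["reason"])
```

Returning None for a missing condition keeps "not reported" distinct from "reported as False", which matters when the expected result is Degraded being False.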
Actual results:
[hmx@ovpn-12-45 hypershift]$ oc get pods -n clusters-mihuanghy
NAME                                                  READY   STATUS             RESTARTS         AGE
aws-ebs-csi-driver-controller-9c46694f-mqrlc          7/7     Running            0                55m
aws-ebs-csi-driver-operator-5d7867bc9f-hqzd5          1/1     Running            0                55m
capi-provider-6df855dbb5-tcmvq                        2/2     Running            0                58m
catalog-operator-7544b8d6d8-dk4hh                     2/2     Running            0                57m
certified-operators-catalog-7f8f6598b5-2blv4          0/1     CrashLoopBackOff   15 (4m20s ago)   57m
cloud-network-config-controller-545fcfc797-mgszj      3/3     Running            0                55m
cluster-api-54c7f7c477-kgvzn                          1/1     Running            0                58m
cluster-autoscaler-658756f99-vr2hk                    1/1     Running            0                58m
cluster-image-registry-operator-84d84dbc9f-zpcsq      3/3     Running            0                57m
cluster-network-operator-9b6985cc8-sd7d7              1/1     Running            0                57m
cluster-node-tuning-operator-65c8f6fbb9-xzpws         1/1     Running            0                57m
cluster-policy-controller-b5c76cf58-b4rth             1/1     Running            0                57m
cluster-storage-operator-7474f76c99-9chl7             1/1     Running            0                57m
cluster-version-operator-646d97ccc9-l72m5             1/1     Running            0                57m
community-operators-catalog-774fdb48fc-z6s4d          1/1     Running            0                57m
control-plane-operator-5bc8c4c996-4nz8c               2/2     Running            0                58m
csi-snapshot-controller-5b7d6bb685-vf8rf              1/1     Running            0                55m
csi-snapshot-controller-operator-6f74db85c6-89bts     1/1     Running            0                57m
csi-snapshot-webhook-57c5bd7f85-lqnwf                 1/1     Running            0                55m
dns-operator-767c5bbdd8-rb7fl                         1/1     Running            0                57m
etcd-0                                                2/2     Running            0                58m
hosted-cluster-config-operator-88b9d49b7-2gvbt        1/1     Running            0                57m
ignition-server-949d9fd8c-cgtxb                       1/1     Running            0                58m
ingress-operator-5c6f5d4f48-gh7fl                     3/3     Running            0                57m
konnectivity-agent-79c5ff9585-pqctc                   1/1     Running            0                58m
konnectivity-server-65956d468c-lpwfv                  1/1     Running            0                58m
kube-apiserver-d9f887c4b-xwdcx                        5/5     Running            0                58m
kube-controller-manager-64b6f757f9-6qszq              2/2     Running            0                52m
kube-scheduler-58ffcdf789-fch2n                       1/1     Running            0                57m
machine-approver-559d66d4d6-2v64w                     1/1     Running            0                58m
multus-admission-controller-8695985fbc-hjtqb          2/2     Running            0                55m
oauth-openshift-6b9695fc7f-pf4j6                      2/2     Running            0                55m
olm-operator-bf694b84-gvz6x                           2/2     Running            0                57m
openshift-apiserver-55c69bc497-x8bft                  2/2     Running            0                52m
openshift-controller-manager-8597c66d58-jb7w2         1/1     Running            0                57m
openshift-oauth-apiserver-674cd6df6d-ckg55            1/1     Running            0                57m
openshift-route-controller-manager-76d78f897c-9mfmj   1/1     Running            0                57m
ovnkube-master-0                                      7/7     Running            0                55m
packageserver-7988d8ddfc-wnh6l                        2/2     Running            0                57m
redhat-marketplace-catalog-77547cc685-hnh65           0/1     CrashLoopBackOff   15 (4m15s ago)   57m
redhat-operators-catalog-7784d45f54-58lgg             1/1     Running            0                57m

{
  "lastTransitionTime": "2022-12-31T18:45:28Z",
  "message": "[certified-operators-catalog deployment has 1 unavailable replicas, redhat-marketplace-catalog deployment has 1 unavailable replicas]",
  "observedGeneration": 3,
  "reason": "UnavailableReplicas",
  "status": "True",
  "type": "Degraded"
},
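The Degraded message names the deployments that have unavailable replicas, so it can be mapped back to the crash-looping pods programmatically. A minimal sketch follows; the function name `unavailable_deployments` is hypothetical, and the regular expression is inferred only from the single message quoted in this report, so the wording may differ in other releases.

```python
import re

# Pattern inferred from the one message in this report:
#   "<name> deployment has <N> unavailable replicas"
_UNAVAILABLE = re.compile(r"([A-Za-z0-9.-]+) deployment has (\d+) unavailable replicas")


def unavailable_deployments(message: str) -> dict:
    """Map each deployment named in the message to its unavailable replica count."""
    return {name: int(count) for name, count in _UNAVAILABLE.findall(message)}


message = (
    "[certified-operators-catalog deployment has 1 unavailable replicas, "
    "redhat-marketplace-catalog deployment has 1 unavailable replicas]"
)

for name, count in unavailable_deployments(message).items():
    print(f"{name}: {count} unavailable")
```

The two names returned correspond to the two pods shown in CrashLoopBackOff in the pod listing.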
Expected results:
Degraded is False
Additional info:
$ oc describe pod certified-operators-catalog-7f8f6598b5-2blv4 -n clusters-mihuanghy
Name:                 certified-operators-catalog-7f8f6598b5-2blv4
Namespace:            clusters-mihuanghy
Priority:             100000000
Priority Class Name:  hypershift-control-plane
Node:                 ip-10-0-202-149.us-east-2.compute.internal/10.0.202.149
Start Time:           Sun, 01 Jan 2023 02:47:03 +0800
Labels:               app=certified-operators-catalog
                      hypershift.openshift.io/control-plane-component=certified-operators-catalog
                      hypershift.openshift.io/hosted-control-plane=clusters-mihuanghy
                      olm.catalogSource=certified-operators
                      pod-template-hash=7f8f6598b5
Annotations:          hypershift.openshift.io/release-image: quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
                      k8s.v1.cni.cncf.io/network-status: [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.131.0.38" ], "default": true, "dns": {} }]
                      k8s.v1.cni.cncf.io/networks-status: [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.131.0.38" ], "default": true, "dns": {} }]
                      openshift.io/scc: restricted-v2
                      seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status:               Running
IP:                   10.131.0.38
IPs:
  IP:           10.131.0.38
Controlled By:  ReplicaSet/certified-operators-catalog-7f8f6598b5
Containers:
  registry:
    Container ID:   cri-o://f32b8d4c31b729c1b7deef0da622ddd661d840428aa4847968b1b2b3bf76b6cf
    Image:          registry.redhat.io/redhat/certified-operator-index:v4.11
    Image ID:       registry.redhat.io/redhat/certified-operator-index@sha256:93f667597eee33b9bdbc9a61af60978b414b6f6df8e7c5f496c4298c1dfe9b62
    Port:           50051/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sun, 01 Jan 2023 03:39:44 +0800
      Finished:     Sun, 01 Jan 2023 03:39:44 +0800
    Ready:          False
    Restart Count:  15
    Requests:
      cpu:        10m
      memory:     160Mi
    Liveness:     exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:    exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
    Startup:      exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
    Environment:  <none>
    Mounts:       <none>
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:            <none>
QoS Class:          Burstable
Node-Selectors:     <none>
Tolerations:        hypershift.openshift.io/cluster=clusters-mihuanghy:NoSchedule
                    hypershift.openshift.io/control-plane=true:NoSchedule
                    node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                    node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                    node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age                    From               Message
  ----     ------          ----                   ----               -------
  Normal   Scheduled       54m                    default-scheduler  Successfully assigned clusters-mihuanghy/certified-operators-catalog-7f8f6598b5-2blv4 to ip-10-0-202-149.us-east-2.compute.internal
  Normal   AddedInterface  53m                    multus             Add eth0 [10.131.0.38/23] from openshift-sdn
  Normal   Pulling         53m                    kubelet            Pulling image "registry.redhat.io/redhat/certified-operator-index:v4.11"
  Normal   Pulled          53m                    kubelet            Successfully pulled image "registry.redhat.io/redhat/certified-operator-index:v4.11" in 40.628843349s
  Normal   Pulled          52m (x3 over 53m)      kubelet            Container image "registry.redhat.io/redhat/certified-operator-index:v4.11" already present on machine
  Normal   Created         52m (x4 over 53m)      kubelet            Created container registry
  Normal   Started         52m (x4 over 53m)      kubelet            Started container registry
  Warning  BackOff         3m59s (x256 over 53m)  kubelet            Back-off restarting failed container

$ oc describe pod redhat-marketplace-catalog-77547cc685-hnh65 -n clusters-mihuanghy
Name:                 redhat-marketplace-catalog-77547cc685-hnh65
Namespace:            clusters-mihuanghy
Priority:             100000000
Priority Class Name:  hypershift-control-plane
Node:                 ip-10-0-202-149.us-east-2.compute.internal/10.0.202.149
Start Time:           Sun, 01 Jan 2023 02:47:03 +0800
Labels:               app=redhat-marketplace-catalog
                      hypershift.openshift.io/control-plane-component=redhat-marketplace-catalog
                      hypershift.openshift.io/hosted-control-plane=clusters-mihuanghy
                      olm.catalogSource=redhat-marketplace
                      pod-template-hash=77547cc685
Annotations:          hypershift.openshift.io/release-image: quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
                      k8s.v1.cni.cncf.io/network-status: [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.131.0.40" ], "default": true, "dns": {} }]
                      k8s.v1.cni.cncf.io/networks-status: [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.131.0.40" ], "default": true, "dns": {} }]
                      openshift.io/scc: restricted-v2
                      seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status:               Running
IP:                   10.131.0.40
IPs:
  IP:           10.131.0.40
Controlled By:  ReplicaSet/redhat-marketplace-catalog-77547cc685
Containers:
  registry:
    Container ID:   cri-o://7afba8993dac8f1c07a2946d8b791def3b0c80ce62d5d6160770a5a9990bf922
    Image:          registry.redhat.io/redhat/redhat-marketplace-index:v4.11
    Image ID:       registry.redhat.io/redhat/redhat-marketplace-index@sha256:074498ac11b5691ba8975e8f63fa04407ce11bb035dde0ced2f439d7a4640510
    Port:           50051/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sun, 01 Jan 2023 03:39:49 +0800
      Finished:     Sun, 01 Jan 2023 03:39:49 +0800
    Ready:          False
    Restart Count:  15
    Requests:
      cpu:        10m
      memory:     340Mi
    Liveness:     exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:    exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
    Startup:      exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
    Environment:  <none>
    Mounts:       <none>
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:            <none>
QoS Class:          Burstable
Node-Selectors:     <none>
Tolerations:        hypershift.openshift.io/cluster=clusters-mihuanghy:NoSchedule
                    hypershift.openshift.io/control-plane=true:NoSchedule
                    node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                    node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                    node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age                  From               Message
  ----     ------          ----                 ----               -------
  Normal   Scheduled       55m                  default-scheduler  Successfully assigned clusters-mihuanghy/redhat-marketplace-catalog-77547cc685-hnh65 to ip-10-0-202-149.us-east-2.compute.internal
  Normal   AddedInterface  55m                  multus             Add eth0 [10.131.0.40/23] from openshift-sdn
  Normal   Pulling         55m                  kubelet            Pulling image "registry.redhat.io/redhat/redhat-marketplace-index:v4.11"
  Normal   Pulled          54m                  kubelet            Successfully pulled image "registry.redhat.io/redhat/redhat-marketplace-index:v4.11" in 40.862526792s
  Normal   Pulled          53m (x3 over 54m)    kubelet            Container image "registry.redhat.io/redhat/redhat-marketplace-index:v4.11" already present on machine
  Normal   Created         53m (x4 over 54m)    kubelet            Created container registry
  Normal   Started         53m (x4 over 54m)    kubelet            Started container registry
  Warning  BackOff         21s (x276 over 54m)  kubelet            Back-off restarting failed container

$ oc describe deployment redhat-marketplace-catalog -n clusters-mihuanghy
Name:                   redhat-marketplace-catalog
Namespace:              clusters-mihuanghy
CreationTimestamp:      Sun, 01 Jan 2023 02:47:03 +0800
Labels:                 hypershift.openshift.io/managed-by=control-plane-operator
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               olm.catalogSource=redhat-marketplace
Replicas:               1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:       app=redhat-marketplace-catalog
                hypershift.openshift.io/control-plane-component=redhat-marketplace-catalog
                hypershift.openshift.io/hosted-control-plane=clusters-mihuanghy
                olm.catalogSource=redhat-marketplace
  Annotations:  hypershift.openshift.io/release-image: quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
  Containers:
   registry:
    Image:      registry.redhat.io/redhat/redhat-marketplace-index:v4.11
    Port:       50051/TCP
    Host Port:  0/TCP
    Requests:
      cpu:        10m
      memory:     340Mi
    Liveness:     exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:    exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
    Startup:      exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
  Priority Class Name:  hypershift-control-plane
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      False   MinimumReplicasUnavailable
  Progressing    False   ProgressDeadlineExceeded
OldReplicaSets:  <none>
NewReplicaSet:   redhat-marketplace-catalog-77547cc685 (1/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  22m   deployment-controller  Scaled up replica set redhat-marketplace-catalog-77547cc685 to 1

[hmx@ovpn-12-45 hypershift]$ oc get hostedcluster -A
NAMESPACE   NAME        VERSION       KUBECONFIG                   PROGRESS    AVAILABLE   PROGRESSING   MESSAGE
clusters    mihuanghy   4.12.0-rc.6   mihuanghy-admin-kubeconfig   Completed   True        False         The hosted control plane is available

$ oc describe deployment certified-operators-catalog -n clusters-mihuanghy
Name:                   certified-operators-catalog
Namespace:              clusters-mihuanghy
CreationTimestamp:      Sun, 01 Jan 2023 02:47:03 +0800
Labels:                 hypershift.openshift.io/managed-by=control-plane-operator
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               olm.catalogSource=certified-operators
Replicas:               1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:       app=certified-operators-catalog
                hypershift.openshift.io/control-plane-component=certified-operators-catalog
                hypershift.openshift.io/hosted-control-plane=clusters-mihuanghy
                olm.catalogSource=certified-operators
  Annotations:  hypershift.openshift.io/release-image: quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
  Containers:
   registry:
    Image:      registry.redhat.io/redhat/certified-operator-index:v4.11
    Port:       50051/TCP
    Host Port:  0/TCP
    Requests:
      cpu:        10m
      memory:     160Mi
    Liveness:     exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:    exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
    Startup:      exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
  Priority Class Name:  hypershift-control-plane
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      False   MinimumReplicasUnavailable
  Progressing    False   ProgressDeadlineExceeded
OldReplicaSets:  <none>
NewReplicaSet:   certified-operators-catalog-7f8f6598b5 (1/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  21m   deployment-controller  Scaled up replica set certified-operators-catalog-7f8f6598b5 to 1