-
Bug
-
Resolution: Obsolete
-
Undefined
-
4.13, 4.12
-
Quality / Stability / Reliability
-
False
Description of problem:
While debugging a failure of test case ocp-42855, the hostedcluster condition Degraded was found to be True.
Version-Release number of selected component (if applicable):
quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
How reproducible:
Follow the ocp-42855 test steps.
Steps to Reproduce:
1. Create a basic hosted cluster using the hypershift tool.
2. Check the hostedcluster conditions.
Actual results:
[hmx@ovpn-12-45 hypershift]$ oc get pods -n clusters-mihuanghy
NAME                                                  READY   STATUS             RESTARTS         AGE
aws-ebs-csi-driver-controller-9c46694f-mqrlc          7/7     Running            0                55m
aws-ebs-csi-driver-operator-5d7867bc9f-hqzd5          1/1     Running            0                55m
capi-provider-6df855dbb5-tcmvq                        2/2     Running            0                58m
catalog-operator-7544b8d6d8-dk4hh                     2/2     Running            0                57m
certified-operators-catalog-7f8f6598b5-2blv4          0/1     CrashLoopBackOff   15 (4m20s ago)   57m
cloud-network-config-controller-545fcfc797-mgszj      3/3     Running            0                55m
cluster-api-54c7f7c477-kgvzn                          1/1     Running            0                58m
cluster-autoscaler-658756f99-vr2hk                    1/1     Running            0                58m
cluster-image-registry-operator-84d84dbc9f-zpcsq      3/3     Running            0                57m
cluster-network-operator-9b6985cc8-sd7d7              1/1     Running            0                57m
cluster-node-tuning-operator-65c8f6fbb9-xzpws         1/1     Running            0                57m
cluster-policy-controller-b5c76cf58-b4rth             1/1     Running            0                57m
cluster-storage-operator-7474f76c99-9chl7             1/1     Running            0                57m
cluster-version-operator-646d97ccc9-l72m5             1/1     Running            0                57m
community-operators-catalog-774fdb48fc-z6s4d          1/1     Running            0                57m
control-plane-operator-5bc8c4c996-4nz8c               2/2     Running            0                58m
csi-snapshot-controller-5b7d6bb685-vf8rf              1/1     Running            0                55m
csi-snapshot-controller-operator-6f74db85c6-89bts     1/1     Running            0                57m
csi-snapshot-webhook-57c5bd7f85-lqnwf                 1/1     Running            0                55m
dns-operator-767c5bbdd8-rb7fl                         1/1     Running            0                57m
etcd-0                                                2/2     Running            0                58m
hosted-cluster-config-operator-88b9d49b7-2gvbt        1/1     Running            0                57m
ignition-server-949d9fd8c-cgtxb                       1/1     Running            0                58m
ingress-operator-5c6f5d4f48-gh7fl                     3/3     Running            0                57m
konnectivity-agent-79c5ff9585-pqctc                   1/1     Running            0                58m
konnectivity-server-65956d468c-lpwfv                  1/1     Running            0                58m
kube-apiserver-d9f887c4b-xwdcx                        5/5     Running            0                58m
kube-controller-manager-64b6f757f9-6qszq              2/2     Running            0                52m
kube-scheduler-58ffcdf789-fch2n                       1/1     Running            0                57m
machine-approver-559d66d4d6-2v64w                     1/1     Running            0                58m
multus-admission-controller-8695985fbc-hjtqb          2/2     Running            0                55m
oauth-openshift-6b9695fc7f-pf4j6                      2/2     Running            0                55m
olm-operator-bf694b84-gvz6x                           2/2     Running            0                57m
openshift-apiserver-55c69bc497-x8bft                  2/2     Running            0                52m
openshift-controller-manager-8597c66d58-jb7w2         1/1     Running            0                57m
openshift-oauth-apiserver-674cd6df6d-ckg55            1/1     Running            0                57m
openshift-route-controller-manager-76d78f897c-9mfmj   1/1     Running            0                57m
ovnkube-master-0                                      7/7     Running            0                55m
packageserver-7988d8ddfc-wnh6l                        2/2     Running            0                57m
redhat-marketplace-catalog-77547cc685-hnh65           0/1     CrashLoopBackOff   15 (4m15s ago)   57m
redhat-operators-catalog-7784d45f54-58lgg             1/1     Running            0                57m
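The failing pods can be picked out of the listing above programmatically. The following is a minimal sketch, assuming the plain-text output of oc get pods is available as a string. The function name unhealthy_pods is hypothetical, and only a few rows of the listing are reproduced as sample input.

```python
# Sample input: header plus a few rows copied from the listing in this report.
SAMPLE_LISTING = """\
NAME READY STATUS RESTARTS AGE
catalog-operator-7544b8d6d8-dk4hh 2/2 Running 0 57m
certified-operators-catalog-7f8f6598b5-2blv4 0/1 CrashLoopBackOff 15 (4m20s ago) 57m
redhat-marketplace-catalog-77547cc685-hnh65 0/1 CrashLoopBackOff 15 (4m15s ago) 57m
redhat-operators-catalog-7784d45f54-58lgg 1/1 Running 0 57m
"""


def unhealthy_pods(listing):
    """Return (name, status) for pods that are not Running or not fully ready.

    Hypothetical helper: it relies only on the first three
    whitespace-separated columns (NAME, READY, STATUS).
    """
    result = []
    for line in listing.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue
        name, ready, status = fields[:3]
        ready_count, total = ready.split("/")
        if status != "Running" or ready_count != total:
            result.append((name, status))
    return result


for name, status in unhealthy_pods(SAMPLE_LISTING):
    print(name, status)
```

Because the helper splits on arbitrary whitespace, it works the same whether the columns are separated by single spaces or padded for alignment. Pods in a Completed state would also be reported, which may or may not be wanted.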
{
  "lastTransitionTime": "2022-12-31T18:45:28Z",
  "message": "[certified-operators-catalog deployment has 1 unavailable replicas, redhat-marketplace-catalog deployment has 1 unavailable replicas]",
  "observedGeneration": 3,
  "reason": "UnavailableReplicas",
  "status": "True",
  "type": "Degraded"
},
Expected results:
Degraded is False
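The check behind the actual and expected results can be sketched in Python. This is a minimal illustration, assuming the conditions come from the status.conditions list of the HostedCluster resource (for example from oc get hostedcluster mihuanghy -n clusters -o json). The helper names find_condition and is_degraded are hypothetical and not part of any HyperShift tooling.

```python
import json

# The Degraded condition from this report, wrapped in a list so that the
# fragment is valid JSON on its own.
CONDITIONS_JSON = """
[
  {
    "lastTransitionTime": "2022-12-31T18:45:28Z",
    "message": "[certified-operators-catalog deployment has 1 unavailable replicas, redhat-marketplace-catalog deployment has 1 unavailable replicas]",
    "observedGeneration": 3,
    "reason": "UnavailableReplicas",
    "status": "True",
    "type": "Degraded"
  }
]
"""


def find_condition(conditions, cond_type):
    """Return the first condition whose type matches, or None."""
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond
    return None


def is_degraded(conditions):
    """True only if a Degraded condition exists and its status is "True"."""
    cond = find_condition(conditions, "Degraded")
    return cond is not None and cond.get("status") == "True"


conditions = json.loads(CONDITIONS_JSON)
degraded = find_condition(conditions, "Degraded")
print(is_degraded(conditions), degraded["reason"])
```

Note that condition status values are the strings "True" and "False", not JSON booleans, so the comparison is done against the string.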
Additional info:
$ oc describe pod certified-operators-catalog-7f8f6598b5-2blv4 -n clusters-mihuanghy
Name: certified-operators-catalog-7f8f6598b5-2blv4
Namespace: clusters-mihuanghy
Priority: 100000000
Priority Class Name: hypershift-control-plane
Node: ip-10-0-202-149.us-east-2.compute.internal/10.0.202.149
Start Time: Sun, 01 Jan 2023 02:47:03 +0800
Labels: app=certified-operators-catalog
hypershift.openshift.io/control-plane-component=certified-operators-catalog
hypershift.openshift.io/hosted-control-plane=clusters-mihuanghy
olm.catalogSource=certified-operators
pod-template-hash=7f8f6598b5
Annotations: hypershift.openshift.io/release-image: quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
k8s.v1.cni.cncf.io/network-status:
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.131.0.38"
],
"default": true,
"dns": {}
}]
k8s.v1.cni.cncf.io/networks-status:
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.131.0.38"
],
"default": true,
"dns": {}
}]
openshift.io/scc: restricted-v2
seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status: Running
IP: 10.131.0.38
IPs:
IP: 10.131.0.38
Controlled By: ReplicaSet/certified-operators-catalog-7f8f6598b5
Containers:
registry:
Container ID: cri-o://f32b8d4c31b729c1b7deef0da622ddd661d840428aa4847968b1b2b3bf76b6cf
Image: registry.redhat.io/redhat/certified-operator-index:v4.11
Image ID: registry.redhat.io/redhat/certified-operator-index@sha256:93f667597eee33b9bdbc9a61af60978b414b6f6df8e7c5f496c4298c1dfe9b62
Port: 50051/TCP
Host Port: 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Sun, 01 Jan 2023 03:39:44 +0800
Finished: Sun, 01 Jan 2023 03:39:44 +0800
Ready: False
Restart Count: 15
Requests:
cpu: 10m
memory: 160Mi
Liveness: exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
Startup: exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
Environment: <none>
Mounts: <none>
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes: <none>
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: hypershift.openshift.io/cluster=clusters-mihuanghy:NoSchedule
hypershift.openshift.io/control-plane=true:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 54m default-scheduler Successfully assigned clusters-mihuanghy/certified-operators-catalog-7f8f6598b5-2blv4 to ip-10-0-202-149.us-east-2.compute.internal
Normal AddedInterface 53m multus Add eth0 [10.131.0.38/23] from openshift-sdn
Normal Pulling 53m kubelet Pulling image "registry.redhat.io/redhat/certified-operator-index:v4.11"
Normal Pulled 53m kubelet Successfully pulled image "registry.redhat.io/redhat/certified-operator-index:v4.11" in 40.628843349s
Normal Pulled 52m (x3 over 53m) kubelet Container image "registry.redhat.io/redhat/certified-operator-index:v4.11" already present on machine
Normal Created 52m (x4 over 53m) kubelet Created container registry
Normal Started 52m (x4 over 53m) kubelet Started container registry
Warning BackOff 3m59s (x256 over 53m) kubelet Back-off restarting failed container
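One detail visible in the output above is that the catalog image is tagged v4.11 while the release image annotation is 4.12.0-rc.6. Whether this is related to the crash loop is not established in this report. The sketch below only shows how such a comparison could be automated; the helper names tag_of and minor_version are hypothetical.

```python
import re

# Both image references are copied from the pod description above.
RELEASE_IMAGE = "quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64"
CATALOG_IMAGE = "registry.redhat.io/redhat/certified-operator-index:v4.11"


def tag_of(image):
    """Return the tag of an image reference of the form name:tag.

    Assumption: the reference carries a tag (no digest-only references).
    """
    return image.rsplit(":", 1)[1]


def minor_version(tag):
    """Extract (major, minor) from the first X.Y pattern found in a tag."""
    match = re.search(r"(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"no version found in {tag!r}")
    return int(match.group(1)), int(match.group(2))


release = minor_version(tag_of(RELEASE_IMAGE))
catalog = minor_version(tag_of(CATALOG_IMAGE))
print("release:", release, "catalog:", catalog, "match:", release == catalog)
```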
$ oc describe pod redhat-marketplace-catalog-77547cc685-hnh65 -n clusters-mihuanghy
Name: redhat-marketplace-catalog-77547cc685-hnh65
Namespace: clusters-mihuanghy
Priority: 100000000
Priority Class Name: hypershift-control-plane
Node: ip-10-0-202-149.us-east-2.compute.internal/10.0.202.149
Start Time: Sun, 01 Jan 2023 02:47:03 +0800
Labels: app=redhat-marketplace-catalog
hypershift.openshift.io/control-plane-component=redhat-marketplace-catalog
hypershift.openshift.io/hosted-control-plane=clusters-mihuanghy
olm.catalogSource=redhat-marketplace
pod-template-hash=77547cc685
Annotations: hypershift.openshift.io/release-image: quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
k8s.v1.cni.cncf.io/network-status:
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.131.0.40"
],
"default": true,
"dns": {}
}]
k8s.v1.cni.cncf.io/networks-status:
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.131.0.40"
],
"default": true,
"dns": {}
}]
openshift.io/scc: restricted-v2
seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status: Running
IP: 10.131.0.40
IPs:
IP: 10.131.0.40
Controlled By: ReplicaSet/redhat-marketplace-catalog-77547cc685
Containers:
registry:
Container ID: cri-o://7afba8993dac8f1c07a2946d8b791def3b0c80ce62d5d6160770a5a9990bf922
Image: registry.redhat.io/redhat/redhat-marketplace-index:v4.11
Image ID: registry.redhat.io/redhat/redhat-marketplace-index@sha256:074498ac11b5691ba8975e8f63fa04407ce11bb035dde0ced2f439d7a4640510
Port: 50051/TCP
Host Port: 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Sun, 01 Jan 2023 03:39:49 +0800
Finished: Sun, 01 Jan 2023 03:39:49 +0800
Ready: False
Restart Count: 15
Requests:
cpu: 10m
memory: 340Mi
Liveness: exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
Startup: exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
Environment: <none>
Mounts: <none>
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes: <none>
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: hypershift.openshift.io/cluster=clusters-mihuanghy:NoSchedule
hypershift.openshift.io/control-plane=true:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 55m default-scheduler Successfully assigned clusters-mihuanghy/redhat-marketplace-catalog-77547cc685-hnh65 to ip-10-0-202-149.us-east-2.compute.internal
Normal AddedInterface 55m multus Add eth0 [10.131.0.40/23] from openshift-sdn
Normal Pulling 55m kubelet Pulling image "registry.redhat.io/redhat/redhat-marketplace-index:v4.11"
Normal Pulled 54m kubelet Successfully pulled image "registry.redhat.io/redhat/redhat-marketplace-index:v4.11" in 40.862526792s
Normal Pulled 53m (x3 over 54m) kubelet Container image "registry.redhat.io/redhat/redhat-marketplace-index:v4.11" already present on machine
Normal Created 53m (x4 over 54m) kubelet Created container registry
Normal Started 53m (x4 over 54m) kubelet Started container registry
Warning BackOff 21s (x276 over 54m) kubelet Back-off restarting failed container
$ oc describe deployment redhat-marketplace-catalog -n clusters-mihuanghy
Name: redhat-marketplace-catalog
Namespace: clusters-mihuanghy
CreationTimestamp: Sun, 01 Jan 2023 02:47:03 +0800
Labels: hypershift.openshift.io/managed-by=control-plane-operator
Annotations: deployment.kubernetes.io/revision: 1
Selector: olm.catalogSource=redhat-marketplace
Replicas: 1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=redhat-marketplace-catalog
hypershift.openshift.io/control-plane-component=redhat-marketplace-catalog
hypershift.openshift.io/hosted-control-plane=clusters-mihuanghy
olm.catalogSource=redhat-marketplace
Annotations: hypershift.openshift.io/release-image: quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
Containers:
registry:
Image: registry.redhat.io/redhat/redhat-marketplace-index:v4.11
Port: 50051/TCP
Host Port: 0/TCP
Requests:
cpu: 10m
memory: 340Mi
Liveness: exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
Startup: exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
Environment: <none>
Mounts: <none>
Volumes: <none>
Priority Class Name: hypershift-control-plane
Conditions:
Type Status Reason
---- ------ ------
Available False MinimumReplicasUnavailable
Progressing False ProgressDeadlineExceeded
OldReplicaSets: <none>
NewReplicaSet: redhat-marketplace-catalog-77547cc685 (1/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 22m deployment-controller Scaled up replica set redhat-marketplace-catalog-77547cc685 to 1
[hmx@ovpn-12-45 hypershift]$ oc get hostedcluster -A
NAMESPACE   NAME        VERSION       KUBECONFIG                   PROGRESS    AVAILABLE   PROGRESSING   MESSAGE
clusters    mihuanghy   4.12.0-rc.6   mihuanghy-admin-kubeconfig   Completed   True        False         The hosted control plane is available
$ oc describe deployment certified-operators-catalog -n clusters-mihuanghy
Name: certified-operators-catalog
Namespace: clusters-mihuanghy
CreationTimestamp: Sun, 01 Jan 2023 02:47:03 +0800
Labels: hypershift.openshift.io/managed-by=control-plane-operator
Annotations: deployment.kubernetes.io/revision: 1
Selector: olm.catalogSource=certified-operators
Replicas: 1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=certified-operators-catalog
hypershift.openshift.io/control-plane-component=certified-operators-catalog
hypershift.openshift.io/hosted-control-plane=clusters-mihuanghy
olm.catalogSource=certified-operators
Annotations: hypershift.openshift.io/release-image: quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
Containers:
registry:
Image: registry.redhat.io/redhat/certified-operator-index:v4.11
Port: 50051/TCP
Host Port: 0/TCP
Requests:
cpu: 10m
memory: 160Mi
Liveness: exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
Startup: exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
Environment: <none>
Mounts: <none>
Volumes: <none>
Priority Class Name: hypershift-control-plane
Conditions:
Type Status Reason
---- ------ ------
Available False MinimumReplicasUnavailable
Progressing False ProgressDeadlineExceeded
OldReplicaSets: <none>
NewReplicaSet: certified-operators-catalog-7f8f6598b5 (1/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 21m deployment-controller Scaled up replica set certified-operators-catalog-7f8f6598b5 to 1