Bug
Resolution: Done-Errata
Critical
4.12.z
Quality / Stability / Reliability
Important
SDN Sprint 241, SDN Sprint 242
Description of problem:
Azure SNO cluster installation failed because the CNCC (cloud-network-config-controller) pod kept crashing. The failure was first seen in CI jobs and then reproduced with a Flexy job.
Version-Release number of selected component (if applicable):
4.14.0-ec.4
How reproducible:
Not sure
Steps to Reproduce:
1. Install a cluster with the Flexy job aos-4_14/ipi-on-azure/versioned-installer-sno-ci, with networkType set to "OVNKubernetes"
Actual results:
Installation failed
% oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.14.0-ec.4 True False False 64m
baremetal 4.14.0-ec.4 True False False 89m
cloud-controller-manager 4.14.0-ec.4 True False False 92m
cloud-credential 4.14.0-ec.4 True False False 97m
cluster-autoscaler 4.14.0-ec.4 True False False 89m
config-operator 4.14.0-ec.4 True False False 90m
console 4.14.0-ec.4 True False False 72m
control-plane-machine-set 4.14.0-ec.4 True False False 89m
csi-snapshot-controller 4.14.0-ec.4 True False False 89m
dns 4.14.0-ec.4 True False False 89m
etcd 4.14.0-ec.4 True False False 84m
image-registry 4.14.0-ec.4 True False False 75m
ingress 4.14.0-ec.4 True False False 75m
insights 4.14.0-ec.4 True False False 83m
kube-apiserver 4.14.0-ec.4 True False False 80m
kube-controller-manager 4.14.0-ec.4 True False False 83m
kube-scheduler 4.14.0-ec.4 True False False 80m
kube-storage-version-migrator 4.14.0-ec.4 True False False 90m
machine-api 4.14.0-ec.4 True False False 84m
machine-approver 4.14.0-ec.4 True False False 89m
machine-config 4.14.0-ec.4 True False False 88m
marketplace 4.14.0-ec.4 True False False 89m
monitoring 4.14.0-ec.4 True False False 70m
network 4.14.0-ec.4 True True False 92m Deployment "/openshift-cloud-network-config-controller/cloud-network-config-controller" is not available (awaiting 1 nodes)
node-tuning 4.14.0-ec.4 True False False 89m
openshift-apiserver 4.14.0-ec.4 True False False 75m
openshift-controller-manager 4.14.0-ec.4 True False False 75m
openshift-samples 4.14.0-ec.4 True False False 75m
operator-lifecycle-manager 4.14.0-ec.4 True False False 89m
operator-lifecycle-manager-catalog 4.14.0-ec.4 True False False 89m
operator-lifecycle-manager-packageserver 4.14.0-ec.4 True False False 80m
service-ca 4.14.0-ec.4 True False False 90m
storage 4.14.0-ec.4 True False False 89m
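To pick out the operator that is holding up the installation from output like the listing above, a small filter can help. The following is a minimal sketch and not part of the original report: the function name `unsettled_operators` is hypothetical, and it assumes the default column order of `oc get co` (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED).

```shell
# Hypothetical helper (not from the original report): print the names of
# cluster operators that are NOT in the steady state
# AVAILABLE=True, PROGRESSING=False, DEGRADED=False.
# Reads `oc get co` style output on stdin; the header row is skipped.
unsettled_operators() {
  awk '$1 != "NAME" && !($3 == "True" && $4 == "False" && $5 == "False") { print $1 }'
}

# Usage against a live cluster (assumes `oc` is already logged in):
#   oc get co | unsettled_operators
```

Applied to the listing above, only the `network` operator has PROGRESSING set to True, so it is the only name the filter would print.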
% oc get pods -n openshift-cloud-network-config-controller
NAME READY STATUS RESTARTS AGE
cloud-network-config-controller-565df6f4b5-sb8kv 0/1 Error 19 (5m58s ago) 93m
% oc describe pod cloud-network-config-controller-565df6f4b5-sb8kv -n openshift-cloud-network-config-controller
Name: cloud-network-config-controller-565df6f4b5-sb8kv
Namespace: openshift-cloud-network-config-controller
Priority: 2000000000
Priority Class Name: system-cluster-critical
Service Account: cloud-network-config-controller
Node: huirwang-0828d-s424j-master-0/10.0.0.6
Start Time: Mon, 28 Aug 2023 12:57:02 +0800
Labels: app=cloud-network-config-controller
component=network
openshift.io/component=network
pod-template-hash=565df6f4b5
type=infra
Annotations: k8s.ovn.org/pod-networks:
{"default":{"ip_addresses":["10.128.0.30/23"],"mac_address":"0a:58:0a:80:00:1e","gateway_ips":["10.128.0.1"],"routes":[{"dest":"10.128.0.0...
k8s.v1.cni.cncf.io/network-status:
[{
"name": "ovn-kubernetes",
"interface": "eth0",
"ips": [
"10.128.0.30"
],
"mac": "0a:58:0a:80:00:1e",
"default": true,
"dns": {}
}]
openshift.io/scc: restricted-v2
seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status: Running
IP: 10.128.0.30
IPs:
IP: 10.128.0.30
Controlled By: ReplicaSet/cloud-network-config-controller-565df6f4b5
Containers:
controller:
Container ID: cri-o://35683ef6222fac819b8cbca5a0a22b047bd8950570a4f1783f9fb515acbde6bd
Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c4d8734ad517b36d7ac0baef72c183b3183e0fa8aa4a465f691bfbf262348970
Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c4d8734ad517b36d7ac0baef72c183b3183e0fa8aa4a465f691bfbf262348970
Port: <none>
Host Port: <none>
Command:
/usr/bin/cloud-network-config-controller
Args:
-platform-type
Azure
-platform-region=
-platform-api-url=
-platform-aws-ca-override=
-platform-azure-environment=AzurePublicCloud
-secret-name
cloud-credentials
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Message: W0828 06:27:53.509786 1 client_config.go:618] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
F0828 06:28:23.512457 1 main.go:345] Error building controller runtime client: Get "https://api-int.huirwang-0828d.qe.azure.devcluster.openshift.com:6443/api?timeout=32s": dial tcp 10.0.0.4:6443: i/o timeout
Exit Code: 1
Started: Mon, 28 Aug 2023 14:27:53 +0800
Finished: Mon, 28 Aug 2023 14:28:23 +0800
Ready: False
Restart Count: 19
Requests:
cpu: 10m
memory: 50Mi
Environment:
CONTROLLER_NAMESPACE: openshift-cloud-network-config-controller (v1:metadata.namespace)
CONTROLLER_NAME: cloud-network-config-controller-565df6f4b5-sb8kv (v1:metadata.name)
KUBERNETES_SERVICE_PORT: 6443
KUBERNETES_SERVICE_HOST: api-int.huirwang-0828d.qe.azure.devcluster.openshift.com
RELEASE_VERSION: 4.14.0-ec.4
Mounts:
/etc/pki/ca-trust/extracted/pem from trusted-ca (ro)
/etc/secret/cloudprovider from cloud-provider-secret (ro)
/kube-cloud-config from kube-cloud-config (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-b9hp9 (ro)
/var/run/secrets/openshift/serviceaccount from bound-sa-token (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
cloud-provider-secret:
Type: Secret (a volume populated by a Secret)
SecretName: cloud-credentials
Optional: false
kube-cloud-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kube-cloud-config
Optional: false
trusted-ca:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: trusted-ca
Optional: false
bound-sa-token:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3600
kube-api-access-b9hp9:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
ConfigMapName: openshift-service-ca.crt
ConfigMapOptional: <nil>
QoS Class: Burstable
Node-Selectors: node-role.kubernetes.io/master=
Tolerations: node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 93m default-scheduler 0/1 nodes are available: 1 node(s) had untolerated taint {node.kubernetes.io/not-ready: }. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
Normal Scheduled 92m default-scheduler Successfully assigned openshift-cloud-network-config-controller/cloud-network-config-controller-565df6f4b5-sb8kv to huirwang-0828d-s424j-master-0
Normal AddedInterface 92m multus Add eth0 [10.128.0.30/23] from ovn-kubernetes
Normal Pulling 92m kubelet Pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c4d8734ad517b36d7ac0baef72c183b3183e0fa8aa4a465f691bfbf262348970"
Normal Pulled 91m kubelet Successfully pulled image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c4d8734ad517b36d7ac0baef72c183b3183e0fa8aa4a465f691bfbf262348970" in 14.561801668s (14.561823468s including waiting)
Normal Created 85m (x5 over 91m) kubelet Created container controller
Normal Started 85m (x5 over 91m) kubelet Started container controller
Normal Pulled 6m58s (x18 over 91m) kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c4d8734ad517b36d7ac0baef72c183b3183e0fa8aa4a465f691bfbf262348970" already present on machine
Warning BackOff 114s (x332 over 91m) kubelet Back-off restarting failed container controller in pod cloud-network-config-controller-565df6f4b5-sb8kv_openshift-cloud-network-config-controller(ab850390-97a3-4fe5-83b7-1bd3c1628470
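The fatal log line in the describe output above shows the controller timing out while dialing the internal API endpoint. When the same pattern has to be checked across many restarts or CI runs, the timed-out endpoints can be extracted from the logs. This is a minimal sketch under stated assumptions: the function name `dial_timeouts` is hypothetical, and it only matches Go-style `dial tcp <addr>: i/o timeout` messages like the one above. It reports which endpoint timed out, not why.

```shell
# Hypothetical helper (not from the original report): list the unique
# address:port targets that appear in "dial tcp <addr>: i/o timeout"
# messages. Reads log text on stdin.
dial_timeouts() {
  grep -oE 'dial tcp [^ ]+: i/o timeout' \
    | sed -E 's/^dial tcp (.+): i\/o timeout$/\1/' \
    | sort -u
}

# Usage against a live cluster (assumes `oc` is already logged in):
#   oc logs --previous -n openshift-cloud-network-config-controller \
#     deploy/cloud-network-config-controller | dial_timeouts
```

For the log shown above, this would report 10.0.0.4:6443, the address that api-int resolved to from inside the pod.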
Expected results:
Installation succeeds and the CNCC pod runs without crash-looping
Additional info:
- is blocked by: OCPBUGS-9972 Azure; NLB; OVN-K: Requests from CNI pods to internalAPI server domain fails intermittently (Closed)
- links to: RHBA-2023:5382 OpenShift Container Platform 4.13.z bug fix update