-
Bug
-
Resolution: Done-Errata
-
Critical
-
4.19
-
None
-
Quality / Stability / Reliability
-
False
-
-
None
-
Critical
-
Yes
-
None
-
Approved
-
CLOUD Sprint 267, CLOUD Sprint 268, CLOUD Sprint 269
-
3
-
In Progress
-
Release Note Not Required
-
None
-
None
-
None
-
None
-
None
Description of problem:
Azure ASH cluster install failed. azure-cloud-node-manager pod stuck in CrashLoopBackOff state oc get pod -n openshift-cloud-controller-manager NAME READY STATUS RESTARTS AGE azure-cloud-controller-manager-77646f8854-87b89 1/1 Running 0 69m azure-cloud-controller-manager-77646f8854-h8wrs 1/1 Running 0 69m azure-cloud-node-manager-72bbx 0/1 CrashLoopBackOff 18 (2m44s ago) 69m azure-cloud-node-manager-sskdq 0/1 CrashLoopBackOff 18 (2m ago) 69m azure-cloud-node-manager-tm84r 0/1 CrashLoopBackOff 18 (2m6s ago) 69m I0220 02:56:09.694591 1 nodemanager.go:512] Adding node label from cloud provider: node.kubernetes.io/instance-type=Standard_DS4_v2 E0220 02:56:09.694707 1 panic.go:262] "Observed a panic" panic="runtime error: invalid memory address or nil pointer dereference" panicGoValue="\"invalid memory address or nil pointer dereference\"" stacktrace=< goroutine 233 [running]: k8s.io/apimachinery/pkg/util/runtime.logPanic({0x32b2158, 0x4d1c000}, {0x2877c40, 0x4c16360}) /go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:107 +0xbc k8s.io/apimachinery/pkg/util/runtime.handleCrash({0x32b2158, 0x4d1c000}, {0x2877c40, 0x4c16360}, {0x4d1c000, 0x0, 0x43cb25?}) /go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:82 +0x5e k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000604380?}) /go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:59 +0x108 panic({0x2877c40?, 0x4c16360?}) /usr/lib/golang/src/runtime/panic.go:785 +0x132 sigs.k8s.io/cloud-provider-azure/pkg/provider.(*availabilitySet).GetZoneByNodeName(0xc00017e018, {0x32b20b0?, 0x4d1c000?}, {0xc00005acd8?, 0x2?}) /go/src/github.com/openshift/cloud-provider-azure/pkg/provider/azure_standard.go:583 +0x71 sigs.k8s.io/cloud-provider-azure/pkg/provider.(*Cloud).GetZone(0xc000602308, {0x32b20b0, 0x4d1c000})
Version-Release number of selected component (if applicable):
How reproducible:
Always
Steps to Reproduce:
1. Install cluster on azure ash 2. 3.
Actual results:
Cluster install failed
Expected results:
Cluster install succeed
Additional info:
panic is here https://github.com/openshift/cloud-provider-azure/blob/main/pkg/provider/azure_standard.go#L583
seems rebase pr introduced this issue.
upstream pr https://github.com/kubernetes-sigs/cloud-provider-azure/pull/7833
- links to
-
RHEA-2024:11038 OpenShift Container Platform 4.19.z bug fix update