-
Bug
-
Resolution: Done-Errata
-
Critical
-
4.19
-
None
-
Quality / Stability / Reliability
-
False
-
-
None
-
Critical
-
Yes
-
None
-
Approved
-
CLOUD Sprint 267, CLOUD Sprint 268, CLOUD Sprint 269
-
3
-
In Progress
-
Release Note Not Required
-
None
-
None
-
None
-
None
-
None
Description of problem:
Cluster installation on Azure Stack Hub (ASH) fails. The azure-cloud-node-manager pods are stuck in the CrashLoopBackOff state.
oc get pod -n openshift-cloud-controller-manager
NAME                                              READY   STATUS             RESTARTS         AGE
azure-cloud-controller-manager-77646f8854-87b89   1/1     Running            0                69m
azure-cloud-controller-manager-77646f8854-h8wrs   1/1     Running            0                69m
azure-cloud-node-manager-72bbx                    0/1     CrashLoopBackOff   18 (2m44s ago)   69m
azure-cloud-node-manager-sskdq                    0/1     CrashLoopBackOff   18 (2m ago)      69m
azure-cloud-node-manager-tm84r                    0/1     CrashLoopBackOff   18 (2m6s ago)    69m
I0220 02:56:09.694591 1 nodemanager.go:512] Adding node label from cloud provider: node.kubernetes.io/instance-type=Standard_DS4_v2
E0220 02:56:09.694707 1 panic.go:262] "Observed a panic" panic="runtime error: invalid memory address or nil pointer dereference" panicGoValue="\"invalid memory address or nil pointer dereference\"" stacktrace=<
goroutine 233 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x32b2158, 0x4d1c000}, {0x2877c40, 0x4c16360})
/go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:107 +0xbc
k8s.io/apimachinery/pkg/util/runtime.handleCrash({0x32b2158, 0x4d1c000}, {0x2877c40, 0x4c16360}, {0x4d1c000, 0x0, 0x43cb25?})
/go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:82 +0x5e
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000604380?})
/go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:59 +0x108
panic({0x2877c40?, 0x4c16360?})
/usr/lib/golang/src/runtime/panic.go:785 +0x132
sigs.k8s.io/cloud-provider-azure/pkg/provider.(*availabilitySet).GetZoneByNodeName(0xc00017e018, {0x32b20b0?, 0x4d1c000?}, {0xc00005acd8?, 0x2?})
/go/src/github.com/openshift/cloud-provider-azure/pkg/provider/azure_standard.go:583 +0x71
sigs.k8s.io/cloud-provider-azure/pkg/provider.(*Cloud).GetZone(0xc000602308, {0x32b20b0, 0x4d1c000})
Version-Release number of selected component (if applicable):
How reproducible:
Always
Steps to Reproduce:
1. Install a cluster on Azure Stack Hub (ASH).
Actual results:
Cluster installation fails.
Expected results:
Cluster installation succeeds.
Additional info:
The panic occurs here: https://github.com/openshift/cloud-provider-azure/blob/main/pkg/provider/azure_standard.go#L583
The rebase PR appears to have introduced this issue.
Upstream PR: https://github.com/kubernetes-sigs/cloud-provider-azure/pull/7833
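For illustration only, here is a minimal Go sketch of the general failure mode (dereferencing an optional pointer field that the platform did not populate) and a defensive guard. This report does not show which pointer is nil at azure_standard.go:583, so the `vm` type, its fields, and both functions below are hypothetical stand-ins, not the provider's actual code or the upstream fix.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// vm is a hypothetical stand-in for a cloud SDK object whose optional
// fields are pointers and may be left nil on some platforms.
type vm struct {
	Location *string
}

// zoneUnsafe dereferences Location without a check. With a nil Location it
// panics with "invalid memory address or nil pointer dereference".
func zoneUnsafe(v *vm) string {
	return strings.ToLower(*v.Location)
}

// zoneSafe checks every pointer before dereferencing and returns an error
// instead of panicking, so the caller can log and retry.
func zoneSafe(v *vm) (string, error) {
	if v == nil || v.Location == nil {
		return "", errors.New("vm or vm location is nil")
	}
	return strings.ToLower(*v.Location), nil
}

func main() {
	// Location is unset here, which would make zoneUnsafe panic.
	if _, err := zoneSafe(&vm{}); err != nil {
		fmt.Println("error:", err)
	}

	loc := "Local" // hypothetical location value
	zone, _ := zoneSafe(&vm{Location: &loc})
	fmt.Println("zone:", zone)
}
```

Returning an error rather than panicking would let the node manager report the problem and keep running instead of entering CrashLoopBackOff; whether that matches the actual fix should be checked against the upstream PR.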
Links to:
RHEA-2024:11038 OpenShift Container Platform 4.19.z bug fix update