-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
4.21.0, 4.22.0
-
None
-
None
-
False
Rejected
-
None
-
In Progress
-
Release Note Not Required
This is a clone of issue OCPBUGS-69390. The following is the description of the original issue:
—
Description of problem:
[azure disk csi driver] Static provisioning crashes the driver controller with a segmentation fault on Azure Stack Hub clusters.
Version-Release number of selected component (if applicable):
4.22.0-0-2025-12-16-071206-test-ci-ln-6s71ts2-latest
How reproducible:
Always
Steps to Reproduce:
1. Create an OpenShift cluster on Azure Stack Hub.
2. Manually create a disk with the Azure command line.
3. Use the manually created disk to create a PV and a PVC, and consume them from a pod:
oc apply -f - <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: disk.csi.azure.com
  name: pv-azuredisk-static
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: "" # Important: empty string for static provisioning
  csi:
    driver: disk.csi.azure.com
    # Replace with your actual disk resource ID
    volumeHandle: /subscriptions/XXXXX/resourceGroups/XXXXXX/providers/Microsoft.Compute/disks/pvc-fc003bdf-cea2-49ea-8df7-c3941dfc6a6a
    volumeAttributes:
      fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-azuredisk-static
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeName: pv-azuredisk-static # Binds to the specific PV
  storageClassName: ""
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: my-app
      image: quay.io/openshifttest/hello-openshift@sha256:56c354e7885051b6bb4263f9faa58b2c292d44790599b7dde0e49e7c466cf339
      volumeMounts:
        - name: azuredisk
          mountPath: /mnt/adata
  volumes:
    - name: azuredisk
      persistentVolumeClaim:
        claimName: pvc-azuredisk-static
EOF
4. Check that the workload can read from and write to the statically provisioned volume.
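Static provisioning depends on the volumeHandle in step 3 being a well-formed managed disk resource ID. The following is a minimal Go sketch for sanity-checking such an ID before applying the manifest. The helper name parseDiskURI, the diskID type, and the regular expression are illustrative assumptions and are not taken from the driver's code.

```go
package main

import (
	"fmt"
	"regexp"
)

// diskURIPattern matches the resource ID layout used for volumeHandle in step 3.
// The case-insensitive match is an assumption made for convenience.
var diskURIPattern = regexp.MustCompile(
	`(?i)^/subscriptions/([^/]+)/resourceGroups/([^/]+)/providers/Microsoft\.Compute/disks/([^/]+)$`)

// diskID is a hypothetical holder for the parts of a disk resource ID.
type diskID struct {
	Subscription  string
	ResourceGroup string
	Name          string
}

// parseDiskURI splits a managed disk resource ID into its parts,
// or returns an error if the string does not have the expected layout.
func parseDiskURI(uri string) (diskID, error) {
	m := diskURIPattern.FindStringSubmatch(uri)
	if m == nil {
		return diskID{}, fmt.Errorf("not a managed disk resource ID: %q", uri)
	}
	return diskID{Subscription: m[1], ResourceGroup: m[2], Name: m[3]}, nil
}

func main() {
	// Placeholder values copied from the manifest in step 3.
	handle := "/subscriptions/XXXXX/resourceGroups/XXXXXX/providers/Microsoft.Compute/disks/pvc-fc003bdf-cea2-49ea-8df7-c3941dfc6a6a"
	id, err := parseDiskURI(handle)
	if err != nil {
		fmt.Println("invalid volumeHandle:", err)
		return
	}
	fmt.Printf("subscription=%s resourceGroup=%s disk=%s\n", id.Subscription, id.ResourceGroup, id.Name)
}
```

In this report the volumeHandle was valid (the driver log shows the disk was found at lun 0), so a check like this only rules out a malformed ID as the cause.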
Actual results:
In step 4 the pod is stuck at ContainerCreating; ControllerPublishVolume fails with "panic: runtime error: invalid memory address or nil pointer dereference".
$ oc describe po/my-app
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 11m default-scheduler Successfully assigned default/my-app to pewang-1216ash-cb9jm-worker-mtcazs-5b5n8
Warning FailedAttachVolume 62s (x13 over 11m) attachdetach-controller AttachVolume.Attach failed for volume "pv-azuredisk-static" : rpc error: code = Unavailable desc = error reading from server: EOF
# The driver controller crashed
$ oc get po
NAME                                                READY   STATUS             RESTARTS         AGE
azure-disk-csi-driver-controller-64df65c85f-ptkjc   9/11    CrashLoopBackOff   10 (2m8s ago)    60m
azure-disk-csi-driver-controller-64df65c85f-w42nv   11/11   Running            14 (8m21s ago)   58m
azure-disk-csi-driver-node-dsdn6                    4/4     Running            0                51m
azure-disk-csi-driver-node-gk448                    4/4     Running            0                59m
azure-disk-csi-driver-node-jt2rd                    4/4     Running            0                51m
azure-disk-csi-driver-node-nn2s7                    4/4     Running            0                51m
azure-disk-csi-driver-node-pvvbj                    4/4     Running            0                59m
azure-disk-csi-driver-node-vkndd                    4/4     Running            0                59m
azure-disk-csi-driver-operator-b84688c66-lv6q9      1/1     Running            0                60m
$ oc logs azure-disk-csi-driver-controller-64df65c85f-w42nv
Defaulted container "csi-driver" out of: csi-driver, kube-rbac-proxy-8201, csi-provisioner, provisioner-kube-rbac-proxy, csi-attacher, attacher-kube-rbac-proxy, csi-resizer, resizer-kube-rbac-proxy, csi-snapshotter, snapshotter-kube-rbac-proxy, csi-liveness-probe, azure-inject-credentials (init)
I1216 08:30:44.550891 1 main.go:112] set up prometheus server on 127.0.0.1:8201
I1216 08:30:44.551068 1 main.go:87] Sys info: NumCPU: 8 MAXPROC: 2
W1216 08:30:44.551093 1 azuredisk.go:209] nodeid is empty
I1216 08:30:44.551106 1 azuredisk.go:230] driver userAgent: disk.csi.azure.com/v1.33.5
W1216 08:30:44.551123 1 client_config.go:667] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I1216 08:30:44.551854 1 azure_disk_utils.go:155] reading cloud config from secret ""/""
I1216 08:30:44.575673 1 azure_disk_utils.go:165] InitializeCloudFromSecret: failed to get cloud config from secret ""/"": secrets "\"\"" not found
I1216 08:30:44.575694 1 azure_disk_utils.go:174] could not read cloud config from secret ""/""
I1216 08:30:44.575704 1 azure_disk_utils.go:177] AZURE_CREDENTIAL_FILE env var set as /etc/kubernetes/cloud.conf
I1216 08:30:44.650511 1 azure.go:613] Azure cloudprovider using try backoff: retries=6, exponent=1.500000, duration=6, jitter=1.000000
I1216 08:30:44.701115 1 azure.go:421] "Setting up ARM client factory for network resources" subscriptionID="de7e09c3-b59a-4c7d-9c77-439c11b92879"
I1216 08:30:44.748657 1 azure.go:429] "Setting up ARM client factory for compute resources" subscriptionID="de7e09c3-b59a-4c7d-9c77-439c11b92879"
I1216 08:30:44.748704 1 azuredisk.go:255] disable UseInstanceMetadata for controller
I1216 08:30:44.748714 1 azuredisk.go:267] cloud: AzureStackCloud, location: mtcazs, rg: pewang-1216ash-cb9jm-rg, VMType: standard, PrimaryScaleSetName: , PrimaryAvailabilitySetName: , DisableAvailabilitySetNodes: false
I1216 08:30:44.748724 1 azuredisk.go:270] vmssCacheTTLInSeconds: -1, listVMSSWithInstanceView: false
I1216 08:30:44.748753 1 azuredisk.go:286] set DetachOperationMinTimeoutInSeconds as 240
I1216 08:30:44.749110 1 driver.go:82] Enabling controller service capability: CREATE_DELETE_VOLUME
I1216 08:30:44.749121 1 driver.go:82] Enabling controller service capability: PUBLISH_UNPUBLISH_VOLUME
I1216 08:30:44.749126 1 driver.go:82] Enabling controller service capability: CREATE_DELETE_SNAPSHOT
I1216 08:30:44.749131 1 driver.go:82] Enabling controller service capability: CLONE_VOLUME
I1216 08:30:44.749136 1 driver.go:82] Enabling controller service capability: EXPAND_VOLUME
I1216 08:30:44.749141 1 driver.go:82] Enabling controller service capability: SINGLE_NODE_MULTI_WRITER
I1216 08:30:44.749154 1 driver.go:82] Enabling controller service capability: MODIFY_VOLUME
I1216 08:30:44.749162 1 driver.go:101] Enabling volume access mode: SINGLE_NODE_WRITER
I1216 08:30:44.749166 1 driver.go:101] Enabling volume access mode: SINGLE_NODE_READER_ONLY
I1216 08:30:44.749169 1 driver.go:101] Enabling volume access mode: SINGLE_NODE_SINGLE_WRITER
I1216 08:30:44.749172 1 driver.go:101] Enabling volume access mode: SINGLE_NODE_MULTI_WRITER
I1216 08:30:44.749178 1 driver.go:92] Enabling node service capability: STAGE_UNSTAGE_VOLUME
I1216 08:30:44.749181 1 driver.go:92] Enabling node service capability: EXPAND_VOLUME
I1216 08:30:44.749184 1 driver.go:92] Enabling node service capability: GET_VOLUME_STATS
I1216 08:30:44.749205 1 driver.go:92] Enabling node service capability: SINGLE_NODE_MULTI_WRITER
I1216 08:30:44.749366 1 azuredisk.go:375]
DRIVER INFORMATION:
-------------------
Build Date: "2025-12-12T10:56:01Z"
Compiler: gc
Driver Name: disk.csi.azure.com
Driver Version: v1.33.5
Git Commit: 05b135bf6972b8ac85f9b742008148c78e3179d3
Go Version: go1.24.6 (Red Hat 1.24.6-1.el9_6) X:strictfipsruntime
Platform: linux/amd64
Topology Key: topology.disk.csi.azure.com/zone
Streaming logs below:
I1216 08:30:45.637930 1 utils.go:105] GRPC call: /csi.v1.Identity/GetPluginInfo
I1216 08:30:45.637947 1 utils.go:106] GRPC request: {}
I1216 08:30:45.638864 1 utils.go:112] GRPC response: {"name":"disk.csi.azure.com","vendor_version":"v1.33.5"}
I1216 08:30:45.641098 1 utils.go:105] GRPC call: /csi.v1.Identity/GetPluginCapabilities
I1216 08:30:45.641115 1 utils.go:106] GRPC request: {}
I1216 08:30:45.641147 1 utils.go:112] GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}},{"Type":{"Service":{"type":2}}},{"Type":{"VolumeExpansion":{"type":2}}},{"Type":{"VolumeExpansion":{"type":1}}}]}
I1216 08:30:45.642176 1 utils.go:105] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I1216 08:30:45.642191 1 utils.go:106] GRPC request: {}
I1216 08:30:45.642220 1 utils.go:112] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}},{"Type":{"Rpc":{"type":7}}},{"Type":{"Rpc":{"type":9}}},{"Type":{"Rpc":{"type":13}}},{"Type":{"Rpc":{"type":14}}}]}
I1216 08:30:45.761178 1 utils.go:105] GRPC call: /csi.v1.Controller/ControllerPublishVolume
I1216 08:30:45.761211 1 utils.go:106] GRPC request: {"node_id":"pewang-1216ash-cb9jm-worker-mtcazs-5b5n8","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":7}},"volume_context":{"fsType":"ext4"},"volume_id":"/subscriptions/de7e09c3-b59a-4c7d-9c77-439c11b92879/resourceGroups/pewang-1216ash-cb9jm-rg/providers/Microsoft.Compute/disks/pvc-fc003bdf-cea2-49ea-8df7-c3941dfc6a6a"}
I1216 08:30:46.338387 1 azure_controller_common.go:592] azureDisk - found disk: lun 0 name pvc-fc003bdf-cea2-49ea-8df7-c3941dfc6a6a uri /subscriptions/de7e09c3-b59a-4c7d-9c77-439c11b92879/resourceGroups/pewang-1216ash-cb9jm-rg/providers/Microsoft.Compute/disks/pvc-fc003bdf-cea2-49ea-8df7-c3941dfc6a6a
I1216 08:30:46.338417 1 controllerserver.go:646] GetDiskLun returned: <nil>. Initiating attaching volume /subscriptions/de7e09c3-b59a-4c7d-9c77-439c11b92879/resourceGroups/pewang-1216ash-cb9jm-rg/providers/Microsoft.Compute/disks/pvc-fc003bdf-cea2-49ea-8df7-c3941dfc6a6a to node pewang-1216ash-cb9jm-worker-mtcazs-5b5n8 (vmState Succeeded).
I1216 08:30:46.338429 1 controllerserver.go:661] Attach operation is successful. volume /subscriptions/de7e09c3-b59a-4c7d-9c77-439c11b92879/resourceGroups/pewang-1216ash-cb9jm-rg/providers/Microsoft.Compute/disks/pvc-fc003bdf-cea2-49ea-8df7-c3941dfc6a6a is already attached to node pewang-1216ash-cb9jm-worker-mtcazs-5b5n8 at lun 0.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x22549a6]
goroutine 182 [running]:
sigs.k8s.io/azuredisk-csi-driver/pkg/azureutils.InsertDiskProperties(0xc0007b9f00?, 0xc0006456e0)
    /go/src/github.com/openshift/azure-disk-csi-driver/pkg/azureutils/azure_disk_utils.go:751 +0xc6
sigs.k8s.io/azuredisk-csi-driver/pkg/azuredisk.(*Driver).ControllerPublishVolume(0xc0002f6fc8, {0x30698b0, 0xc0007ebf20}, 0xc0002f0700)
    /go/src/github.com/openshift/azure-disk-csi-driver/pkg/azuredisk/controllerserver.go:712 +0x20fa
github.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerPublishVolume_Handler.func1({0x30698b0?, 0xc0007ebf20?}, {0x2a7cb40?, 0xc0002f0700?})
    /go/src/github.com/openshift/azure-disk-csi-driver/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi_grpc.pb.go:487 +0xcb
sigs.k8s.io/azuredisk-csi-driver/pkg/csi-common.LogGRPC({0x30698b0, 0xc0007ebf20}, {0x2a7cb40, 0xc0002f0700}, 0xc000588700, 0xc000737b30)
    /go/src/github.com/openshift/azure-disk-csi-driver/pkg/csi-common/utils.go:108 +0x409
google.golang.org/grpc.getChainUnaryHandler.func1({0x30698b0?, 0xc0007ebf20?}, {0x2a7cb40?, 0xc0002f0700?})
    /go/src/github.com/openshift/azure-disk-csi-driver/vendor/google.golang.org/grpc/server.go:1217 +0x153
sigs.k8s.io/azuredisk-csi-driver/pkg/azuredisk.(*Driver).Run.(*ServerMetrics).UnaryServerInterceptor.UnaryServerInterceptor.func5({0x30698b0, 0xc0007ebf20}, {0x2a7cb40, 0xc0002f0700}, 0xc000588700?, 0xc0007ceec0)
    /go/src/github.com/openshift/azure-disk-csi-driver/vendor/github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/server.go:22 +0x275
google.golang.org/grpc.NewServer.chainUnaryServerInterceptors.chainUnaryInterceptors.func1({0x30698b0, 0xc0007ebf20}, {0x2a7cb40, 0xc0002f0700}, 0xc000588700, 0x80?)
    /go/src/github.com/openshift/azure-disk-csi-driver/vendor/google.golang.org/grpc/server.go:1208 +0x7c
github.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerPublishVolume_Handler({0x2c2a980, 0xc0002f6fc8}, {0x30698b0, 0xc0007ebf20}, 0xc0007b9600, 0xc00031ef60)
    /go/src/github.com/openshift/azure-disk-csi-driver/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi_grpc.pb.go:489 +0x143
google.golang.org/grpc.(*Server).processUnaryRPC(0xc0000e6400, {0x30698b0, 0xc0007ebe90}, 0xc0007bac00, 0xc0006ee4b0, 0x489a850, 0x0)
    /go/src/github.com/openshift/azure-disk-csi-driver/vendor/google.golang.org/grpc/server.go:1405 +0x1036
google.golang.org/grpc.(*Server).handleStream(0xc0000e6400, {0x306ab98, 0xc0001d5380}, 0xc0007bac00)
    /go/src/github.com/openshift/azure-disk-csi-driver/vendor/google.golang.org/grpc/server.go:1815 +0xb88
google.golang.org/grpc.(*Server).serveStreams.func2.1()
    /go/src/github.com/openshift/azure-disk-csi-driver/vendor/google.golang.org/grpc/server.go:1035 +0x7f
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 179
    /go/src/github.com/openshift/azure-disk-csi-driver/vendor/google.golang.org/grpc/server.go:1046 +0x11d
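The top frame of the trace is InsertDiskProperties (azure_disk_utils.go:751), called from ControllerPublishVolume after the disk was found already attached. The source at that line is not shown in this report, so the following Go sketch only illustrates the general class of bug: dereferencing an optional pointer field that the cloud API left unset. The disk and diskProperties types, their fields, and the map keys are hypothetical stand-ins, not the driver's or the SDK's real definitions. Whether Azure Stack Hub actually omits such a field is an unconfirmed assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// diskProperties is a hypothetical model with optional (pointer) fields,
// similar in spirit to SDK response types where unset values are nil.
type diskProperties struct {
	DiskIOPSReadWrite *int64
	DiskMBpsReadWrite *int64
}

// disk is a hypothetical stand-in for the disk object returned by the cloud API.
type disk struct {
	Properties *diskProperties
}

// insertDiskProperties copies known disk properties into publishContext.
// Every pointer is checked before it is dereferenced, so a response that
// omits a field is skipped instead of causing a nil pointer panic.
func insertDiskProperties(d *disk, publishContext map[string]string) {
	if d == nil || d.Properties == nil || publishContext == nil {
		return
	}
	if v := d.Properties.DiskIOPSReadWrite; v != nil {
		publishContext["diskiopsreadwrite"] = strconv.FormatInt(*v, 10)
	}
	if v := d.Properties.DiskMBpsReadWrite; v != nil {
		publishContext["diskmbpsreadwrite"] = strconv.FormatInt(*v, 10)
	}
}

func main() {
	ctx := map[string]string{}

	// A disk whose optional fields are all unset must not panic.
	insertDiskProperties(&disk{Properties: &diskProperties{}}, ctx)

	// A disk with one field set contributes exactly that field.
	iops := int64(500)
	insertDiskProperties(&disk{Properties: &diskProperties{DiskIOPSReadWrite: &iops}}, ctx)

	fmt.Println(ctx)
}
```

A guard like this would stop the controller from crashing in ControllerPublishVolume. Whether the missing values should be skipped or reported as an error is a design decision for the driver maintainers.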
Expected results:
In step 4 the pod should be Running and able to read and write data on the statically provisioned volume.
Additional info:
- clones: OCPBUGS-69390 [azure disk csi driver] static provision crashed caused by segmentation fault on Azure stack hub clusters (MODIFIED)
- is blocked by: OCPBUGS-69390 [azure disk csi driver] static provision crashed caused by segmentation fault on Azure stack hub clusters (MODIFIED)
- links to