-
Bug
-
Resolution: Unresolved
-
Critical
-
None
-
odf-4.14
-
None
Description of problem (please be as detailed as possible and provide log
snippets):
Users are unable to mount volumes in their pods. The CSI plugin logs are filled with corrupted mount warnings:
./csi-cephfsplugin-wlcpp/csi-cephfsplugin/csi-cephfsplugin/logs/rotated/0.log.20240918-163004:2024-09-16T22:59:02.180155798+00:00 stderr F W0916 22:59:02.180147 1 nodeserver.go:700] ID: 2024 corrupted mount detected in "/var/lib/kubelet/pods/7c1e6974-a1f9-4069-9129-b5e935b3f5a7/volumes/kubernetes.io~csi/pvc-abbd6725-bd49-4cd3-a9fe-3bc9320d53b0/mount": stat /var/lib/kubelet/pods/7c1e6974-a1f9-4069-9129-b5e935b3f5a7/volumes/kubernetes.io~csi/pvc-abbd6725-bd49-4cd3-a9fe-3bc9320d53b0/mount: permission denied
./csi-cephfsplugin-wlcpp/csi-cephfsplugin/csi-cephfsplugin/logs/rotated/0.log.20240918-163004:2024-09-16T23:00:15.587635229+00:00 stderr F W0916 23:00:15.587625 1 nodeserver.go:700] ID: 2052 corrupted mount detected in "/var/lib/kubelet/pods/7c1e6974-a1f9-4069-9129-b5e935b3f5a7/volumes/kubernetes.io~csi/pvc-abbd6725-bd49-4cd3-a9fe-3bc9320d53b0/mount": stat /var/lib/kubelet/pods/7c1e6974-a1f9-4069-9129-b5e935b3f5a7/volumes/kubernetes.io~csi/pvc-abbd6725-bd49-4cd3-a9fe-3bc9320d53b0/mount: permission denied
./csi-cephfsplugin-wlcpp/csi-cephfsplugin/csi-cephfsplugin/logs/rotated/0.log.20240918-163004:2024-09-16T23:02:11.816256082+00:00 stderr F W0916 23:02:11.816245 1 nodeserver.go:700] ID: 2088 corrupted mount detected in "/var/lib/kubelet/pods/7c1e6974-a1f9-4069-9129-b5e935b3f5a7/volumes/kubernetes.io~csi/pvc-abbd6725-bd49-4cd3-a9fe-3bc9320d53b0/mount": stat /var/lib/kubelet/pods/7c1e6974-a1f9-4069-9129-b5e935b3f5a7/volumes/kubernetes.io~csi/pvc-abbd6725-bd49-4cd3-a9fe-3bc9320d53b0/mount: permission denied
./csi-cephfsplugin-wlcpp/csi-cephfsplugin/csi-cephfsplugin/logs/rotated/0.log.20240918-163004:2024-09-16T23:04:02.943896343+00:00 stderr F W0916 23:04:02.943885 1 nodeserver.go:700] ID: 2130 corrupted mount detected in "/var/lib/kubelet/pods/7c1e6974-a1f9-4069-9129-b5e935b3f5a7/volumes/kubernetes.io~csi/pvc-abbd6725-bd49-4cd3-a9fe-3bc9320d53b0/mount": stat /var/lib/kubelet/pods/7c1e6974-a1f9-4069-9129-b5e935b3f5a7/volumes/kubernetes.io~csi/pvc-abbd6725-bd49-4cd3-a9fe-3bc9320d53b0/mount: permission denied
./csi-cephfsplugin-wlcpp/csi-cephfsplugin/csi-cephfsplugin/logs/rotated/0.log.20240918-163004:2024-09-16T23:08:09.885178230+00:00 stderr F W0916 23:08:09.885085 1 nodeserver.go:700] ID: 2190 corrupted mount detected in "/var/lib/kubelet/pods/345e5364-0e16-4fdd-b9de-3076c2728e79/volumes/kubernetes.io~csi/pvc-ab940a15-d40b-4f2c-9ee8-462e0b6c354d/mount": stat /var/lib/kubelet/pods/345e5364-0e16-4fdd-b9de-3076c2728e79/volumes/kubernetes.io~csi/pvc-ab940a15-d40b-4f2c-9ee8-462e0b6c354d/mount: permission denied
./csi-cephfsplugin-wlcpp/csi-cephfsplugin/csi-cephfsplugin/logs/rotated/0.log.20240918-163004:2024-09-16T23:08:09.885178230+00:00 stderr F W0916 23:08:09.885122 1 nodeserver.go:700] ID: 2174 corrupted mount detected in "/var/lib/kubelet/pods/db7479a2-8676-40b1-af35-f6280aa9e64f/volumes/kubernetes.io~csi/pvc-ab940a15-d40b-4f2c-9ee8-462e0b6c354d/mount": stat /var/lib/kubelet/pods/db7479a2-8676-40b1-af35-f6280aa9e64f/volumes/kubernetes.io~csi/pvc-ab940a15-d40b-4f2c-9ee8-462e0b6c354d/mount: permission denied
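To see which volumes and pods are affected, warnings like the ones above can be grouped by PVC. The following is a minimal sketch, assuming the message layout shown in these excerpts; the regular expression and the name `summarize_corrupted_mounts` are illustrative and not part of ceph-csi.

```python
import re
from collections import defaultdict

# Assumption: the warning text has the shape seen in the excerpts above:
#   corrupted mount detected in "<kubelet pod volume path>": stat <path>: <error>
CORRUPT_RE = re.compile(
    r'corrupted mount detected in '
    r'"/var/lib/kubelet/pods/(?P<pod>[0-9a-f-]+)'
    r'/volumes/kubernetes\.io~csi/(?P<pv>[^/]+)/mount"'
    r': .*: (?P<err>[^:]+)$'
)


def summarize_corrupted_mounts(lines):
    """Group corrupted-mount warnings by volume name.

    Returns {volume: {"pods": set of pod UIDs, "errors": set of error
    strings, "count": number of warnings}}. Lines that do not match the
    assumed warning format are skipped.
    """
    summary = defaultdict(lambda: {"pods": set(), "errors": set(), "count": 0})
    for line in lines:
        match = CORRUPT_RE.search(line.strip())
        if match is None:
            continue
        entry = summary[match.group("pv")]
        entry["pods"].add(match.group("pod"))
        entry["errors"].add(match.group("err").strip())
        entry["count"] += 1
    return dict(summary)
```

Run over the six lines above, this groups the warnings into two volumes, and shows that `pvc-ab940a15-d40b-4f2c-9ee8-462e0b6c354d` is reported under two different pod UIDs, each with `permission denied`.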
Reviewing the CSI logs further, we can see the driver handles the corruption appropriately, detecting the corrupted staging path and unmounting it:
2024-09-16T23:08:27.576871302+00:00 stderr F I0916 23:08:27.576841 1 utils.go:206] ID: 2198 Req-ID: 0001-0011-openshift-storage-0000000000000001-b8a4d123-6136-48d6-ba2b-501daf72301d GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/421bcceb78bbcca1328fec2628259a2c79c22e8b0785327ac408431cc1bed320/globalmount","volume_id":"0001-0011-openshift-storage-0000000000000001-b8a4d123-6136-48d6-ba2b-501daf72301d"}
2024-09-16T23:08:27.576918060+00:00 stderr F E0916 23:08:27.576909 1 nodeserver.go:619] ID: 2198 Req-ID: 0001-0011-openshift-storage-0000000000000001-b8a4d123-6136-48d6-ba2b-501daf72301d stat failed: stat /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/421bcceb78bbcca1328fec2628259a2c79c22e8b0785327ac408431cc1bed320/globalmount: permission denied
2024-09-16T23:08:27.576924824+00:00 stderr F I0916 23:08:27.576918 1 nodeserver.go:634] ID: 2198 Req-ID: 0001-0011-openshift-storage-0000000000000001-b8a4d123-6136-48d6-ba2b-501daf72301d cephfs: detected corrupted mount in staging target path /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/421bcceb78bbcca1328fec2628259a2c79c22e8b0785327ac408431cc1bed320/globalmount, trying to unmount anyway
2024-09-16T23:08:27.599288100+00:00 stderr F I0916 23:08:27.599214 1 cephcmds.go:105] ID: 2198 Req-ID: 0001-0011-openshift-storage-0000000000000001-b8a4d123-6136-48d6-ba2b-501daf72301d command succeeded: umount [/var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/421bcceb78bbcca1328fec2628259a2c79c22e8b0785327ac408431cc1bed320/globalmount --all-targets]
2024-09-16T23:08:27.599422244+00:00 stderr F I0916 23:08:27.599377 1 nodeserver.go:647] ID: 2198 Req-ID: 0001-0011-openshift-storage-0000000000000001-b8a4d123-6136-48d6-ba2b-501daf72301d cephfs: successfully unmounted volume 0001-0011-openshift-storage-0000000000000001-b8a4d123-6136-48d6-ba2b-501daf72301d from /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/421bcceb78bbcca1328fec2628259a2c79c22e8b0785327ac408431cc1bed320/globalmount
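The sequence in these entries (stat fails, the path is classified as corrupted, unmount is attempted anyway) can be illustrated with a small sketch. It is not the ceph-csi implementation. The errno set is an assumption modelled on the corrupted-mount check in the Kubernetes mount utilities as I understand it, and the names `is_corrupted_mount_error` and `check_mount_path` are hypothetical.

```python
import errno
import os

# Assumption: errno values treated as signs of a corrupted mount. EACCES is
# the one that matches the "permission denied" seen in the logs above.
CORRUPTED_ERRNOS = frozenset({
    errno.ENOTCONN,   # transport endpoint is not connected
    errno.ESTALE,     # stale file handle
    errno.EIO,        # input/output error
    errno.EACCES,     # permission denied
    errno.EHOSTDOWN,  # host is down
})


def is_corrupted_mount_error(err):
    """Return True if err is an OSError whose errno suggests a corrupted mount."""
    return isinstance(err, OSError) and err.errno in CORRUPTED_ERRNOS


def check_mount_path(path):
    """Classify a mount path as 'ok', 'missing', or 'corrupted' via stat.

    Any other stat failure is re-raised unchanged.
    """
    try:
        os.stat(path)
    except FileNotFoundError:
        return "missing"
    except OSError as err:
        if is_corrupted_mount_error(err):
            return "corrupted"
        raise
    return "ok"
```

Under this assumption, a `permission denied` from stat lands in the same bucket as a stale or disconnected mount, so the open question is what makes stat on these paths return that error in the first place.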
However, I don't know what's causing these corrupted mounts to occur in the first place. I'd like to get the CSI team's input to help narrow down the issue.
Version of all relevant components (if applicable):
ODF 4.14
OCP 4.14
- external trackers