-
Bug
-
Resolution: Won't Do
-
Normal
-
None
-
4.19.0
-
None
-
Quality / Stability / Reliability
-
False
Description of problem:
I noticed in a presubmit job the diskmaker logs have a series of repeating errors:

I0314 01:50:57.882044 36668 cache.go:55] Added pv "local-pv-ea5b0e26" to cache
I0314 01:50:57.882171 36668 reconcile.go:97] "Looking for released PVs to cleanup" namespace="openshift-local-storage" name="tentothirty-overlapping-twentytofifty-1-2"
E0314 01:50:57.882288 36668 deleter.go:103] failed to get volume mode of path "/mnt/local-storage/tentothirty-overlapping-twentytofifty-1-2/nvme-Amazon_Elastic_Block_Store_vol0662dcb2581e2e50d": Directory check for "/mnt/local-storage/tentothirty-overlapping-twentytofifty-1-2/nvme-Amazon_Elastic_Block_Store_vol0662dcb2581e2e50d" failed: open /mnt/local-storage/tentothirty-overlapping-twentytofifty-1-2/nvme-Amazon_Elastic_Block_Store_vol0662dcb2581e2e50d: no such file or directory
I0314 01:50:57.882306 36668 reconcile.go:101] "Looking for symlinks to cleanup" namespace="openshift-local-storage" name="tentothirty-overlapping-twentytofifty-1-2"
2025-03-14T01:50:57.882Z DEBUG events recorder/recorder.go:104 Error cleaning PV "local-pv-ea5b0e26": failed to get volume mode of path "/mnt/local-storage/tentothirty-overlapping-twentytofifty-1-2/nvme-Amazon_Elastic_Block_Store_vol0662dcb2581e2e50d": Directory check for "/mnt/local-storage/tentothirty-overlapping-twentytofifty-1-2/nvme-Amazon_Elastic_Block_Store_vol0662dcb2581e2e50d" failed: open /mnt/local-storage/tentothirty-overlapping-twentytofifty-1-2/nvme-Amazon_Elastic_Block_Store_vol0662dcb2581e2e50d: no such file or directory {"type": "Warning", "object": {"kind":"PersistentVolume","name":"local-pv-ea5b0e26","uid":"7442678b-0860-434a-9260-1a8dbfd19d2c","apiVersion":"v1","resourceVersion":"49089"}, "reason": "VolumeFailedDelete"}

presubmit job: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_local-storage-operator/522/pull-ci-openshift-local-storage-operator-main-e2e-operator/1900324985017208832
diskmaker log: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_local-storage-operator/522/pull-ci-openshift-local-storage-operator-main-e2e-operator/1900324985017208832/artifacts/e2e-operator/gather-extra/artifacts/pods/openshift-local-storage_diskmaker-manager-jvpwg_diskmaker-manager.log

It fails to getVolMode here in deletePV:
https://github.com/openshift/local-storage-operator/blob/646a98497445b51ce0fc3c7455ae06cb4869ec33/vendor/sigs.k8s.io/sig-storage-local-static-provisioner/pkg/deleter/deleter.go#L150-L153

The PV is released but does not have deletionTimestamp:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_local-storage-operator/522/pull-ci-openshift-local-storage-operator-main-e2e-operator/1900324985017208832/artifacts/e2e-operator/gather-extra/artifacts/persistentvolumes.json

CleanupSymlinks() in diskmaker only processes PVs with deletionTimestamp. So how could it be that the symlink has been deleted while the PV has not? I suspect this is just an issue in the order of cleanup functions in the e2e test job: cleanupSymlinkDir should run after cleanupLVSetResources:
https://github.com/openshift/local-storage-operator/blob/646a98497445b51ce0fc3c7455ae06cb4869ec33/test/e2e/localvolumeset_test.go#L99-L106

If cleanupSymlinkDir runs before the LVSet is deleted, we could get into this situation. Strictly speaking, we shouldn't even need cleanupSymlinkDir anymore after https://github.com/openshift/local-storage-operator/pull/504 -- we could try removing it from the test.
Version-Release number of selected component (if applicable):
4.19
How reproducible:
Unknown
Steps to Reproduce:
1. Run e2e-operator presubmit job
2. Review diskmaker logs
Actual results:
"no such file or directory" errors when trying to delete PV
Expected results:
PV is deleted by diskmaker successfully
Additional info:
When I noticed these errors, my branch was missing https://github.com/openshift/local-storage-operator/pull/522/commits/0b88cae58e65a6176b540713f7717e3971e8e810 -- just in case that's relevant to reproducing this.