- Bug
- Resolution: Done-Errata
- Major
- 4.15
- None
- Moderate
- No
- Rejected
- False
-
Description of problem:
On Alibaba, some volume snapshots never become ready.
Version-Release number of selected component (if applicable):
4.15.0-0.nightly-2023-11-06-182702
How reproducible: sometimes
Steps to Reproduce:
- Create PVC + Pod
- Create VolumeSnapshot of the PVC
- Observe that the VolumeSnapshot never becomes "ready".
Actual results:
$ oc get volumesnapshot
NAME          READYTOUSE   SOURCEPVC   ...
mysnapl587m   false        myclaim     ...
Expected results:
The VolumeSnapshot becomes ready in ~1 minute or less (for small volumes)
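To spot stuck snapshots without reading the table by eye, the JSON form of the same listing (`oc get volumesnapshot -o json`) can be filtered on `status.readyToUse`. The helper below is only a sketch: the function name `unready_snapshots` is made up for illustration, and it assumes the standard list shape with an `items` array.

```python
import json


def unready_snapshots(snapshot_list_json: str) -> list:
    """Return names of VolumeSnapshots whose status.readyToUse is not true.

    Hypothetical helper; expects the text printed by
    `oc get volumesnapshot -o json`.
    """
    items = json.loads(snapshot_list_json).get("items", [])
    unready = []
    for item in items:
        # status can be missing until the snapshot controller first updates
        # the object, so treat an absent status as "not ready".
        status = item.get("status") or {}
        if not status.get("readyToUse", False):
            unready.append(item["metadata"]["name"])
    return unready
```

Running it periodically for a minute or two after creating the snapshot should show whether a given snapshot ever leaves the unready list.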
Additional info:
There seems to be something odd between the external-snapshotter and the CSI driver. From the snapshotter logs:
- The external-snapshotter makes the initial CreateSnapshot call and gets an unready snapshot (like "readyToUse [false]").
- The snapshotter calls CreateSnapshot again and gets an error (the Alibaba CSI driver has some throttling). This happens a few times in sequence.
- Finally, the snapshotter calls CreateSnapshot and gets an unready snapshot again instead of the throttling error. At this point, the snapshotter stops and does not keep calling CreateSnapshot until the snapshot is ready.
This sequence is very timing-sensitive: sometimes the cloud finishes the snapshot during the second step, so the driver returns a ready snapshot in the third step and everything works OK.
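The sequence above can be replayed with a toy model. This is not taken from the external-snapshotter source; it is one guess at how such a stall could arise, assuming that an unready result arriving right after an error is not requeued. `FakeDriver`, `Throttled`, `reconcile` and the `requeue_after_error_path` flag are all hypothetical names.

```python
from dataclasses import dataclass


class Throttled(Exception):
    """Stands in for the throttling error returned by the CSI driver."""


@dataclass
class FakeDriver:
    # Scripted outcome of each CreateSnapshot call:
    # "unready", "throttled" or "ready".
    script: list
    calls: int = 0

    def create_snapshot(self) -> bool:
        # Once the script is exhausted, keep repeating the last outcome.
        outcome = self.script[min(self.calls, len(self.script) - 1)]
        self.calls += 1
        if outcome == "throttled":
            raise Throttled()
        return outcome == "ready"


def reconcile(driver, requeue_after_error_path: bool, max_calls: int = 20) -> bool:
    """Return True if a ready snapshot is observed, False if polling stops."""
    previous_call_failed = False
    while driver.calls < max_calls:
        try:
            ready = driver.create_snapshot()
        except Throttled:
            previous_call_failed = True
            continue  # errors are always retried
        if ready:
            return True
        if previous_call_failed and not requeue_after_error_path:
            # Assumed bug: an unready result after an error is not requeued.
            return False
        previous_call_failed = False
    return False


# Unlucky timing: the snapshot is still unready after the throttling errors.
stuck = ["unready", "throttled", "throttled", "unready", "ready"]
# Lucky timing: the cloud finished while the driver was throttling.
lucky = ["unready", "throttled", "throttled", "ready"]

print(reconcile(FakeDriver(stuck), requeue_after_error_path=False))  # False
print(reconcile(FakeDriver(lucky), requeue_after_error_path=False))  # True
print(reconcile(FakeDriver(stuck), requeue_after_error_path=True))   # True
```

In this model the only difference between the stuck and the working case is whether the first successful call after the errors already reports a ready snapshot, which matches the timing sensitivity described above.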
(Sorry, I lost the full logs...)