Bug
Resolution: Unresolved
Major
Quay Enterprise
Related to RHBZ https://bugzilla.redhat.com/show_bug.cgi?id=1939064#c30
and https://issues.redhat.com/browse/PROJQUAY-1743
When Quay issues a complete-multipart request to a Ceph RGW and the RGW's 200 OK response fails to reach Quay because of a network interruption,
Quay keeps resending the complete-multipart request. The RGW answers each retry with a 500 [1],
because the multipart upload has already been completed successfully and there is no outstanding multipart-complete operation left to perform.
One possible remedy: when RGW handles a redundant multipart-complete request, it returns a more specific response in the 4xx range (for example 400 or 404) instead of 500. The 'Ceph Object Gateway (RADOS)' driver in Quay could then treat that response as an indication that the multipart completion already succeeded, so no further retries are needed.
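A rough sketch of how the driver side of this proposal might look, assuming RGW were changed to answer a redundant completion with 400 or 404. All names here (`ALREADY_COMPLETED_STATUSES`, `complete_multipart`, `send_complete`, `object_exists`) are hypothetical and not taken from Quay's code. The `object_exists` check is an extra safeguard that is not part of the proposal above.

```python
from typing import Callable

# Hypothetical: status codes the driver would interpret as
# "this multipart upload was already completed earlier".
ALREADY_COMPLETED_STATUSES = frozenset({400, 404})


def complete_multipart(
    send_complete: Callable[[], int],
    object_exists: Callable[[], bool],
) -> bool:
    """Return True if the upload can be considered complete.

    send_complete: issues one complete-multipart request and returns the
                   HTTP status code (stand-in for the real storage call).
    object_exists: checks whether the final object is present
                   (assumed safeguard, e.g. a HEAD request).
    """
    status = send_complete()
    if 200 <= status < 300:
        return True
    # A 4xx on a redundant completion is treated as success, but only
    # if the object is actually there.
    if status in ALREADY_COMPLETED_STATUSES and object_exists():
        return True
    return False
```

With this shape, a lost 200 OK followed by a 404 on the retry would end the retry loop instead of prolonging it.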
Another, simpler option: instead of retrying the complete-multipart request indefinitely, abort the completion after a certain number of consecutive attempts that result in a 500 error (for example 3, configurable), and retry by starting a new create-multipart-upload flow for the same object.
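The bounded-retry option could be sketched roughly as follows. This is an illustration only: `complete_with_bounded_retries`, `send_complete`, `restart_upload` and `max_attempts` are hypothetical names, and the callables stand in for whatever the storage driver actually does.

```python
from typing import Callable


def complete_with_bounded_retries(
    send_complete: Callable[[], int],
    restart_upload: Callable[[], None],
    max_attempts: int = 3,  # assumed default; meant to be configurable
) -> str:
    """Try to complete a multipart upload a limited number of times.

    send_complete:  issues one complete-multipart request and returns
                    the HTTP status code.
    restart_upload: abandons the current upload and starts a new
                    create-multipart-upload flow for the same object.
    """
    for _ in range(max_attempts):
        status = send_complete()
        if 200 <= status < 300:
            return "completed"
        if status != 500:
            # Only consecutive 500s are retried in this sketch.
            raise RuntimeError(f"unexpected status {status}")
    # max_attempts consecutive 500s: give up and start over.
    restart_upload()
    return "restarted"
```

Usage example with stubbed callables, simulating an RGW that always answers 500:

```python
restarts = []
result = complete_with_bounded_retries(lambda: 500, lambda: restarts.append(1))
# result == "restarted", and restart_upload was called once
```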