OpenShift API for Data Protection
OADP-178: Failed/PartiallyFailed backups hang on Kubernetes although removed from bucket

    • Sprints: OADP Sprint 215, OADP Sprint 216, OADP Sprint 217, OADP Sprint 218
    • Triage status: Untriaged

      Summary of resolution: the upstream Velero docs will be updated.
      Pull request: https://github.com/vmware-tanzu/velero/pull/4729

      Description of problem:
      Failed/PartiallyFailed backups hang on Kubernetes although removed from the bucket.
      Successfully completed backups are deleted immediately, as expected (see "Object storage sync" in https://velero.io/docs/v1.7/how-velero-works/).

      Checked with both CSI and Restic backups.
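
      To observe which Backup CRs remain on the cluster and in which phase, a listing like the following can be used. This is a minimal sketch using the Kubernetes dynamic client; the openshift-adp namespace is an assumption (OADP's default install namespace; plain Velero installs typically use velero).

      // List Velero Backup CRs and their phases.
      package main

      import (
          "context"
          "fmt"
          "log"

          metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
          "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
          "k8s.io/apimachinery/pkg/runtime/schema"
          "k8s.io/client-go/dynamic"
          "k8s.io/client-go/tools/clientcmd"
      )

      func main() {
          cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
          if err != nil {
              log.Fatal(err)
          }
          dyn, err := dynamic.NewForConfig(cfg)
          if err != nil {
              log.Fatal(err)
          }

          // Backup CRs live in the velero.io/v1 API group.
          gvr := schema.GroupVersionResource{Group: "velero.io", Version: "v1", Resource: "backups"}
          // "openshift-adp" is an assumption; adjust to your install namespace.
          list, err := dyn.Resource(gvr).Namespace("openshift-adp").List(context.TODO(), metav1.ListOptions{})
          if err != nil {
              log.Fatal(err)
          }
          for _, item := range list.Items {
              phase, _, _ := unstructured.NestedString(item.Object, "status", "phase")
              fmt.Printf("%s\t%s\n", item.GetName(), phase)
          }
      }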

      Example output:

      http://pastebin.test.redhat.com/1018311

      Version-Release number of selected component (if applicable): 0.5.3

      How reproducible:

      Always

      Steps to reproduce:

      1. Create a successful backup (Completed status).
      2. Empty the bucket (a sketch for emptying a bucket programmatically follows this list).
      • Notice that after a few seconds the backup CR is deleted from the cluster, as expected.
      3. Create a backup that will fail or partially fail. To accomplish that, create a stateful application that uses a CSI driver for its StorageClass, and enable csi as a default plugin.
      4. Empty the bucket.
      • Notice that the PartiallyFailed backup CR persists.
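
      The "empty the bucket" steps can be done from the cloud console or CLI; for an AWS-backed BackupStorageLocation, a sketch like the following also works. It assumes aws-sdk-go-v2, credentials from the default chain, and a hypothetical bucket name my-oadp-bucket; versioned buckets would additionally need their object versions deleted, which is omitted here.

      // Delete every object in the backup bucket.
      package main

      import (
          "context"
          "log"

          "github.com/aws/aws-sdk-go-v2/aws"
          "github.com/aws/aws-sdk-go-v2/config"
          "github.com/aws/aws-sdk-go-v2/service/s3"
          "github.com/aws/aws-sdk-go-v2/service/s3/types"
      )

      func main() {
          ctx := context.TODO()
          bucket := "my-oadp-bucket" // hypothetical bucket name

          cfg, err := config.LoadDefaultConfig(ctx)
          if err != nil {
              log.Fatal(err)
          }
          client := s3.NewFromConfig(cfg)

          // Page through all objects and batch-delete each page.
          p := s3.NewListObjectsV2Paginator(client, &s3.ListObjectsV2Input{Bucket: aws.String(bucket)})
          for p.HasMorePages() {
              page, err := p.NextPage(ctx)
              if err != nil {
                  log.Fatal(err)
              }
              var ids []types.ObjectIdentifier
              for _, obj := range page.Contents {
                  ids = append(ids, types.ObjectIdentifier{Key: obj.Key})
              }
              if len(ids) == 0 {
                  continue
              }
              if _, err := client.DeleteObjects(ctx, &s3.DeleteObjectsInput{
                  Bucket: aws.String(bucket),
                  Delete: &types.Delete{Objects: ids},
              }); err != nil {
                  log.Fatal(err)
              }
          }
      }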

      Expected results:

      From https://velero.io/docs/v1.7/how-velero-works/, "Object storage sync" section:
      "if a backup object exists in Kubernetes but not in object storage, it will be deleted from Kubernetes since the backup tarball no longer exists.”

              Assignee: Tiger Kaovilai (tkaovila@redhat.com)
              Reporter: Maya Peretz (mperetz@redhat.com)