  1. OpenShift API for Data Protection
  2. OADP-5076

DevFix: openshift-velero-plugin panics on imagestream backup, due to a missing secret 1.4



      Original issue was taken over by docs to doc workaround, this is to track dev fix.
      Description of problem:

      • In a context where the backup and the `BackupStorageLocation` (BSL) are managed outside the scope of the `DataProtectionApplication` (DPA), the OADP controller (i.e. DPAReconciler) does not create the relevant `oadp-$BSL-registry-secret` [0].
      • When the backup is run, the openshift-velero-plugin panics on the imagestream backup with the following error: `2024-02-27T10:46:50.028951744Z time="2024-02-27T10:46:50Z" level=error msg="Error backing up item" backup=openshift-adp/$backupJobName error="error executing custom action (groupResource=imagestreams.image.openshift.io, namespace=$backedupNamespace, name=postgres): rpc error: code = Aborted desc = plugin panicked: runtime error: index out of range [1] with length 1, stack trace: goroutine 94 [...]`

      Steps to Reproduce:

      1. Set `backupImages: true` on the DPA
      2. Create a Backup and a BackupStorageLocation that are not managed by the OADP controller (i.e. ones without the `"app.kubernetes.io/component": "bsl"` label)
      3. Run the backup and check the Velero logs for the above-mentioned panic error.
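
The reproduction check above can be sketched in shell. This is a minimal sketch, not a verified procedure: it assumes OADP runs in the `openshift-adp` namespace (as the log line suggests) and that the Velero deployment is named `velero`; the function names are hypothetical.

```shell
# Namespace taken from the log line in this report; adjust if OADP runs elsewhere.
NS="${NS:-openshift-adp}"

# List BSLs that do not carry the label the OADP controller looks for.
# A "!=" selector also matches objects that lack the label key entirely.
list_unlabeled_bsls() {
  oc -n "$NS" get backupstoragelocation \
    -l 'app.kubernetes.io/component!=bsl' -o name
}

# Keep only the log lines that show the plugin panic described above.
# Reads log text on stdin so it can be fed from any log source.
filter_plugin_panics() {
  grep -F 'plugin panicked: runtime error: index out of range' || true
}

# Assumption: the Velero deployment is named "velero".
show_plugin_panics() {
  oc -n "$NS" logs deployment/velero | filter_plugin_panics
}
```

Calling `list_unlabeled_bsls` followed by `show_plugin_panics` after a failed backup should show whether an unlabeled BSL and the panic appear together.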

      As a workaround to avoid the panic error, the customer can:

      • Label the custom BSL with the relevant label
          $ oc label BackupStorageLocation $BSL app.kubernetes.io/component=bsl
      • Once the BSL is labeled, wait until the DPA reconciles:
          - NOTE: you can force this by making any minor change to the DPA itself
      • Once the DPA reconciles, confirm that the relevant `oadp-$BSL-registry-secret` has been created and that the right registry data has been populated into it:
          $ oc -n openshift-adp get secret/oadp-$BSL-registry-secret -o json | jq -r '.data'
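
The workaround steps can also be combined into one helper. This is a rough sketch under assumptions: the namespace is `openshift-adp`, the secret name follows the `oadp-$BSL-registry-secret` pattern named above, and the function names, the 5-second poll interval, and the 120-second default timeout are illustrative choices rather than anything prescribed by OADP.

```shell
NS="${NS:-openshift-adp}"

# Build the secret name following the oadp-$BSL-registry-secret pattern.
registry_secret_name() {
  printf 'oadp-%s-registry-secret\n' "$1"
}

# Label the BSL, then poll until the secret shows up or the timeout expires.
# Usage: label_and_wait <bsl-name> [timeout-seconds]
label_and_wait() {
  bsl="$1"
  timeout="${2:-120}"
  secret="$(registry_secret_name "$bsl")"

  oc -n "$NS" label backupstoragelocation "$bsl" \
    app.kubernetes.io/component=bsl --overwrite || return 1

  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if oc -n "$NS" get secret "$secret" >/dev/null 2>&1; then
      echo "secret $secret exists"
      return 0
    fi
    sleep 5
    waited=$((waited + 5))
  done
  echo "timed out waiting for secret $secret; a minor change to the DPA may force a reconcile" >&2
  return 1
}
```

For example, `label_and_wait "$BSL" 300` labels the BSL and waits up to five minutes; the registry data in the secret still needs to be inspected manually afterwards.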

      Actual results:

      • openshift-velero-plugin fails to back up the imagestream and panics with a generic error:
        `2024-02-27T10:46:50.028951744Z time="2024-02-27T10:46:50Z" level=error msg="Error backing up item" backup=openshift-adp/$backupJobName error="error executing custom action (groupResource=imagestreams.image.openshift.io, namespace=$backedupNamespace, name=postgres): rpc error: code = Aborted desc = plugin panicked: runtime error: index out of range [1] with length 1, stack trace: goroutine 94 [...]`

      Expected results:

      • Instead of panicking, the `openshift-velero-plugin` should print a human-readable error stating that the `oadp-$BSL-registry-secret` is missing and that it is required for the backup job to run successfully.

      Additional information:

      • This issue has been replicated with oadp-operator.v1.3.0

      [0] https://github.com/openshift/oadp-operator/blob/oadp-1.3/controllers/registry.go#L610C1-L642C4

              tkaovila@redhat.com Tiger Kaovilai
              rhn-support-rsandu Robert Sandu
              Amos Mastbaum Amos Mastbaum
