OpenShift API for Data Protection / OADP-4226

Restore partially fails when namespace is excluded from the backup


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • OADP 1.5.0
    • OADP 1.4.0
    • kopia, restic

      Description of problem:

      The ExcludedClusterScopedResources tests started failing in OADP 1.4. The restore partially fails when the namespaces resource is excluded from the backup through the excludedClusterScopedResources field. Backup and restore both work fine when that field is removed from the Backup spec.

      Velero error logs from the restore:

      $ ./velero restore logs test-restore1 | grep error
      time="2024-06-05T10:15:55Z" level=error msg="error restoring mysql-845bdd7d8d-xr2gd: pods \"mysql-845bdd7d8d-xr2gd\" is forbidden: violates PodSecurity \"restricted:v1.24\": allowPrivilegeEscalation != false (container \"restore-wait\" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container \"restore-wait\" must set securityContext.capabilities.drop=[\"ALL\"]), runAsNonRoot != true (pod or container \"restore-wait\" must set securityContext.runAsNonRoot=true)" logSource="pkg/restore/restore.go:1641" restore=openshift-adp/test-restore1
      time="2024-06-05T10:15:55Z" level=error msg="Namespace test, resource restore error: error restoring pods/test/mysql-845bdd7d8d-xr2gd: pods \"mysql-845bdd7d8d-xr2gd\" is forbidden: violates PodSecurity \"restricted:v1.24\": allowPrivilegeEscalation != false (container \"restore-wait\" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container \"restore-wait\" must set securityContext.capabilities.drop=[\"ALL\"]), runAsNonRoot != true (pod or container \"restore-wait\" must set securityContext.runAsNonRoot=true)" logSource="pkg/controller/restore_controller.go:591" restore=openshift-adp/test-restore1
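
Both log lines show the mysql pod being rejected by PodSecurity "restricted:v1.24" because of the injected restore-wait container. One possible explanation, which is an assumption and not something the logs confirm, is that the namespace recreated during the restore lacks the pod security labels or SCC annotations that the original namespace object carried, since that object was left out of the backup. A minimal sketch for comparing the namespace metadata before the backup and after the restore follows; the helper name show_ns_podsecurity is hypothetical.

```shell
# Hypothetical helper: print the labels and annotations of a namespace so the
# pod-security.kubernetes.io/* labels and openshift.io/sa.scc.* annotations
# can be compared before the backup and after the restore.
show_ns_podsecurity() {
  ns="$1"
  # First output line: labels. Second output line: annotations.
  oc get namespace "$ns" \
    -o jsonpath='{.metadata.labels}{"\n"}{.metadata.annotations}{"\n"}'
}

# Usage (needs a logged-in cluster session):
#   show_ns_podsecurity test
```

Running the helper once on the original namespace and once on the restored one should make any difference in the metadata visible.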

       

      Version-Release number of selected component (if applicable): 
      OADP 1.4 (Installed from oadp-1.4 branch)
      OCP 4.16

       

      How reproducible:
      Always

       

      Steps to Reproduce:
      1. Create DPA with nodeAgent enabled.

      $ oc get dpa ts-dpa -o yaml
      apiVersion: oadp.openshift.io/v1alpha1
      kind: DataProtectionApplication
      metadata:
        creationTimestamp: "2024-06-05T10:08:22Z"
        generation: 1
        name: ts-dpa
        namespace: openshift-adp
        resourceVersion: "129009"
        uid: b5ccc388-c3a6-4d11-ac5d-4e5a34b698d2
      spec:
        backupLocations:
        - velero:
            credential:
              key: cloud
              name: cloud-credentials-gcp
            default: true
            objectStorage:
              bucket: oadp82611s6ntt
              prefix: velero-e2e-8219df39-231d-11ef-98fa-845cf3eff33a
            provider: gcp
        configuration:
          nodeAgent:
            enable: true
            uploaderType: kopia
          velero:
            defaultPlugins:
            - openshift
            - gcp
            - kubevirt
      status:
        conditions:
        - lastTransitionTime: "2024-06-05T10:08:22Z"
          message: Reconcile complete
          reason: Complete
          status: "True"
          type: Reconciled

      2. Deploy a stateful application
      3. Create a filesystem backup of the namespace with namespaces listed under excludedClusterScopedResources, and wait until it completes successfully.

      $ oc get backup test-backup1 -o yaml
      apiVersion: velero.io/v1
      kind: Backup
      metadata:
        annotations:
          velero.io/resource-timeout: 10m0s
          velero.io/source-cluster-k8s-gitversion: v1.29.5+87992f4
          velero.io/source-cluster-k8s-major-version: "1"
          velero.io/source-cluster-k8s-minor-version: "29"
        creationTimestamp: "2024-06-05T10:14:09Z"
        generation: 6
        labels:
          velero.io/storage-location: ts-dpa-1
        name: test-backup1
        namespace: openshift-adp
        resourceVersion: "131520"
        uid: bde0c569-4d1a-4a55-8008-a520f0fdddbf
      spec:
        csiSnapshotTimeout: 10m0s
        defaultVolumesToFsBackup: true
        excludedClusterScopedResources:
        - namespaces
        includedNamespaces:
        - test
        itemOperationTimeout: 4h0m0s
        snapshotMoveData: false
        storageLocation: ts-dpa-1
        ttl: 720h0m0s
      status:
        completionTimestamp: "2024-06-05T10:14:42Z"
        expiration: "2024-07-05T10:14:09Z"
        formatVersion: 1.1.0
        hookStatus: {}
        phase: Completed
        progress:
          itemsBackedUp: 46
          totalItems: 46
        startTimestamp: "2024-06-05T10:14:09Z"
        version: 1

      4. Delete the application namespace and trigger a restore.

      $ oc get restore test-restore1 -o yaml
      apiVersion: velero.io/v1
      kind: Restore
      metadata:
        creationTimestamp: "2024-06-05T10:15:52Z"
        finalizers:
        - restores.velero.io/external-resources-finalizer
        generation: 7
        name: test-restore1
        namespace: openshift-adp
        resourceVersion: "132203"
        uid: bc2c5a16-18c3-47ea-a9a4-b05acf4f696e
      spec:
        backupName: test-backup1
        excludedResources:
        - nodes
        - events
        - events.events.k8s.io
        - backups.velero.io
        - restores.velero.io
        - resticrepositories.velero.io
        - csinodes.storage.k8s.io
        - volumeattachments.storage.k8s.io
        - backuprepositories.velero.io
        itemOperationTimeout: 4h0m0s
      status:
        completionTimestamp: "2024-06-05T10:16:06Z"
        errors: 1
        hookStatus: {}
        phase: PartiallyFailed
        progress:
          itemsRestored: 30
          totalItems: 30
        startTimestamp: "2024-06-05T10:15:52Z"
        warnings: 8
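
Steps 3 and 4 can be sketched as shell helpers. This is a minimal sketch: the spec fields are copied from the Backup object shown in step 3, while the function names backup_manifest and restore_summary are hypothetical and the openshift-adp namespace is assumed from the objects above.

```shell
# Hypothetical helper: print a minimal Backup manifest using only the spec
# fields from the report that matter for the reproduction.
backup_manifest() {
  cat <<'EOF'
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: test-backup1
  namespace: openshift-adp
spec:
  defaultVolumesToFsBackup: true
  excludedClusterScopedResources:
  - namespaces
  includedNamespaces:
  - test
  storageLocation: ts-dpa-1
EOF
}

# Hypothetical helper: print phase, error count and warning count of a restore.
restore_summary() {
  oc get restore "$1" -n openshift-adp \
    -o jsonpath='{.status.phase}{" errors="}{.status.errors}{" warnings="}{.status.warnings}{"\n"}'
}

# Usage (needs a logged-in cluster session):
#   backup_manifest | oc create -f -
#   restore_summary test-restore1
```

Dropping the excludedClusterScopedResources entry from the manifest gives the variant that, according to the description, backs up and restores without errors.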

      Actual results:

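      The restore ends in phase PartiallyFailed when the namespaces resource is excluded from the backup. The mysql pod is rejected by PodSecurity "restricted:v1.24" because its restore-wait container does not meet the restricted profile.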

       

      Expected results:
      The restore should complete successfully.

       

      Additional info:
      Attached restore logs below. 

            wnstb Wes Hayutin
            rhn-support-prajoshi Prasad Joshi
            Prasad Joshi Prasad Joshi