Bug
Resolution: Unresolved
Normal
OADP 1.4.0
Description of problem:
The ExcludedClusterScopedResources tests started failing in OADP 1.4. The restore partially fails when the namespaces resource is excluded from the backup through the excludedClusterScopedResources field. Backup and restore work fine if the excludedClusterScopedResources field is removed.
Velero error logs:
$ ./velero restore logs test-restore1 | grep error
time="2024-06-05T10:15:55Z" level=error msg="error restoring mysql-845bdd7d8d-xr2gd: pods \"mysql-845bdd7d8d-xr2gd\" is forbidden: violates PodSecurity \"restricted:v1.24\": allowPrivilegeEscalation != false (container \"restore-wait\" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container \"restore-wait\" must set securityContext.capabilities.drop=[\"ALL\"]), runAsNonRoot != true (pod or container \"restore-wait\" must set securityContext.runAsNonRoot=true)" logSource="pkg/restore/restore.go:1641" restore=openshift-adp/test-restore1
time="2024-06-05T10:15:55Z" level=error msg="Namespace test, resource restore error: error restoring pods/test/mysql-845bdd7d8d-xr2gd: pods \"mysql-845bdd7d8d-xr2gd\" is forbidden: violates PodSecurity \"restricted:v1.24\": allowPrivilegeEscalation != false (container \"restore-wait\" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container \"restore-wait\" must set securityContext.capabilities.drop=[\"ALL\"]), runAsNonRoot != true (pod or container \"restore-wait\" must set securityContext.runAsNonRoot=true)" logSource="pkg/controller/restore_controller.go:591" restore=openshift-adp/test-restore1
Version-Release number of selected component (if applicable):
OADP 1.4 (Installed from oadp-1.4 branch)
OCP 4.16
How reproducible:
Always
Steps to Reproduce:
1. Create DPA with nodeAgent enabled.
$ oc get dpa ts-dpa -o yaml
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  creationTimestamp: "2024-06-05T10:08:22Z"
  generation: 1
  name: ts-dpa
  namespace: openshift-adp
  resourceVersion: "129009"
  uid: b5ccc388-c3a6-4d11-ac5d-4e5a34b698d2
spec:
  backupLocations:
  - velero:
      credential:
        key: cloud
        name: cloud-credentials-gcp
      default: true
      objectStorage:
        bucket: oadp82611s6ntt
        prefix: velero-e2e-8219df39-231d-11ef-98fa-845cf3eff33a
      provider: gcp
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
    velero:
      defaultPlugins:
      - openshift
      - gcp
      - kubevirt
status:
  conditions:
  - lastTransitionTime: "2024-06-05T10:08:22Z"
    message: Reconcile complete
    reason: Complete
    status: "True"
    type: Reconciled
2. Deploy a stateful application
3. Create a filesystem backup of the namespace with excludedClusterScopedResources set to namespaces, and wait until it completes successfully.
$ oc get backup test-backup1 -o yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    velero.io/resource-timeout: 10m0s
    velero.io/source-cluster-k8s-gitversion: v1.29.5+87992f4
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "29"
  creationTimestamp: "2024-06-05T10:14:09Z"
  generation: 6
  labels:
    velero.io/storage-location: ts-dpa-1
  name: test-backup1
  namespace: openshift-adp
  resourceVersion: "131520"
  uid: bde0c569-4d1a-4a55-8008-a520f0fdddbf
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: true
  excludedClusterScopedResources:
  - namespaces
  includedNamespaces:
  - test
  itemOperationTimeout: 4h0m0s
  snapshotMoveData: false
  storageLocation: ts-dpa-1
  ttl: 720h0m0s
status:
  completionTimestamp: "2024-06-05T10:14:42Z"
  expiration: "2024-07-05T10:14:09Z"
  formatVersion: 1.1.0
  hookStatus: {}
  phase: Completed
  progress:
    itemsBackedUp: 46
    totalItems: 46
  startTimestamp: "2024-06-05T10:14:09Z"
  version: 1
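For anyone reproducing step 3, a minimal Backup manifest might look like the sketch below. It is trimmed from the `oc get backup` output above and keeps only the spec fields that appear relevant to the failure. It assumes the remaining fields (timeouts, ttl, snapshotMoveData) were defaulted by Velero rather than set explicitly, which the report does not confirm.

```yaml
# Sketch only: trimmed from the Backup object shown in step 3.
# Assumes omitted spec fields are populated with defaults by Velero.
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: test-backup1
  namespace: openshift-adp
spec:
  includedNamespaces:
  - test
  # Removing this field is reported to make backup and restore succeed.
  excludedClusterScopedResources:
  - namespaces
  defaultVolumesToFsBackup: true
  storageLocation: ts-dpa-1
```

Saving this to a file and creating it with `oc apply -f <file>` should yield a Backup equivalent to the one above, provided the ts-dpa-1 storage location exists.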
4. Delete the application namespace and trigger a restore.
$ oc get restore test-restore1 -o yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  creationTimestamp: "2024-06-05T10:15:52Z"
  finalizers:
  - restores.velero.io/external-resources-finalizer
  generation: 7
  name: test-restore1
  namespace: openshift-adp
  resourceVersion: "132203"
  uid: bc2c5a16-18c3-47ea-a9a4-b05acf4f696e
spec:
  backupName: test-backup1
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  - csinodes.storage.k8s.io
  - volumeattachments.storage.k8s.io
  - backuprepositories.velero.io
  itemOperationTimeout: 4h0m0s
status:
  completionTimestamp: "2024-06-05T10:16:06Z"
  errors: 1
  hookStatus: {}
  phase: PartiallyFailed
  progress:
    itemsRestored: 30
    totalItems: 30
  startTimestamp: "2024-06-05T10:15:52Z"
  warnings: 8
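The restore in step 4 could be triggered with a manifest along these lines. This is a sketch under the assumption that the excludedResources list and itemOperationTimeout seen in the output above were filled in by Velero and not supplied by the reporter, so only backupName is set here.

```yaml
# Sketch only: assumes excludedResources and itemOperationTimeout
# in the reported object were added by Velero, not by the user.
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: test-restore1
  namespace: openshift-adp
spec:
  backupName: test-backup1
```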
Actual results:
The restore ends in phase PartiallyFailed (1 error, 8 warnings) when the namespaces resource is excluded from the backup.
Expected results:
The restore should complete successfully.
Additional info:
Attached restore logs below.