Type: Bug
Priority: Blocker
Status: ToDo
Resolution: Not a Bug
Affects Version: OADP 1.1.6
Description of problem:
After a successful DataMover restore on OCP 4.14, the restored application pod enters a CrashLoopBackOff state.
Version-Release number of selected component (if applicable):
OCP 4.14
oadp-operator-bundle-container-1.1.6-8
volsync-product.v0.7.4 (VolSync 0.7.4, replaces volsync-product.v0.7.3, phase: Succeeded)
How reproducible:
Always
Steps to Reproduce:
1. Create a DPA with dataMover enabled
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: ts-dpa
spec:
  backupLocations:
  - velero:
      default: true
      objectStorage:
        bucket: oadpbucket227925
        prefix: velero
      provider: gcp
  configuration:
    velero:
      defaultPlugins:
      - gcp
      - openshift
      - csi
  features:
    dataMover:
      enable: true
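The DPA above can be applied to the OADP namespace with a command like the following (the file name is assumed):
$ oc create -f ts-dpa.yaml -n openshift-adp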
2. Deploy ocp-django application
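One possible way to deploy it, assuming the application is the stock django-psql-persistent sample template (template name and project creation are assumptions; the namespace test3 matches the backup spec below):
$ oc new-project test3
$ oc new-app --template=django-psql-persistent -n test3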
3. Execute backup with DataMover
$ oc get backup test-backup2 -o yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    velero.io/source-cluster-k8s-gitversion: v1.27.4+deb2c60
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "27"
  creationTimestamp: "2023-08-24T11:46:03Z"
  generation: 7
  labels:
    velero.io/storage-location: ts-dpa-1
  name: test-backup2
  namespace: openshift-adp
  resourceVersion: "226550"
  uid: 198f2998-03e8-4ad1-bc19-c1108d7f4cce
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToRestic: false
  includedNamespaces:
  - test3
  storageLocation: ts-dpa-1
  ttl: 720h0m0s
status:
  completionTimestamp: "2023-08-24T11:50:11Z"
  csiVolumeSnapshotsAttempted: 1
  csiVolumeSnapshotsCompleted: 1
  expiration: "2023-09-23T11:46:03Z"
  formatVersion: 1.1.0
  phase: Completed
  progress:
    itemsBackedUp: 98
    totalItems: 98
  startTimestamp: "2023-08-24T11:46:03Z"
  version: 1
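For reference, a minimal Backup manifest that would produce a run like the one above; every field is taken from the spec shown, and the heredoc form is just one way to create it:
$ oc create -f - <<EOF
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: test-backup2
  namespace: openshift-adp
spec:
  includedNamespaces:
  - test3
  storageLocation: ts-dpa-1
EOF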
4. Delete app namespace
5. Execute Restore
$ oc get restore test-restore4 -o yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  creationTimestamp: "2023-08-24T11:52:56Z"
  generation: 10
  managedFields:
  - apiVersion: velero.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:backupName: {}
    manager: kubectl-create
    operation: Update
    time: "2023-08-24T11:52:56Z"
  - apiVersion: velero.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:excludedResources: {}
      f:status:
        .: {}
        f:completionTimestamp: {}
        f:phase: {}
        f:progress:
          .: {}
          f:itemsRestored: {}
          f:totalItems: {}
        f:startTimestamp: {}
        f:warnings: {}
    manager: velero-server
    operation: Update
    time: "2023-08-24T11:54:13Z"
  name: test-restore4
  namespace: openshift-adp
  resourceVersion: "229366"
  uid: 66dbc725-0c3d-4274-a8c8-3b2d0f4480db
spec:
  backupName: test-backup2
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  - csinodes.storage.k8s.io
  - volumeattachments.storage.k8s.io
status:
  completionTimestamp: "2023-08-24T11:54:13Z"
  phase: Completed
  progress:
    itemsRestored: 52
    totalItems: 52
  startTimestamp: "2023-08-24T11:52:56Z"
  warnings: 4
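For reference, a minimal Restore manifest matching the spec above (backupName comes from the output; the excludedResources list appears to be filled in server-side by velero-server, per the managedFields):
$ oc create -f - <<EOF
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: test-restore4
  namespace: openshift-adp
spec:
  backupName: test-backup2
EOF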
6. Check application pods after restore
$ oc get pod -n test3
NAME                              READY   STATUS             RESTARTS     AGE
django-psql-persistent-1-build    1/1     Running            0            41s
django-psql-persistent-1-deploy   1/1     Running            0            41s
django-psql-persistent-1-mvbnk    0/1     Running            0            38s
postgresql-1-deploy               1/1     Running            0            43s
postgresql-1-gbvs8                0/1     CrashLoopBackOff   1 (9s ago)   40s
Actual results:
The restored PostgreSQL pod goes into CrashLoopBackOff; its log shows a permissions error on the data volume.
$ oc logs -ntest3 postgresql-1-gbvs8
chmod: changing permissions of '/var/lib/pgsql/data/userdata': Operation not permitted
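To compare the UID/fsGroup the restored pod runs with against the directory owner shown below, something like the following could be used (illustrative command, not from the original report):
$ oc get pod postgresql-1-gbvs8 -n test3 -o jsonpath='{.spec.securityContext}{"\n"}'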
Expected results:
The application pods should not go into CrashLoopBackOff after restore.
Additional info:
Permissions before taking backup
$ oc get pod -n ocp-django
NAME                              READY   STATUS      RESTARTS   AGE
django-psql-persistent-1-build    0/1     Completed   0          7m14s
django-psql-persistent-1-deploy   0/1     Completed   0          6m18s
django-psql-persistent-1-r8xwh    1/1     Running     0          6m15s
postgresql-1-deploy               0/1     Completed   0          7m12s
postgresql-1-rmwzp                1/1     Running     0          7m9s

$ oc rsh -n ocp-django postgresql-1-rmwzp
sh-5.1$ df -hT
Filesystem     Type      Size  Used  Avail  Use%  Mounted on
overlay        overlay   128G  13G   115G   10%   /
tmpfs          tmpfs     64M   0     64M    0%    /dev
shm            tmpfs     64M   16K   64M    1%    /dev/shm
tmpfs          tmpfs     3.2G  62M   3.1G   2%    /etc/passwd
/dev/sda4      xfs       128G  13G   115G   10%   /etc/hosts
/dev/sdb       ext4      974M  49M   910M   6%    /var/lib/pgsql/data
tmpfs          tmpfs     512M  20K   512M   1%    /run/secrets/kubernetes.io/serviceaccount
tmpfs          tmpfs     7.9G  0     7.9G   0%    /proc/acpi
tmpfs          tmpfs     7.9G  0     7.9G   0%    /proc/scsi
tmpfs          tmpfs     7.9G  0     7.9G   0%    /sys/firmware
sh-5.1$ ls -lh /var/lib/pgsql/data
total 20K
drwxrws---.  2 root       1000820000  16K Aug 24 10:18 lost+found
drwx------. 20 1000820000 1000820000 4.0K Aug 24 10:18 userdata
Permissions after restore.
$ ls -lh /var/lib/pgsql/data
total 20K
drwxrws---. 2 root 1000840000 16K Aug 24 12:17 lost+found
drwxrws---. 20 1000830000 1000840000 4.0K Aug 24 11:05 userdata
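The restored files are owned by 1000830000/1000840000 while the original namespace used 1000820000, so the recreated namespace was likely assigned a different SCC UID range. One way to check it (standard OpenShift namespace annotations; command not from the original report):
$ oc get namespace test3 -o yaml | grep 'openshift.io/sa.scc'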