Bug
Resolution: Done-Errata
Blocker
OADP 1.4.0
oadp-operator-bundle-container-1.4.0-4
Description of problem:
VSL (VolumeSnapshotLocation) restores are failing in OADP 1.4.0 for the AWS provider. For other providers, such as GCP and Azure, we haven't seen any VSL-related failures. This is a regression, as this test was passing in OADP 1.3.1.
The error log is attached below.
$ oc logs velero-659fb4cd5c-x7fwt | grep error
Defaulted container "velero" out of: velero, openshift-velero-plugin (init), velero-plugin-for-aws (init)
time="2024-06-10T10:18:24Z" level=warning msg="Failed to set default backup storage location at server start" backupStorageLocation=default error="backupstoragelocations.velero.io \"default\" not found" logSource="/remote-source/velero/app/pkg/cmd/server/server.go:492"
time="2024-06-10T10:18:24Z" level=error msg="Current BackupStorageLocations available/unavailable/unknown: 0/0/1)" controller=backup-storage-location logSource="/remote-source/velero/app/pkg/controller/backup_storage_location_controller.go:180"
time="2024-06-10T10:29:21Z" level=error msg="Cluster resource restore error: error executing PVAction for persistentvolumes/pvc-271d8ad0-ccb8-4037-ac85-de297e26cdeb: rpc error: code = Unknown desc = Snapshot snap-065bfc14a17157d7e is not available, err: Snapshot has empty state" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:587" restore=openshift-adp/test-restore
Version-Release number of selected component (if applicable):
oadp-operator-bundle-container-1.4.0-1
OCP 4.16
How reproducible:
Always
Steps to Reproduce:
1. Create a DPA with a snapshotLocations spec.
$ oc get dpa ts-dpa -o yaml
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  creationTimestamp: "2024-06-10T10:18:08Z"
  generation: 1
  name: ts-dpa
  namespace: openshift-adp
  resourceVersion: "45413"
  uid: 0b509046-e52e-4821-8c51-c007831ca863
spec:
  backupLocations:
  - velero:
      config:
        profile: default
        region: us-east-2
      credential:
        key: cloud
        name: cloud-credentials
      default: true
      objectStorage:
        bucket: oadp83371lkrqz
        prefix: upgrade
      provider: aws
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
  snapshotLocations:
  - velero:
      config:
        profile: default
        region: us-east-2
      provider: aws
status:
  conditions:
  - lastTransitionTime: "2024-06-10T10:18:09Z"
    message: Reconcile complete
    reason: Complete
    status: "True"
    type: Reconciled
2. Deploy a stateful application.
$ oc get pod -n ocp-django
NAME                              READY   STATUS      RESTARTS   AGE
django-psql-persistent-1-build    0/1     Completed   0          7m34s
django-psql-persistent-1-deploy   0/1     Completed   0          6m47s
django-psql-persistent-1-h2dgp    1/1     Running     0          6m46s
postgresql-1-deploy               0/1     Completed   0          7m32s
postgresql-1-nkmgt                1/1     Running     0          7m30s
3. Trigger a VSL backup of the ocp-django namespace.
$ oc get backup test-backup -o yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    velero.io/resource-timeout: 10m0s
    velero.io/source-cluster-k8s-gitversion: v1.29.5+f6419fb
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "29"
  creationTimestamp: "2024-06-10T10:21:00Z"
  generation: 7
  labels:
    velero.io/storage-location: ts-dpa-1
  name: test-backup
  namespace: openshift-adp
  resourceVersion: "47418"
  uid: f5cd2016-f5dd-45bd-a172-10aedf0a95d8
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: false
  includedNamespaces:
  - ocp-django
  itemOperationTimeout: 4h0m0s
  snapshotMoveData: false
  storageLocation: ts-dpa-1
  ttl: 720h0m0s
  volumeSnapshotLocations:
  - ts-dpa-1
status:
  completionTimestamp: "2024-06-10T10:21:11Z"
  expiration: "2024-07-10T10:21:00Z"
  formatVersion: 1.1.0
  hookStatus: {}
  phase: Completed
  progress:
    itemsBackedUp: 91
    totalItems: 91
  startTimestamp: "2024-06-10T10:21:00Z"
  version: 1
  volumeSnapshotsAttempted: 1
  volumeSnapshotsCompleted: 1
4. Delete the app namespace and trigger a restore.
oc delete ns ocp-django
namespace "ocp-django" deleted
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: test-restore
  namespace: openshift-adp
spec:
  backupName: test-backup
Actual results:
The restore partially fails with the error "Snapshot has empty state":
time="2024-06-10T10:29:21Z" level=error msg="Cluster resource restore error: error executing PVAction for persistentvolumes/pvc-271d8ad0-ccb8-4037-ac85-de297e26cdeb: rpc error: code = Unknown desc = Snapshot snap-065bfc14a17157d7e is not available, err: Snapshot has empty state" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:587" restore=openshift-adp/test-restore
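When triaging this kind of failure, it can help to pull the affected snapshot IDs out of the Velero log so they can be checked directly against AWS. The following is a minimal sketch, not part of the original report: the function name extract_snapshot_ids is hypothetical, and it assumes the log has one entry per line in the logfmt-style format shown above.

```shell
# Hypothetical helper: print the unique EBS snapshot IDs that appear on
# error-level Velero log lines. Reads log text on stdin.
extract_snapshot_ids() {
  # Keep only error-level entries, then pull out every snap-<hex> token.
  grep 'level=error' | grep -oE 'snap-[0-9a-f]+' | sort -u
}
```

A possible use is to pipe `oc logs deployment/velero -n openshift-adp` into the function and then, for each ID printed, run `aws ec2 describe-snapshots --snapshot-ids <id> --region us-east-2 --query 'Snapshots[].State'` to see what state AWS reports for the snapshot. The deployment name and region here are assumptions taken from the environment described in this report.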
Expected results:
Restore should be successful.
Additional info:
Backup and restore details are attached below:
$ ./velero describe backup test-backup --details
Name:         test-backup
Namespace:    openshift-adp
Labels:       velero.io/storage-location=ts-dpa-1
Annotations:  velero.io/resource-timeout=10m0s
              velero.io/source-cluster-k8s-gitversion=v1.29.5+f6419fb
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=29

Phase:  Completed

Namespaces:
  Included:  ocp-django
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  <none>

Or label selector:  <none>

Storage Location:  ts-dpa-1

Velero-Native Snapshot PVs:  auto
Snapshot Move Data:          false
Data Mover:                  velero

TTL:  720h0m0s

CSISnapshotTimeout:    10m0s
ItemOperationTimeout:  4h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2024-06-10 10:21:00 +0000 UTC
Completed:  2024-06-10 10:21:11 +0000 UTC

Expiration:  2024-07-10 10:21:00 +0000 UTC

Total items to be backed up:  91
Items backed up:              91

Resource List:
  apps.openshift.io/v1/DeploymentConfig:
    - ocp-django/django-psql-persistent
    - ocp-django/postgresql
  authorization.openshift.io/v1/RoleBinding:
    - ocp-django/admin
    - ocp-django/system:deployers
    - ocp-django/system:image-builders
    - ocp-django/system:image-pullers
  build.openshift.io/v1/Build:
    - ocp-django/django-psql-persistent-1
  build.openshift.io/v1/BuildConfig:
    - ocp-django/django-psql-persistent
  discovery.k8s.io/v1/EndpointSlice:
    - ocp-django/django-psql-persistent-wf7vf
    - ocp-django/postgresql-5gzzf
  image.openshift.io/v1/ImageStream:
    - ocp-django/django-psql-persistent
  image.openshift.io/v1/ImageStreamTag:
    - ocp-django/django-psql-persistent:latest
  image.openshift.io/v1/ImageTag:
    - ocp-django/django-psql-persistent:latest
  rbac.authorization.k8s.io/v1/RoleBinding:
    - ocp-django/admin
    - ocp-django/system:deployers
    - ocp-django/system:image-builders
    - ocp-django/system:image-pullers
  route.openshift.io/v1/Route:
    - ocp-django/django-psql-persistent
  template.openshift.io/v1/Template:
    - ocp-django/mtc-test-django-psql-persistent
  v1/ConfigMap:
    - ocp-django/django-psql-persistent-1-ca
    - ocp-django/django-psql-persistent-1-global-ca
    - ocp-django/django-psql-persistent-1-sys-config
    - ocp-django/kube-root-ca.crt
    - ocp-django/openshift-service-ca.crt
  v1/Endpoints:
    - ocp-django/django-psql-persistent
    - ocp-django/postgresql
  v1/Event:
    - ocp-django/django-psql-persistent-1-build.17d79d4c488b2f35
    - ocp-django/django-psql-persistent-1-build.17d79d4c62069aad
    - ocp-django/django-psql-persistent-1-build.17d79d4c63ea357b
    - ocp-django/django-psql-persistent-1-build.17d79d4da8034aa2
    - ocp-django/django-psql-persistent-1-build.17d79d4db314a7f2
    - ocp-django/django-psql-persistent-1-build.17d79d4db41a197b
    - ocp-django/django-psql-persistent-1-build.17d79d4e0cbb2304
    - ocp-django/django-psql-persistent-1-build.17d79d4e178838d4
    - ocp-django/django-psql-persistent-1-build.17d79d4e1895861e
    - ocp-django/django-psql-persistent-1-build.17d79d4e48a9a82d
    - ocp-django/django-psql-persistent-1-build.17d79d4e5fadee82
    - ocp-django/django-psql-persistent-1-build.17d79d4e611ae034
    - ocp-django/django-psql-persistent-1-deploy.17d79d572ec5f247
    - ocp-django/django-psql-persistent-1-deploy.17d79d57474d1f4e
    - ocp-django/django-psql-persistent-1-deploy.17d79d5748b2f3ac
    - ocp-django/django-psql-persistent-1-deploy.17d79d57507361e5
    - ocp-django/django-psql-persistent-1-deploy.17d79d5751f038a3
    - ocp-django/django-psql-persistent-1-h2dgp.17d79d575936847c
    - ocp-django/django-psql-persistent-1-h2dgp.17d79d5771fd149d
    - ocp-django/django-psql-persistent-1-h2dgp.17d79d5773935ef5
    - ocp-django/django-psql-persistent-1-h2dgp.17d79d5a9d646921
    - ocp-django/django-psql-persistent-1-h2dgp.17d79d5aa4281f03
    - ocp-django/django-psql-persistent-1-h2dgp.17d79d5aa52dd90f
    - ocp-django/django-psql-persistent-1.17d79d4dd3076bfe
    - ocp-django/django-psql-persistent-1.17d79d57589d0888
    - ocp-django/django-psql-persistent-1.17d79d57ce1c22ec
    - ocp-django/django-psql-persistent.17d79d572ccf0be7
    - ocp-django/postgresql-1-deploy.17d79d4c9433b77e
    - ocp-django/postgresql-1-deploy.17d79d4cad02107f
    - ocp-django/postgresql-1-deploy.17d79d4cae47f2aa
    - ocp-django/postgresql-1-deploy.17d79d4d103a151e
    - ocp-django/postgresql-1-deploy.17d79d4d184a1bd5
    - ocp-django/postgresql-1-deploy.17d79d4d1996895d
    - ocp-django/postgresql-1-nkmgt.17d79d4e4d221ad1
    - ocp-django/postgresql-1-nkmgt.17d79d4ed8e3e5bd
    - ocp-django/postgresql-1-nkmgt.17d79d4f071f8217
    - ocp-django/postgresql-1-nkmgt.17d79d4f08ad2307
    - ocp-django/postgresql-1-nkmgt.17d79d50e8bee524
    - ocp-django/postgresql-1-nkmgt.17d79d50f06da988
    - ocp-django/postgresql-1-nkmgt.17d79d50f17067d9
    - ocp-django/postgresql-1.17d79d4d22580378
    - ocp-django/postgresql.17d79d4c67b1f99a
    - ocp-django/postgresql.17d79d4c91445437
    - ocp-django/postgresql.17d79d4d2307077a
    - ocp-django/postgresql.17d79d4d230ee92e
    - ocp-django/postgresql.17d79d4e2efccc26
  v1/Namespace:
    - ocp-django
  v1/PersistentVolume:
    - pvc-271d8ad0-ccb8-4037-ac85-de297e26cdeb
  v1/PersistentVolumeClaim:
    - ocp-django/postgresql
  v1/Pod:
    - ocp-django/django-psql-persistent-1-build
    - ocp-django/django-psql-persistent-1-deploy
    - ocp-django/django-psql-persistent-1-h2dgp
    - ocp-django/postgresql-1-deploy
    - ocp-django/postgresql-1-nkmgt
  v1/ReplicationController:
    - ocp-django/django-psql-persistent-1
    - ocp-django/postgresql-1
  v1/Secret:
    - ocp-django/builder-dockercfg-gk972
    - ocp-django/default-dockercfg-78f5h
    - ocp-django/deployer-dockercfg-kp946
    - ocp-django/django-psql-persistent
  v1/Service:
    - ocp-django/django-psql-persistent
    - ocp-django/postgresql
  v1/ServiceAccount:
    - ocp-django/builder
    - ocp-django/default
    - ocp-django/deployer

Backup Volumes:
  Velero-Native Snapshots:
    pvc-271d8ad0-ccb8-4037-ac85-de297e26cdeb:
      Snapshot ID:        snap-065bfc14a17157d7e
      Type:               gp3
      Availability Zone:  us-east-2a
      IOPS:               0

  CSI Snapshots: <none included>

  Pod Volume Backups: <none included>

HooksAttempted:  0
HooksFailed:     0
$ ./velero describe restore test-restore --details
Name:         test-restore
Namespace:    openshift-adp
Labels:       <none>
Annotations:  <none>

Phase:                       PartiallyFailed (run 'velero restore logs test-restore' for more information)
Total items to be restored:  42
Items restored:              42

Started:    2024-06-10 10:29:19 +0000 UTC
Completed:  2024-06-10 10:29:22 +0000 UTC

Warnings:
  Velero:     <none>
  Cluster:    <none>
  Namespaces:
    ocp-django:  could not restore, ConfigMap "kube-root-ca.crt" already exists. Warning: the in-cluster version is different than the backed-up version
                 could not restore, ConfigMap "openshift-service-ca.crt" already exists. Warning: the in-cluster version is different than the backed-up version
                 could not restore, RoleBinding "admin" already exists. Warning: the in-cluster version is different than the backed-up version
                 could not restore, RoleBinding "system:deployers" already exists. Warning: the in-cluster version is different than the backed-up version
                 could not restore, RoleBinding "system:image-builders" already exists. Warning: the in-cluster version is different than the backed-up version
                 could not restore, RoleBinding "system:image-pullers" already exists. Warning: the in-cluster version is different than the backed-up version

Errors:
  Velero:     <none>
  Cluster:    error executing PVAction for persistentvolumes/pvc-271d8ad0-ccb8-4037-ac85-de297e26cdeb: rpc error: code = Unknown desc = Snapshot snap-065bfc14a17157d7e is not available, err: Snapshot has empty state
  Namespaces: <none>

Backup:  test-backup

Namespaces:
  Included:  all namespaces found in the backup
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io, csinodes.storage.k8s.io, volumeattachments.storage.k8s.io, backuprepositories.velero.io
  Cluster-scoped:  auto

Namespace mappings:  <none>

Label selector:  <none>

Or label selector:  <none>

Restore PVs:  auto

CSI Snapshot Restores: <none included>

Existing Resource Policy:   <none>
ItemOperationTimeout:       4h0m0s

Preserve Service NodePorts:  auto

HooksAttempted:   0
HooksFailed:      0

Resource List:
  apps.openshift.io/v1/DeploymentConfig:
    - ocp-django/django-psql-persistent(created)
    - ocp-django/postgresql(created)
  authorization.openshift.io/v1/RoleBinding:
    - ocp-django/admin(failed)
    - ocp-django/system:deployers(failed)
    - ocp-django/system:image-builders(failed)
    - ocp-django/system:image-pullers(failed)
  build.openshift.io/v1/Build:
    - ocp-django/django-psql-persistent-1(skipped)
  build.openshift.io/v1/BuildConfig:
    - ocp-django/django-psql-persistent(created)
  discovery.k8s.io/v1/EndpointSlice:
    - ocp-django/django-psql-persistent-wf7vf(created)
    - ocp-django/postgresql-5gzzf(created)
  image.openshift.io/v1/ImageStream:
    - ocp-django/django-psql-persistent(skipped)
  image.openshift.io/v1/ImageStreamTag:
    - ocp-django/django-psql-persistent:latest(skipped)
  image.openshift.io/v1/ImageTag:
    - ocp-django/django-psql-persistent:latest(skipped)
  rbac.authorization.k8s.io/v1/RoleBinding:
    - ocp-django/admin(created)
    - ocp-django/system:deployers(created)
    - ocp-django/system:image-builders(created)
    - ocp-django/system:image-pullers(created)
  route.openshift.io/v1/Route:
    - ocp-django/django-psql-persistent(created)
  template.openshift.io/v1/Template:
    - ocp-django/mtc-test-django-psql-persistent(created)
  v1/ConfigMap:
    - ocp-django/django-psql-persistent-1-ca(created)
    - ocp-django/django-psql-persistent-1-global-ca(created)
    - ocp-django/django-psql-persistent-1-sys-config(created)
    - ocp-django/kube-root-ca.crt(failed)
    - ocp-django/openshift-service-ca.crt(failed)
  v1/Endpoints:
    - ocp-django/django-psql-persistent(created)
    - ocp-django/postgresql(created)
  v1/Namespace:
    - ocp-django(created)
  v1/PersistentVolume:
    - pvc-271d8ad0-ccb8-4037-ac85-de297e26cdeb(failed)
  v1/PersistentVolumeClaim:
    - ocp-django/postgresql(created)
  v1/Pod:
    - ocp-django/django-psql-persistent-1-h2dgp(skipped)
    - ocp-django/postgresql-1-nkmgt(created)
  v1/ReplicationController:
    - ocp-django/django-psql-persistent-1(skipped)
    - ocp-django/postgresql-1(skipped)
  v1/Secret:
    - ocp-django/builder-dockercfg-gk972(created)
    - ocp-django/default-dockercfg-78f5h(created)
    - ocp-django/deployer-dockercfg-kp946(created)
    - ocp-django/django-psql-persistent(created)
  v1/Service:
    - ocp-django/django-psql-persistent(created)
    - ocp-django/postgresql(created)
  v1/ServiceAccount:
    - ocp-django/builder(skipped)
    - ocp-django/default(skipped)
    - ocp-django/deployer(skipped)
Links to:
RHEA-2024:132892 OpenShift API for Data Protection (OADP) 1.4.0 release