  1. OpenShift API for Data Protection
  2. OADP-1956

CSI restore (not DM) "PartiallyFailed", exits on "rpc error: code = Unknown desc = Failed to get Volumesnapshot"


    • Type: Bug
    • Resolution: Won't Do
    • Priority: Major
    • OADP 1.2.5
    • OADP 1.2.0
    • csi-plugin

      Description of problem:

      This ticket is related to https://issues.redhat.com/browse/OADP-1844.

      The DPA on the cluster is clean, with no Data Mover-related plugin or VSM configured.

      Reproduced again on cloud33 in a single namespace with 1000 pods using ceph-rbd.

      Errors from Velero:

       [root@f07-h27-000-r640 ~]# cat /tmp/benchmark-runner-run-artifacts/oadp-2023-05-21-19-15-12/oadp-velero.log  |grep -i error
      time="2023-05-21T20:07:08Z" level=error msg="Namespace busybox-perf-single-ns-1000-pods, resource restore error: error preparing volumesnapshots.snapshot.storage.k8s.io/busybox-perf-single-ns-1000-pods/velero-pvc-busybox-perf-single-ns-1000-pods-493-26lsk: rpc error: code = Unknown desc = Volumesnapshot busybox-perf-single-ns-1000-pods/velero-pvc-busybox-perf-single-ns-1000-pods-493-26lsk does not have a velero.io/csi-volumesnapshot-handle annotation" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:498" restore=openshift-adp/restore-csi-busybox-perf-single-1000-pods-rbd-rbd
      time="2023-05-21T20:07:08Z" level=error msg="Namespace busybox-perf-single-ns-1000-pods, resource restore error: error preparing persistentvolumeclaims/busybox-perf-single-ns-1000-pods/pvc-busybox-perf-single-ns-1000-pods-493: rpc error: code = Unknown desc = Failed to get Volumesnapshot busybox-perf-single-ns-1000-pods/velero-pvc-busybox-perf-single-ns-1000-pods-493-26lsk to restore PVC busybox-perf-single-ns-1000-pods/pvc-busybox-perf-single-ns-1000-pods-493: volumesnapshots.snapshot.storage.k8s.io \"velero-pvc-busybox-perf-single-ns-1000-pods-493-26lsk\" not found" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:498" restore=openshift-adp/restore-csi-busybox-perf-single-1000-pods-rbd-rbd
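
With a 1000-pod restore and a 118 MB log, grepping for "error" gets noisy, so it may help to group the failures by resource type. The sketch below is only an illustration: it assumes every relevant line follows the "error preparing <resource>/<namespace>/<name>: <cause>" shape of the two lines above, and the names `PREPARE_RE`, `parse_restore_errors`, and `count_by_resource` are hypothetical helpers, not part of Velero or OADP.

```python
import re
from collections import Counter

# Matches only the message shape seen in the two log lines above
# (an assumption); other Velero error messages will not be captured.
PREPARE_RE = re.compile(
    r'level=error msg="Namespace (?P<ns>[^,]+), resource restore error: '
    r'error preparing (?P<resource>[^/]+)/(?P=ns)/(?P<name>[^:]+): '
    r'(?P<cause>.*?)" logSource='
)


def parse_restore_errors(lines):
    """Return one dict per 'error preparing' line; skip everything else."""
    failures = []
    for line in lines:
        match = PREPARE_RE.search(line)
        if match:
            failures.append(match.groupdict())
    return failures


def count_by_resource(failures):
    """Count failures per resource type (e.g. persistentvolumeclaims)."""
    return Counter(failure["resource"] for failure in failures)
```

Because the functions accept any iterable of lines, an open file handle for oadp-velero.log can be passed directly, so the log is read line by line rather than loaded into memory at once.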
      

       

      DPA:

       NAMESPACE       NAME             AGE
      openshift-adp   example-velero   4d4h
      [root@f07-h27-000-r640 ~]# oc describe  dpa  -A
      Name:         example-velero
      Namespace:    openshift-adp
      Labels:       <none>
      Annotations:  <none>
      API Version:  oadp.openshift.io/v1alpha1
      Kind:         DataProtectionApplication
      Metadata:
        Creation Timestamp:  2023-05-18T08:09:04Z
        Generation:          1
        Managed Fields:
          API Version:  oadp.openshift.io/v1alpha1
          Fields Type:  FieldsV1
          fieldsV1:
            f:spec:
              .:
              f:backupLocations:
              f:configuration:
                .:
                f:restic:
                  .:
                  f:enable:
                  f:podConfig:
                    .:
                    f:resourceAllocations:
                      .:
                      f:limits:
                        .:
                        f:cpu:
                        f:memory:
                      f:requests:
                        .:
                        f:cpu:
                        f:memory:
                  f:timeout:
                f:velero:
                  .:
                  f:defaultPlugins:
                  f:podConfig:
                    .:
                    f:resourceAllocations:
                      .:
                      f:limits:
                        .:
                        f:cpu:
                        f:memory:
                      f:requests:
                        .:
                        f:cpu:
                        f:memory:
          Manager:      OpenAPI-Generator
          Operation:    Update
          Time:         2023-05-18T08:09:04Z
          API Version:  oadp.openshift.io/v1alpha1
          Fields Type:  FieldsV1
          fieldsV1:
            f:status:
              .:
              f:conditions:
          Manager:         manager
          Operation:       Update
          Subresource:     status
          Time:            2023-05-18T08:09:04Z
        Resource Version:  275319354
        UID:               b8bb08b9-5fd2-42e9-9bf7-90d33fb1eaf0
      Spec:
        Backup Locations:
          Velero:
            Config:
              Insecure Skip TLS Verify:  true
              Profile:                   noobaa
              Region:                    noobaa
              s3ForcePathStyle:          true
              s3Url:                     https://s3-openshift-storage.apps.vlan608.rdu2.scalelab.redhat.com
            Credential:
              Key:    cloud
              Name:   cloud-credentials
            Default:  true
            Object Storage:
              Bucket:  oadp-bucket
              Prefix:  velero
            Provider:  aws
        Configuration:
          Restic:
            Enable:  true
            Pod Config:
              Resource Allocations:
                Limits:
                  Cpu:     2
                  Memory:  32768Mi
                Requests:
                  Cpu:     1
                  Memory:  16384Mi
            Timeout:       900m
          Velero:
            Default Plugins:
              openshift
              aws
              csi
            Pod Config:
              Resource Allocations:
                Limits:
                  Cpu:     4
                  Memory:  32768Mi
                Requests:
                  Cpu:     2
                  Memory:  16384Mi
      Status:
        Conditions:
          Last Transition Time:  2023-05-18T08:09:04Z
          Message:               Reconcile complete
          Reason:                Complete
          Status:                True
          Type:                  Reconciled
      Events:                    <none>
      

       

      Backup CR:  backup-csi-busybox-perf-single-1000-pods-rbd  - Pass   - Completed
      Restore CR: restore-csi-busybox-perf-single-1000-pods-rbd - Failed - PartiallyFailed

      Namespace: busybox-perf-single-ns-1000-pods

       

      Version-Release number of selected component (if applicable):

      OCP 4.11.7
      ODF 4.11.7
      OADP 1.2.0-69

      How reproducible:
      Not always

      Attachments:

        1. benchmark_runner.log (60 kB)
        2. cloud33_stage_velero-5bb5c9f8dc-pn4c8.log (12.11 MB)
        3. csi-cephfsplugin-provisioner-5cfbc4c8cf-htlw7.log (17.93 MB)
        4. csi-rbdplugin-provisioner-54b5dd8995-fzl76.log (4.45 MB)
        5. node-agent-46n98.log (7 kB)
        6. node-agent-79lgf.log (7 kB)
        7. node-agent-8fc2j.log (7 kB)
        8. node-agent-9rgjj.log (9 kB)
        9. node-agent-g2897.log (7 kB)
        10. node-agent-gmfbt.log (7 kB)
        11. node-agent-kksg2.log (7 kB)
        12. node-agent-lhzbj.log (7 kB)
        13. node-agent-prb28.log (7 kB)
        14. node-agent-s7j6p.log (7 kB)
        15. node-agent-w9ph6.log (7 kB)
        16. node-agent-xw94r.log (7 kB)
        17. oadp-cr.json (1 kB)
        18. oadp-report.json (25 kB)
        19. oadp-velero.log (118.76 MB)
        20. openshift-adp-controller-manager-7c55457f9-fxtgf.log (7 kB)
        21. velero-5848b67986-q82kx.log (15.34 MB)

            spampatt@redhat.com Shubham Pampattiwar
            tzahia Tzahi Ashkenazi