  OpenShift API for Data Protection / OADP-726

Restore using "csi" shows "Completed", although the pods' PVCs are stuck on Pending: "waiting for a volume to be created"


    • Type: Bug
    • Resolution: Done
    • Priority: Minor
    • OADP 1.1.0
    • csi-plugin, velero

      Description of problem:

      Running a restore using "csi" shows "Completed", although the pods' PVCs are stuck on Pending with "waiting for a volume to be created".

      I have tried more than one pod type, with data and with apps, on the storage class ocs-storagecluster-ceph-rbd.
      Both of them failed at the PVC claim step after the restore showed "Completed".
      The error: "waiting for a volume to be created, either by external provisioner "openshift-storage.rbd.csi.ceph.com" or manually created by system administrator".
      There were no issues with the PVCs at pod creation time for either pod type.
      Attached the following files:
       * velero-5d9dcf486b-g7wnt.log
       * openshift-adp-controller-manager-5b859ccfc-m5c6d.log
       * ocs-storagecluster-rbdplugin-snapclass.txt
       * perf-project-1_events.log
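
      A quick way to check whether the restored PVCs actually carry a CSI snapshot dataSource, and whether any VolumeSnapshots were restored into the namespace, is sketched below (diagnostic commands only, not part of the original attachments; resource names are taken from the outputs later in this report):

      # list any VolumeSnapshots restored into the target namespace
      oc get volumesnapshot -n perf-project-1
      # check whether a restored PVC references a snapshot as its dataSource
      oc get pvc redis -n perf-project-1 -o yaml | grep -A 3 dataSource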

      Backup resource:

      apiVersion: velero.io/v1
      kind: Backup
      metadata:
        annotations:
          velero.io/source-cluster-k8s-gitversion: v1.23.5+012e945
          velero.io/source-cluster-k8s-major-version: "1"
          velero.io/source-cluster-k8s-minor-version: "23"
        creationTimestamp: "2022-08-17T11:48:08Z"
        generation: 8
        labels:
          velero.io/storage-location: example-velero-1
        name: latest-backup
        namespace: openshift-adp
        resourceVersion: "875684942"
        uid: 76a474ac-9768-4309-94d3-4ecc63a90dc7
      spec:
        csiSnapshotTimeout: 10m0s
        defaultVolumesToRestic: false
        hooks: {}
        includedNamespaces:
        - perf-project-1
        metadata: {}
        storageLocation: example-velero-1
        ttl: 720h0m0s
      status:
        completionTimestamp: "2022-08-17T11:49:09Z"
        csiVolumeSnapshotsAttempted: 4
        csiVolumeSnapshotsCompleted: 4
        expiration: "2022-09-16T11:48:08Z"
        formatVersion: 1.1.0
        phase: Completed
        progress:
          itemsBackedUp: 142
          totalItems: 142
        startTimestamp: "2022-08-17T11:48:08Z"
        version: 1
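
      The backup itself completed with csiVolumeSnapshotsCompleted: 4, which relies on the VolumeSnapshotClass captured in the attached ocs-storagecluster-rbdplugin-snapclass.txt. For reference, a minimal sketch of what the Velero CSI plugin expects that class to look like (the deletionPolicy shown is an assumption, not the value from the attached file):

      apiVersion: snapshot.storage.k8s.io/v1
      kind: VolumeSnapshotClass
      metadata:
        name: ocs-storagecluster-rbdplugin-snapclass
        labels:
          # the Velero CSI plugin selects snapshot classes carrying this label
          velero.io/csi-volumesnapshot-class: "true"
      driver: openshift-storage.rbd.csi.ceph.com
      # assumed value; Delete vs. Retain decides whether the backing snapshot outlives the VolumeSnapshot
      deletionPolicy: Retain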
      

      Restore resource:

      apiVersion: velero.io/v1
      kind: Restore
      metadata:
        creationTimestamp: "2022-08-17T11:53:50Z"
        generation: 8
        name: latest-restore
        namespace: openshift-adp
        resourceVersion: "875737797"
        uid: ce0f4e06-7f03-4715-9117-47ad7505a2f9
      spec:
        backupName: latest-backup
        excludedResources:
        - nodes
        - events
        - events.events.k8s.io
        - backups.velero.io
        - restores.velero.io
        - resticrepositories.velero.io
        hooks: {}
        includedNamespaces:
        - '*'
      status:
        completionTimestamp: "2022-08-17T11:54:10Z"
        phase: Completed
        progress:
          itemsRestored: 74
          totalItems: 74
        startTimestamp: "2022-08-17T11:53:50Z"
        warnings: 8
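
      The restore reaches Completed but reports warnings: 8. Those warnings and the per-item restore log can be pulled with the standard velero CLI (listed here as a pointer only; the output is not attached):

      velero restore describe latest-restore -n openshift-adp --details
      velero restore logs latest-restore -n openshift-adp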
      

      All four pods are in Error state after the "Completed" restore:

      [root@f01-h07-000-r640 oadp-helpers]# oc get pods  -nperf-project-1
      NAME                  READY   STATUS   RESTARTS   AGE
      mariadb-1-deploy      0/1     Error    0          24m
      mongodb-1-deploy      0/1     Error    0          24m
      postgresql-1-deploy   0/1     Error    0          24m
      redis-1-deploy        0/1     Error    0          24m
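
      The pods in Error appear to be the DeploymentConfig deployer pods; their logs and the namespace events (sorted by time) make the rollout failure and the unbound-PVC scheduling errors visible. Standard checks, for completeness:

      oc logs mariadb-1-deploy -n perf-project-1
      oc get events -n perf-project-1 --sort-by=.lastTimestamp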
       

       

      The PVCs in the target namespace, all stuck in Pending:

       [root@f01-h07-000-r640 oadp-helpers]# oc get pvc -nperf-project-1
      NAME         STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
      mariadb      Pending                                      ocs-storagecluster-ceph-rbd   14m
      mongodb      Pending                                      ocs-storagecluster-ceph-rbd   14m
      postgresql   Pending                                      ocs-storagecluster-ceph-rbd   14m
      redis        Pending                                      ocs-storagecluster-ceph-rbd   14m
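
      Since the PVCs stay Pending waiting on the external provisioner, the ceph-csi rbd provisioner side is worth checking as well. A sketch of the usual commands (the deployment and container names below are the typical ODF/OCS ones and may differ between versions):

      oc describe pvc redis -n perf-project-1
      oc logs -n openshift-storage deploy/csi-rbdplugin-provisioner -c csi-provisioner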
      

      Events output from one app whose PVC is Pending:

      16m         Normal    Created                       pod/redis-1-deploy                       Created container deployment
      16m         Normal    Started                       pod/redis-1-deploy                       Started container deployment
      16m         Warning   FailedScheduling              pod/redis-1-srs76                        0/6 nodes are available: 6 pod has unbound immediate PersistentVolumeClaims.
      9m25s       Warning   FailedScheduling              pod/redis-1-srs76                        0/6 nodes are available: 6 pod has unbound immediate PersistentVolumeClaims.
      6m47s       Warning   FailedScheduling              pod/redis-1-srs76                        skip schedule deleting pod: perf-project-1/redis-1-srs76
      16m         Normal    SuccessfulCreate              replicationcontroller/redis-1            Created pod: redis-1-srs76
      6m48s       Normal    SuccessfulDelete              replicationcontroller/redis-1            Deleted pod: redis-1-srs76
      102s        Normal    ExternalProvisioning          persistentvolumeclaim/redis              waiting for a volume to be created, either by external provisioner "openshift-storage.rbd.csi.ceph.com" or manually created by system administrator
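
      To rule out the provisioner itself (the description above notes that PVCs created at pod-creation time worked fine), a throwaway PVC against the same storage class should bind within seconds. A minimal sketch; the name rbd-sanity-check is only illustrative:

      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: rbd-sanity-check
        namespace: perf-project-1
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
        storageClassName: ocs-storagecluster-ceph-rbd

      Applying this with oc apply -f and watching it with oc get pvc -n perf-project-1 -w should show it go to Bound if dynamic provisioning on ocs-storagecluster-ceph-rbd is healthy; if it also hangs in Pending, the problem is on the provisioner side rather than in the restore.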
       

       

      Version-Release number of selected component (if applicable):

      Tested and verified on OADP builds:

       * iib293185
       * iib294611

      OCP: 4.10.23

      cloud15

            Assignee: Wes Hayutin (wnstb)
            Reporter: Tzahi Ashkenazi (tzahia)
