OpenShift API for Data Protection, OADP-5095

[GCP-WIF] Backup fails when using fs or native DataMover


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • OADP 1.4.2
    • OADP 1.3.0, OADP 1.4.0
    • oadp-operator

      Description of problem:

       

      Version-Release number of selected component (if applicable):

      oadp-operator-bundle-container-1.3.0-117

      GCP cluster installed in manual mode with GCP Workload Identity configured

      How reproducible:

      100%

      Steps to Reproduce:
      1. Follow the instructions in the GCP WIF support design document:
      https://github.com/kaovilai/oadp-operator/blob/d0529a790fc83737d27d2de1f8f1c08a1a79c17a/docs/design/gcp-wif-support_design.md

      2. Create a DPA manifest that includes the dpa.spec.configuration.nodeAgent section:

      nodeAgent:
        enable: true
        uploaderType: kopia 

      3. Deploy the MySQL application using the oadp-apps-deployer repository:
      https://gitlab.cee.redhat.com/app-mig/oadp-apps-deployer 

      appm deploy -f ocp-mysql -n a1 

      4. Create a backup of the MySQL application namespace (a1) with:

      defaultVolumesToFsBackup: true 
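
For reference, step 4 can be written out as a manifest. The following is a minimal sketch reconstructed from the Backup object shown under "Additional info" (name b2, namespace openshift-adp, included namespace a1, storage location default). The file name backup-b2.yaml is an assumption chosen for illustration, and the fields that Velero fills in by default (ttl, timeouts) are left out.

```shell
# Write a minimal Backup manifest for namespace a1 that requests
# file-system backup of all pod volumes (defaultVolumesToFsBackup).
# The file name backup-b2.yaml is an illustrative assumption.
cat > backup-b2.yaml <<'EOF'
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: b2
  namespace: openshift-adp
spec:
  includedNamespaces:
  - a1
  defaultVolumesToFsBackup: true
  storageLocation: default
EOF
```

The manifest can then be applied with `oc apply -f backup-b2.yaml`, and the resulting phase inspected with `oc get backup b2 -n openshift-adp -o yaml`.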

      Actual results:

      The backup fails with the status:

      phase: PartiallyFailed 

      Expected results:

      The backup should finish successfully with the status:

      phase: Completed  

      Additional info:
      DPA:

      oc get dpa gcp-dpa -o yaml
      apiVersion: oadp.openshift.io/v1alpha1
      kind: DataProtectionApplication
      metadata:
        annotations:
          meta.helm.sh/release-name: gcp
          meta.helm.sh/release-namespace: openshift-adp
        creationTimestamp: "2023-10-12T11:51:01Z"
        generation: 1
        labels:
          app.kubernetes.io/managed-by: Helm
        name: gcp-dpa
        namespace: openshift-adp
        resourceVersion: "184074"
        uid: 9804a33a-eff5-4043-a09d-1432dbb4e087
      spec:
        backupLocations:
        - name: default
          velero:
            credential:
              key: service_account.json
              name: cloud-credentials
            default: true
            objectStorage:
              bucket: oadpbucket239331
              prefix: velero
            provider: gcp
        configuration:
          nodeAgent:
            enable: true
            uploaderType: kopia
          velero:
            defaultPlugins:
            - openshift
            - csi
            - gcp
      status:
        conditions:
        - lastTransitionTime: "2023-10-12T11:51:01Z"
          message: Reconcile complete
          reason: Complete
          status: "True"
          type: Reconciled 

      BSL:

      oc get bsl
      NAME      PHASE       LAST VALIDATED   AGE   DEFAULT
      default   Available   32s              23m   true 

      Pods:

      oc get pod
      NAME                                                READY   STATUS    RESTARTS   AGE
      node-agent-8psg9                                    1/1     Running   0          23m
      node-agent-kncd5                                    1/1     Running   0          23m
      node-agent-mmtts                                    1/1     Running   0          23m
      openshift-adp-controller-manager-6f8d96f99f-ltf7t   1/1     Running   0          5h28m
      velero-6cd76b5d74-sbxmt                             1/1     Running   0          23m 

      Backup:

      oc get backup b2 -o yaml  
      apiVersion: velero.io/v1
      kind: Backup
      metadata:
        annotations:
          meta.helm.sh/release-name: gcp
          meta.helm.sh/release-namespace: openshift-adp
          velero.io/resource-timeout: 10m0s
          velero.io/source-cluster-k8s-gitversion: v1.27.6+98158f9
          velero.io/source-cluster-k8s-major-version: "1"
          velero.io/source-cluster-k8s-minor-version: "27"
        creationTimestamp: "2023-10-12T12:09:13Z"
        generation: 6
        labels:
          app.kubernetes.io/managed-by: Helm
          velero.io/storage-location: default
        name: b2
        namespace: openshift-adp
        resourceVersion: "193443"
        uid: ba58c9b1-cddf-482f-8e19-c9e5759a22f2
      spec:
        csiSnapshotTimeout: 10m0s
        defaultVolumesToFsBackup: true
        includedNamespaces:
        - a1
        itemOperationTimeout: 4h0m0s
        storageLocation: default
        ttl: 720h0m0s
      status:
        completionTimestamp: "2023-10-12T12:09:20Z"
        errors: 1
        expiration: "2023-11-11T12:09:13Z"
        formatVersion: 1.1.0
        phase: PartiallyFailed
        progress:
          itemsBackedUp: 31
          totalItems: 31
        startTimestamp: "2023-10-12T12:09:13Z"
        version: 1 

      Velero logs:

      oc logs velero-6cd76b5d74-sbxmt | grep error
      Defaulted container "velero" out of: velero, openshift-velero-plugin (init), velero-plugin-for-csi (init), velero-plugin-for-gcp (init)
      time="2023-10-12T11:51:29Z" level=error msg="Current BackupStorageLocations available/unavailable/unknown: 0/0/1)" controller=backup-storage-location logSource="/remote-source/velero/app/pkg/controller/backup_storage_location_controller.go:194"
      time="2023-10-12T12:09:18Z" level=info msg="1 errors encountered backup up item" backup=openshift-adp/b2 logSource="/remote-source/velero/app/pkg/backup/backup.go:444" name=mysql-68d84d7c89-qhtp6
      time="2023-10-12T12:09:18Z" level=error msg="Error backing up item" backup=openshift-adp/b2 error="failed to wait BackupRepository: backup repository is not ready: error to init backup repo: error to connect to storage: unable to initialize token source: google.JWTConfigFromJSON: google: read JWT from JSON credentials: 'type' field is \"external_account\" (expected \"service_account\")" error.file="/remote-source/velero/app/pkg/repository/backup_repo_op.go:83" error.function=github.com/vmware-tanzu/velero/pkg/repository.GetBackupRepository logSource="/remote-source/velero/app/pkg/backup/backup.go:448" name=mysql-68d84d7c89-qhtp6
       

      BackupRepository:

      oc get backuprepository a1-default-kopia-jtlsd -o yaml
      apiVersion: velero.io/v1
      kind: BackupRepository
      metadata:
        creationTimestamp: "2023-10-12T12:09:18Z"
        generateName: a1-default-kopia-
        generation: 3
        labels:
          velero.io/repository-type: kopia
          velero.io/storage-location: default
          velero.io/volume-namespace: a1
        name: a1-default-kopia-jtlsd
        namespace: openshift-adp
        resourceVersion: "193420"
        uid: 7a6563f7-d650-4fb5-9d71-b577fb955bb8
      spec:
        backupStorageLocation: default
        maintenanceFrequency: 1h0m0s
        repositoryType: kopia
        resticIdentifier: gs:oadpbucket239331:/velero/restic/a1
        volumeNamespace: a1
      status:
        message: 'error to init backup repo: error to connect to storage: unable to initialize
          token source: google.JWTConfigFromJSON: google: read JWT from JSON credentials:
          ''type'' field is "external_account" (expected "service_account")'
        phase: NotReady 
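
The status message above says the repository code path expected a credential of type "service_account" but found "external_account", which is the type used for Workload Identity Federation. A quick way to confirm which type a credentials file carries is sketched below. The helper name cred_type and the sample file are hypothetical, and the sed-based extraction assumes a well-formed credentials JSON in which the top-level "type" key is the first "type" key to appear. A proper JSON parser would be more robust.

```shell
# cred_type: print the value of the first "type" key found in a Google
# credentials JSON document. Reads the file named by $1, or stdin if no
# argument is given. Hypothetical helper; assumes the top-level "type"
# key is the first "type" key in the input.
cred_type() {
  cat "${1:--}" \
    | sed -n 's/.*"type"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
    | head -n 1
}

# Illustrative sample only: a real WIF credential file has more fields.
printf '{"type": "external_account"}\n' > sample-credentials.json

case "$(cred_type sample-credentials.json)" in
  service_account)  echo "service account key" ;;
  external_account) echo "WIF (external account) credential" ;;
  *)                echo "unrecognized credential type" ;;
esac
```

Against the cluster in this report, the same helper could be fed from the secret referenced by the DPA, for example `oc get secret cloud-credentials -n openshift-adp -o jsonpath='{.data.service_account\.json}' | base64 -d | cred_type`. This assumes the secret name and key shown in the DPA above.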

              tkaovila@redhat.com Tiger Kaovilai
              wnstb Wes Hayutin