  1. OpenShift API for Data Protection
  2. OADP-5354

BR 5.0.3 | AWS STS env | EFS+EBS + Offline OADP | Backup is failing on validation PartiallyFailed

      Epic Goal

      Client Ally Financial requires Backup and Restore in order to continue modernizing DataStage from Dev to Prod. They are blocked from going live and deploying to production because they cannot back up and restore.

       

      Description

      Problem Description:
      An offline backup taken with OADP 1.4 and Kopia on an AWS STS environment, following the OCP documentation, ends in the PartiallyFailed phase:
      cpd-cli oadp backup list | grep sts
      q833-sts-1-tenant-offline-b1 PartiallyFailed 4 0 2024-12-06 13:33:49 -0800 PST 364d sr-br-rosa-dpa-1 <none>
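When several backups are listed, rows like the one above can be filtered by phase. The following is a minimal sketch, assuming that the first two whitespace-separated fields of each data row are the backup name and its phase (as in the row quoted above) and that header rows have already been dropped. The helper names `parse_backup_row` and `failed_backups` are hypothetical and not part of `cpd-cli`.

```python
def parse_backup_row(line):
    """Split one data row of backup-list output into (name, phase).

    Assumption: the first field is the backup name and the second is
    its phase, matching the row quoted in this ticket.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"unexpected row: {line!r}")
    return fields[0], fields[1]


def failed_backups(lines):
    """Return the names of backups whose phase is not 'Completed'."""
    names = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        name, phase = parse_backup_row(line)
        if phase != "Completed":
            names.append(name)
    return names


rows = [
    "q833-sts-1-tenant-offline-b1 PartiallyFailed 4 0 "
    "2024-12-06 13:33:49 -0800 PST 364d sr-br-rosa-dpa-1 <none>",
]
print(failed_backups(rows))
```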
       
      The backup command itself reported completion:

       

        • PHASE [BACKUP CREATE/COMPLETED] ********************************************
          backup took 32m15.772674805s

      backup command completed
      post-backup hooks are invoked, but applications may take longer to come up.
      please use 'oc get po', 'oc get sts', 'oc get deploy' to check workload status.

        • PHASE [BACKUP CREATE/END] **************************************************
        • INFO [BACKUP CREATE/SUMMARY/START] *****************************************

      --------------------------------------------------------------------------------

      Scenario: BACKUP CREATE (q833-sts-1-tenant-offline-b1)
      Start Time: 2024-12-06 13:22:04.039121743 -0800 PST m=+2.798662948
      Completion Time: 2024-12-06 13:54:19.81188798 -0800 PST m=+1938.571429173
      Time Elapsed: 32m15.772766225s

      --------------------------------------------------------------------------------

        • INFO [BACKUP CREATE/SUMMARY/END] *******************************************
          Backup was not Completed
           
          Release/Build Number: ###
          cpd-cli version: cpd-cli version
          cpd-cli oadp version: cpd-cli oadp version
          case-repo-location url (snapshot#):
          olm-utils-image tag:

      Steps to reproduce:

      1. Install OADP 1.4 with role ARN
      2. Install DPA
      DPA details:

      ```
      apiVersion: v1
      items:
      - apiVersion: oadp.openshift.io/v1alpha1
        kind: DataProtectionApplication
        metadata:
          creationTimestamp: "2024-12-06T02:19:08Z"
          generation: 5
          name: sr-br-rosa-dpa
          namespace: openshift-adp
          resourceVersion: "7274091"
          uid: 55118a03-9bcf-4e68-a650-0fd1feab00dd
        spec:
          backupImages: true
          backupLocations:
          - bucket:
              cloudStorageRef:
                name: sr-br-rosa-oadp
              config:
                region: eu-central-1
              credential:
                key: credentials
                name: cloud-credentials
              default: true
              prefix: velero
          configuration:
            nodeAgent:
              enable: true
              uploaderType: kopia
            velero:
              customPlugins:
              - image: icr.io/cpopen/cpd/cpdbr-velero-plugin:5.0.3-x86_64
                name: cpdbr-velero-plugin
              defaultPlugins:
              - openshift
              - aws
          snapshotLocations:
          - velero:
              config:
                credentialsFile: /tmp/sr-br-rosa/oadp/credentials
                enableSharedConfig: "true"
                profile: default
                region: eu-central-1
              provider: aws
        status:
          conditions:
          - lastTransitionTime: "2024-12-06T02:27:08Z"
            message: Reconcile complete
            reason: Complete
            status: "True"
            type: Reconciled
      kind: List
      metadata:
        resourceVersion: ""
        selfLink: ""
      ```

      3. Configure cpd-cli oadp
      4. Start the backup
      5. Validate the backup
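
For reference when reading the DataProtectionApplication from step 2, the sketch below lists where a region is set in its spec. It is only an illustration: the dictionary is a hand-copied subset of the DPA shown in this ticket, the helper name `dpa_regions` is hypothetical, and the code does not diagnose the failure.

```python
def dpa_regions(dpa):
    """Collect (path, region) pairs from a DataProtectionApplication dict.

    Only looks at the two places a region appears in the DPA quoted in
    this ticket: backupLocations[].bucket.config and
    snapshotLocations[].velero.config.
    """
    spec = dpa.get("spec", {})
    found = []
    for i, loc in enumerate(spec.get("backupLocations", [])):
        region = loc.get("bucket", {}).get("config", {}).get("region")
        found.append((f"backupLocations[{i}].bucket.config.region", region))
    for i, loc in enumerate(spec.get("snapshotLocations", [])):
        region = loc.get("velero", {}).get("config", {}).get("region")
        found.append((f"snapshotLocations[{i}].velero.config.region", region))
    return found


# Subset of the DPA from this ticket, copied by hand for illustration.
dpa = {
    "spec": {
        "backupLocations": [
            {"bucket": {"config": {"region": "eu-central-1"}}},
        ],
        "snapshotLocations": [
            {"velero": {"config": {"region": "eu-central-1"}, "provider": "aws"}},
        ],
    }
}

for path, region in dpa_regions(dpa):
    print(path, "=", region)
```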

      Actual result:
      Failed
       
      Expected result:
      Success
       
      Cluster Credentials (if possible):
      Cluster is STS enabled
      [root@awsmanaged1011 installer-files]# ./rosa list clusters
      ID NAME STATE TOPOLOGY
      2fc8986iol46d6ghr155dg615j4j6bgt sr-br-rosa ready Classic (STS)
       
      Access to the Cluster

      Reach out if cluster credentials are necessary

       

      Additional info or Screenshots: (log files, diagnostics, system/env information etc, use the tool mentioned below)

      Refer to the URL and upload the necessary logs
      q833-sts-1-tenant-offline-b1.log

      @arie-pratama-s looked into the issue with me and traced the problem to the BackupRepository, whose credentials are expiring and cannot be refreshed:
      status:
        message: 'error to get repo options: error to get repo credentials: error get
          s3 credentials: failed to refresh cached credentials, failed to retrieve credentials,
          operation error STS: AssumeRoleWithWebIdentity, https response error StatusCode: 0, RequestID: , request send failed, Post "https://sts/..amazonaws.com/": dial
          tcp: lookup sts..amazonaws.com: no such host'
        phase: NotReady
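
The host in the lookup failure, `sts..amazonaws.com`, has an empty label where a region normally appears (regional AWS STS endpoints take the form `sts.<region>.amazonaws.com`). The sketch below pulls that label out of a message so the empty value is easy to spot. It is only an illustration: the helper name `sts_region_from_error` and the regular expression are assumptions, and the sample string is an abbreviated copy of the message above.

```python
import re

# Matches the "lookup sts.<region>.amazonaws.com" fragment of the error.
# The region group may be empty, as it is in the message in this ticket.
STS_LOOKUP = re.compile(r"lookup (sts\.([a-z0-9-]*)\.amazonaws\.com)")


def sts_region_from_error(message):
    """Return (host, region) from a DNS lookup error, or None if absent."""
    match = STS_LOOKUP.search(message)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Abbreviated copy of the BackupRepository status message.
msg = "request send failed: dial tcp: lookup sts..amazonaws.com: no such host"

host, region = sts_region_from_error(msg)
if region == "":
    print(f"{host}: region label is empty")
else:
    print(f"{host}: region is {region}")
```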
