OpenShift API for Data Protection / OADP-4149

Backup of an XFS volume fails when the volume is filled to 100%


      Description of problem:

      When using Data Mover as described in "Backing up and restoring CSI snapshots data movement" to back up persistent data, the Backup fails when the volume is formatted with the XFS filesystem type and is filled to 100%.

      The backup pod in openshift-adp fails to start and reports the Events below. Interestingly, a volume with filesystem type EXT4 that is also filled to 100% does not show similar problems.

      11m         Warning   Failed                                     pod/backup-8-gnkw2                             Error: relabel failed /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount: lsetxattr /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount/data-xfs-103: no space left on device
      9m59s       Warning   Failed                                     pod/backup-8-gnkw2                             Error: relabel failed /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount: lsetxattr /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount/data-xfs-10: no space left on device
      9m22s       Warning   Failed                                     pod/backup-8-gnkw2                             Error: relabel failed /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount: lsetxattr /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount/data-xfs-105: no space left on device
      11m         Warning   Failed                                     pod/backup-8-gnkw2                             Error: relabel failed /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount: lsetxattr /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount/data-xfs-109: no space left on device
      10m         Warning   Failed                                     pod/backup-8-gnkw2                             Error: relabel failed /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount: lsetxattr /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount/data-xfs-1: no space left on device
      10m         Warning   Failed                                     pod/backup-8-gnkw2                             Error: relabel failed /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount: lsetxattr /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount/data-xfs-102: no space left on device
      9m46s       Warning   Failed                                     pod/backup-8-gnkw2                             Error: relabel failed /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount: lsetxattr /var/lib/kubelet/pods/3acf031e-aa17-49af-ae82-2f622de2fe34/volumes/kubernetes.io~csi/pvc-684b9cc5-df89-4043-a7c4-b7b37d56512c/mount/data-xfs-101: no space left on device
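
      The same events can be pulled directly from the namespace while the pod is failing; a minimal sketch:

      # list the relabel failures reported against the backup pod
      oc get events -n openshift-adp --field-selector reason=Failed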
      

      Version-Release number of selected component (if applicable):

      OpenShift API for Data Protection 1.3.1

      How reproducible:

      Always

      Steps to Reproduce:

      1. Set up OpenShift Container Platform 4 and install OpenShift API for Data Protection 1.3.1
      2. Configure a DataProtectionApplication as per the example below (the example uses MinIO, but the same problem is expected with any other storage location)

      apiVersion: oadp.openshift.io/v1alpha1
      kind: DataProtectionApplication
      metadata:
        name: oadp-minio
        namespace: openshift-adp
      spec:
        backupLocations:
        - velero:
            config:
              insecureSkipTLSVerify: "true"
              profile: default
              region: minio
              s3ForcePathStyle: "true"
              s3Url: https://<svc-ip>
            credential:
              key: cloud
              name: cloud-credentials
            default: true
            objectStorage:
              bucket: backup
              prefix: velero
            provider: aws
        configuration:
          nodeAgent:
            enable: true
            uploaderType: kopia
          velero:
            defaultPlugins:
            - openshift
            - csi
            - aws
        snapshotLocations:
        - velero:
            config:
              profile: default
              region: minio
            provider: aws
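
      Once applied, it may help to confirm the manifest reconciled and the generated storage location is usable before proceeding; a minimal verification sketch (the BackupStorageLocation name oadp-minio-1 is taken from the Backup manifest in step 6):

      oc get dataprotectionapplication -n openshift-adp oadp-minio
      # the generated BackupStorageLocation should report Phase: Available
      oc get backupstoragelocations -n openshift-adp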
      

      3. Create a custom StorageClass to have the volume formatted with the XFS filesystem

      allowVolumeExpansion: true
      apiVersion: storage.k8s.io/v1
      kind: StorageClass
      metadata:
        name: gp3-csi-custom
      parameters:
        csi.storage.k8s.io/fstype: xfs
        encrypted: "true"
        type: gp3
      provisioner: ebs.csi.aws.com
      reclaimPolicy: Delete
      volumeBindingMode: WaitForFirstConsumer
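
      A quick check that the StorageClass carries the XFS parameter (a sketch; the backslashes escape the dots in the parameter key for jsonpath):

      oc get storageclass gp3-csi-custom \
        -o jsonpath='{.parameters.csi\.storage\.k8s\.io/fstype}'
      # expected output: xfs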
      

      4. Create an application with an XFS-formatted volume attached (the Deployment below mounts two PersistentVolumeClaims, data and data-xfs; a sketch of the missing PVC manifest follows it)

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        labels:
          app: curl
          app.kubernetes.io/component: curl
          app.kubernetes.io/instance: curl
          app.kubernetes.io/name: curl
          app.kubernetes.io/part-of: curl
          app.openshift.io/runtime: fedora
          app.openshift.io/runtime-namespace: project-100
        name: curl
        namespace: project-100
      spec:
        progressDeadlineSeconds: 600
        replicas: 1
        revisionHistoryLimit: 10
        selector:
          matchLabels:
            app: curl
        strategy:
          rollingUpdate:
            maxSurge: 25%
            maxUnavailable: 25%
          type: RollingUpdate
        template:
          metadata:
            annotations:
              openshift.io/generated-by: OpenShiftWebConsole
            creationTimestamp: null
            labels:
              app: curl
              deployment: curl
          spec:
            containers:
            - image: quay.io/rhn_support_sreber/curl:latest
              imagePullPolicy: Always
              name: curl
              ports:
              - containerPort: 8080
                protocol: TCP
              resources: {}
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              volumeMounts:
              - mountPath: /data
                name: data
              - mountPath: /data-xfs
                name: data-xfs
            dnsPolicy: ClusterFirst
            restartPolicy: Always
            schedulerName: default-scheduler
            securityContext: {}
            terminationGracePeriodSeconds: 30
            volumes:
            - name: data
              persistentVolumeClaim:
                claimName: data
            - name: data-xfs
              persistentVolumeClaim:
                claimName: data-xfs
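
      The two PersistentVolumeClaims referenced above (data and data-xfs) are not included in this report; a minimal sketch of the data-xfs claim, assuming the custom StorageClass from step 3 and an arbitrary 1Gi size:

      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: data-xfs
        namespace: project-100
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi                    # size chosen for illustration only
        storageClassName: gp3-csi-custom    # XFS StorageClass from step 3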
      

      5. Write to the XFS volume in the application until it is full

      oc rsh -n project-100 curl-XXXXXXXX-XXXX
      # write 1 MiB files until dd fails with "No space left on device"
      i=0
      while dd if=/dev/random of=/data-xfs/data-xfs-$i bs=1024K count=1; do
        i=$((i + 1))
      done
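
      Before running the backup, confirm the volume really is at 100% utilization (a sketch, run inside the application pod):

      # Use% should read 100% for the XFS mount
      df -h /data-xfs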
      

      6. Run the backup

      apiVersion: velero.io/v1
      kind: Backup
      metadata:
        name: backup
        labels:
          velero.io/storage-location: oadp-minio-1
        namespace: openshift-adp
      spec:
        hooks: {}
        defaultVolumesToFsBackup: false
        snapshotMoveData: true
        includedNamespaces:
        - project-100
        storageLocation: oadp-minio-1
        ttl: 720h0m0s
        volumeSnapshotLocations:
        - oadp-minio-1
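
      While the backup runs, its phase and the Data Mover transfer can be watched from the openshift-adp namespace; a minimal sketch:

      # in this scenario the phase stays InProgress until the timeout
      oc get backup -n openshift-adp backup -o jsonpath='{.status.phase}'
      # DataUpload objects are created by the built-in Data Mover in OADP 1.3
      oc get datauploads -n openshift-adp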
      

      Actual results:

      The Backup remains in progress until the timeout is reached, while the backup pod in openshift-adp keeps reporting the same relabel failed ... no space left on device events shown in the Description above.

      

      Expected results:

      The backup should either complete or immediately report that it cannot be taken because the volume is 100% full and the process would therefore fail.

      Additional info:

      As highlighted before, this only happens with the XFS filesystem; when using EXT4, no such problem is reported.
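
      A plausible explanation (an assumption, not confirmed in this report): the kubelet's SELinux relabel sets a security.selinux extended attribute on every file via lsetxattr, and on a completely full XFS volume that write can require allocating a new block and fail with ENOSPC, whereas ext4 can often store the label inline in the inode. The underlying failure can be reproduced outside Kubernetes; a minimal sketch, assuming a 100% full XFS filesystem mounted at the hypothetical path /mnt/xfs:

      # setting an SELinux-style xattr on a full XFS volume fails the same
      # way the kubelet relabel does (path and file name are illustrative)
      setfattr -n security.selinux \
        -v "system_u:object_r:container_file_t:s0" /mnt/xfs/data-xfs-1
      # setfattr: /mnt/xfs/data-xfs-1: No space left on device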
