Bug
Resolution: Unresolved
Major
OADP 1.2.0
Description of problem:
Deployed the OADP stage version and ran Data Mover backup & restore.
Backup & restore using Ceph-RBD ended with 'Completed' status.
When running a backup using CephFS, some of the "volsync-src-vsb" pods ended in Error status.
- oc logs volsync-src-vsb-br9vl-rep-src-npgbp
Starting container
VolSync restic container version: ACM-0.7.1-8af0bf0
backup
restic 0.15.1 compiled with go1.19.6 on linux/amd64
Testing mandatory env variables
== Checking directory for content ===
ls: cannot open directory '/data': Permission denied
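The `Permission denied` on `/data` suggests the mover container's UID/GID does not match the ownership or mode of the CephFS-provisioned volume. A minimal sketch of checking this (the helper name is mine; in-cluster you would run the `stat` via `oc exec` against the failing mover pod):

```shell
# Hypothetical helper: report mode and ownership of a mounted data directory,
# the way one would inspect /data inside the failing mover pod.
check_mount_perms() {
  stat -c 'mode=%a uid=%u gid=%g' "$1"
}

# In-cluster usage (assumption):
#   oc exec -n openshift-adp <mover-pod> -- stat -c 'mode=%a uid=%u gid=%g' /data
```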
- oc describe pod volsync-src-vsb-br9vl-rep-src-npgbp
Containers:
restic:
Container ID: cri-o://73b24708dd2137b9ac75906d4b314c24ee69ce0c0232cf4096434d474b9e26b8
Image: registry.redhat.io/rhacm2/volsync-rhel8@sha256:7207ea4de4a8bb3a2930b974c2122215cb902ab577e4ef1de6e635fd854b6d0a
Image ID: registry.redhat.io/rhacm2/volsync-rhel8@sha256:7207ea4de4a8bb3a2930b974c2122215cb902ab577e4ef1de6e635fd854b6d0a
Port: <none>
Host Port: <none>
Command:
/mover-restic/entry.sh
Args:
backup
State: Terminated
Reason: Error
Exit Code: 2
Started: Thu, 08 Jun 2023 12:36:44 +0000
Finished: Thu, 08 Jun 2023 12:36:44 +0000
Ready: False
Restart Count: 0
Environment:
FORGET_OPTIONS: --keep-last 1
DATA_DIR: /data
RESTIC_CACHE_DIR: /cache
RESTORE_AS_OF:
SELECT_PREVIOUS: 0
Mounts:
/cache from cache (rw)
/data from data (rw)
/tmp from tempdir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-sz4jb (ro)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m3s default-scheduler Successfully assigned openshift-adp/volsync-src-vsb-br9vl-rep-src-npgbp to worker003-r640
Warning FileSystemResizeFailed 2m3s kubelet MountVolume.NodeExpandVolume failed for volume "pvc-04c18c0d-b4cc-49dc-8868-59b83d27fac0" requested read-only file system
Normal SuccessfulAttachVolume 2m3s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-f0e153b3-0c42-4373-9922-e5823cb787c9"
Normal AddedInterface 115s multus Add eth0 [10.128.5.186/23] from openshift-sdn
Normal Pulled 115s kubelet Container image "registry.redhat.io/rhacm2/volsync-rhel8@sha256:7207ea4de4a8bb3a2930b974c2122215cb902ab577e4ef1de6e635fd854b6d0a" already present on machine
Normal Created 115s kubelet Created container restic
Normal Started 115s kubelet Started container restic
[root@f01-h07-000-r640 backup]# oc get pods
NAME READY STATUS RESTARTS AGE
openshift-adp-controller-manager-74bb4d7cd6-7xlgw 1/1 Running 0 29m
velero-7ff77489bf-qkh9h 1/1 Running 0 27m
volsync-src-vsb-2xx72-rep-src-67z29 0/1 Error 0 4m21s
volsync-src-vsb-2xx72-rep-src-8kvj6 0/1 Error 0 4m39s
volsync-src-vsb-2xx72-rep-src-b2mp4 0/1 Error 0 2m59s
volsync-src-vsb-2xx72-rep-src-mqpc6 0/1 Error 0 88s
volsync-src-vsb-2xx72-rep-src-nk6ct 0/1 Error 0 4m48s
volsync-src-vsb-2xx72-rep-src-rpzlq 0/1 Error 0 3m47s
volsync-src-vsb-5m94b-rep-src-fxcvg 0/1 Error 0 3m
volsync-src-vsb-5m94b-rep-src-mnsvj 0/1 Error 0 3m47s
volsync-src-vsb-5m94b-rep-src-q4t22 0/1 Error 0 4m48s
volsync-src-vsb-5m94b-rep-src-rf9tk 0/1 Error 0 4m39s
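The failures in the listing above can be quantified with a small filter (the helper name is mine; it only parses the standard `oc get pods` columns):

```shell
# Count volsync-src mover pods whose STATUS column reads "Error"
# in `oc get pods` output supplied on stdin.
count_error_pods() {
  awk '$1 ~ /^volsync-src/ && $3 == "Error" { n++ } END { print n+0 }'
}

# Usage: oc get pods -n openshift-adp | count_error_pods
```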
Version-Release number of selected component (if applicable):
OCP 4.12.9
ODF 4.12.3
OADP 1.2.0 Stage
Ceph-FS
How reproducible:
Steps to Reproduce:
1. Create a namespace with 100 pods (6G PV size, 2G usage) over CephFS
2. Run the Data Mover backup flow
3. Monitor the openshift-adp namespace pods
4. Pods end up in 'Error' status
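Step 1 above can be scripted. A sketch that only generates the PVC manifests for the test namespace (the 6Gi size follows the report; the storage class name, PVC names, and access mode are assumptions):

```shell
# Generate N PersistentVolumeClaim manifests backed by CephFS.
# gen_cephfs_pvcs <count> [storage-class]
gen_cephfs_pvcs() {
  n="$1"
  sc="${2:-ocs-storagecluster-cephfs}"  # assumed ODF CephFS storage class name
  i=1
  while [ "$i" -le "$n" ]; do
    cat <<EOF
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc-$i
spec:
  accessModes: [ReadWriteMany]
  storageClassName: $sc
  resources:
    requests:
      storage: 6Gi
EOF
    i=$((i + 1))
  done
}

# Usage (hypothetical namespace): gen_cephfs_pvcs 100 | oc apply -n cephfs-test -f -
```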
Actual results:
Backup failed. Some of the volsync-src-vsb pods are in 'Error' status.
Expected results:
Backup completed
Additional info:
~50% of the volsync-src-vsb pods succeeded.
Tried with OADP 1.2.0 builds 78 & 79: backup with CephFS completed OK.