- Bug
- Resolution: Done-Errata
- Undefined
- None
- 4.14
- Important
- No
- False
This is a clone of issue OCPBUGS-19555. The following is the description of the original issue:
—
Description of problem:
Cluster backup consistently fails when a ClusterGroupUpgrade (CGU) is created with backup: true.
Version-Release number of selected component (if applicable):
TALM v4.14.0-62, OCP 4.14.0-rc.1
How reproducible:
Always
Steps to Reproduce:
1. Install hub cluster with OCP 4.14.0-rc.1
2. Install latest TALM on hub cluster
3. Provision managed cluster with OCP 4.14.0-rc.1
4. Create a CGU with backup: true (a sample CR is sketched after these steps)
5. Enable CGU
6. CGU fails with backup status: UnrecoverableError
7. View backup agent pod logs on managed cluster
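For reference, a minimal CGU of the kind used in steps 4-5 can be created as sketched below. This is a sketch only: the name, namespace, spoke cluster name, and timeout are placeholder values, not taken from this report.

cat <<EOF | oc apply -f -
apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: cgu-backup-test   # placeholder name
  namespace: ztp-install  # placeholder namespace
spec:
  backup: true            # step 4: request a cluster backup before remediation
  enable: true            # step 5: enable the CGU
  clusters:
    - spoke1              # placeholder managed cluster name
  managedPolicies: []
  remediationStrategy:
    maxConcurrency: 1
    timeout: 240          # minutes; placeholder
EOF

The per-cluster backup state from step 6 can then be inspected on the CGU, e.g. with oc get cgu cgu-backup-test -n ztp-install -o jsonpath='{.status.backup}'.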
Actual results:
The backup fails; the CGU reports backup status UnrecoverableError.
Expected results:
The backup should succeed.
Additional info:
[kni@registry auth]$ oc logs -n openshift-talo-backup backup-agent-jnt9p --follow
INFO[0002] Successfully remounted /host/sysroot with r/w permission
INFO[0002] ------------------------------------------------------------
INFO[0002] Cleaning up old content...
INFO[0002] ------------------------------------------------------------
INFO[0002] fullpath: /var/recovery/upgrade-recovery.sh
INFO[0002] fullpath: /var/recovery/cluster
INFO[0002] fullpath: /var/recovery/etc.exclude.list
INFO[0002] fullpath: /var/recovery/etc
INFO[0002] fullpath: /var/recovery/local
INFO[0002] fullpath: /var/recovery/kubelet
INFO[0025] fullpath: /var/recovery/extras.tgz
INFO[0025] Old directories deleted with contents
INFO[0025] Old contents have been cleaned up
INFO[0031] Available disk space : 456.74 GiB; Estimated disk space required for backup: 32.28 GiB
INFO[0031] Sufficient disk space found to trigger backup
INFO[0031] Upgrade recovery script written
INFO[0031] Running: bash -c /var/recovery/upgrade-recovery.sh --take-backup --dir /var/recovery
INFO[0031] ##### Thu Sep 21 14:00:48 UTC 2023: Taking backup
INFO[0031] ##### Thu Sep 21 14:00:48 UTC 2023: Wiping previous deployments and pinning active
INFO[0031] error: Out of range deployment index 1, expected < 1
INFO[0031] Deployment 0 is already pinned
INFO[0031] ##### Thu Sep 21 14:00:48 UTC 2023: Backing up container cluster and required files
INFO[0031] Certificate /etc/kubernetes/static-pod-certs/configmaps/etcd-serving-ca/ca-bundle.crt is missing. Checking in different directory
INFO[0031] Certificate /etc/kubernetes/static-pod-resources/etcd-certs/configmaps/etcd-serving-ca/ca-bundle.crt found!
INFO[0031] found latest kube-apiserver: /etc/kubernetes/static-pod-resources/kube-apiserver-pod-9
INFO[0031] found latest kube-controller-manager: /etc/kubernetes/static-pod-resources/kube-controller-manager-pod-6
INFO[0031] found latest kube-scheduler: /etc/kubernetes/static-pod-resources/kube-scheduler-pod-6
INFO[0031] found latest etcd: /etc/kubernetes/static-pod-resources/etcd-pod-2
INFO[0031] etcdctl is already installed
INFO[0031] etcdutl is already installed
INFO[0031] {"level":"info","ts":"2023-09-21T14:00:48.48003Z","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/var/recovery/cluster/snapshot_2023-09-21_140048__POSSIBLY_DIRTY__.db.part"}
INFO[0031] {"level":"info","ts":"2023-09-21T14:00:48.490246Z","logger":"client","caller":"v3@v3.5.9/maintenance.go:212","msg":"opened snapshot stream; downloading"}
INFO[0031] {"level":"info","ts":"2023-09-21T14:00:48.49028Z","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"https://10.46.46.66:2379"}
INFO[0033] {"level":"info","ts":"2023-09-21T14:00:50.158759Z","logger":"client","caller":"v3@v3.5.9/maintenance.go:220","msg":"completed snapshot read; closing"}
INFO[0033] {"level":"info","ts":"2023-09-21T14:00:50.407955Z","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"https://10.46.46.66:2379","size":"115 MB","took":"1 second ago"}
INFO[0033] {"level":"info","ts":"2023-09-21T14:00:50.408049Z","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/var/recovery/cluster/snapshot_2023-09-21_140048__POSSIBLY_DIRTY__.db"}
INFO[0033] Snapshot saved at /var/recovery/cluster/snapshot_2023-09-21_140048__POSSIBLY_DIRTY__.db
INFO[0033] {"hash":1281395486,"revision":693323,"totalKey":7404,"totalSize":115171328}
INFO[0033] snapshot db and kube resources are successfully saved to /var/recovery/cluster
INFO[0034] Command succeeded: cp -Ra /etc/ /var/recovery/
INFO[0034] Command succeeded: cp -Ra /usr/local/ /var/recovery/
INFO[0099] Command succeeded: cp -Ra /var/lib/kubelet/ /var/recovery/
INFO[0099] tar: Removing leading `/' from member names
INFO[0099] tar: /var/lib/ovn-ic/etc/enable_dynamic_cpu_affinity: Cannot stat: No such file or directory
INFO[0099] tar: Exiting with failure status due to previous errors
INFO[0099] ##### Thu Sep 21 14:01:55 UTC 2023: Failed to backup additional managed files
ERRO[0099] exit status 1
Error: exit status 1
Usage:
  upgrade-recovery launchBackup [flags]

Flags:
  -h, --help   help for launchBackup

exit status 1
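Per the log, the fatal error appears to be the final tar step that builds extras.tgz: /var/lib/ovn-ic/etc/enable_dynamic_cpu_affinity is in the set of files to archive but does not exist on the node, so tar exits non-zero and upgrade-recovery.sh aborts with "Failed to backup additional managed files". (The earlier "Out of range deployment index 1, expected < 1" message is not the failure; the script continues past it, noting "Deployment 0 is already pinned".) A minimal sketch of a guard that would tolerate missing paths, assuming the script collects the extras file list in a shell array — the EXTRAS variable here is hypothetical and the real script may be organized differently:

existing=()
for f in "${EXTRAS[@]}"; do
  [ -e "$f" ] && existing+=("$f")  # skip paths absent on this node
done
if [ "${#existing[@]}" -gt 0 ]; then
  tar czf /var/recovery/extras.tgz "${existing[@]}"
fi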
- clones: OCPBUGS-19555 Cluster Backup Fails in upgrade-recovery.sh (Closed)
- is blocked by: OCPBUGS-19555 Cluster Backup Fails in upgrade-recovery.sh (Closed)
- links to: RHEA-2023:112754 OpenShift Container Platform 4.14.0 CNF vRAN extras update