Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: OADP 1.2.0
Affects Version/s: OADP 1.2.0
Component/s: velero
Labels:
- QE
- triaged

Activity Type:
Quality / Stability / Reliability
Workstream:

None
Blocked:
False
Blocked Reason:
None
Ready:
False
QEStatus:
ToDo
Intelligence Requested:
Market:

WSJF:
0
Risk Probability:
Very Likely
Risk Score:
0

Root Cause:
Unset
Failure Category:
Unknown

Regression:
No

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Description of problem:

While running a CSI backup of a namespace with 1000 pods, the Velero server restarted.

The backup CR failed with error "failureReason": "found a backup with status \"InProgress\" during the server starting, mark it as \"Failed\""

Version-Release number of selected component (if applicable):

OCP 4.12.9, ODF 4.12.2, OADP 1.2.0-48

Using CephRBD

How reproducible:

Steps to Reproduce:
1. Create a namespace with 1000 pods (busybox pods)
2. Run a CSI backup
3. Verify the Backup status
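
The status check in step 3 can be scripted. The following is a minimal sketch, assuming the Backup CR has been dumped as JSON (for example with `oc get backup <name> -n openshift-adp -o json`) and that the failure is exposed under `status.phase` and `status.failureReason`. The function name `classify_backup` and the sample document are hypothetical; only the failure message is taken from this report.

```python
import json

# Substring of the failureReason quoted in this report.
RESTART_MARKER = 'found a backup with status "InProgress" during the server starting'


def classify_backup(backup_json: str) -> str:
    """Classify a Backup CR (JSON text) by its status fields."""
    status = json.loads(backup_json).get("status", {})
    phase = status.get("phase", "")
    reason = status.get("failureReason", "")
    if phase == "Failed" and RESTART_MARKER in reason:
        # The backup was marked Failed because the server restarted mid-backup.
        return "failed-after-restart"
    if phase == "Completed":
        return "completed"
    return f"other:{phase or 'unknown'}"


# Illustrative sample only; a real CR carries many more fields.
sample = json.dumps({
    "status": {
        "phase": "Failed",
        "failureReason": 'found a backup with status "InProgress" during '
                         'the server starting, mark it as "Failed"',
    }
})

print(classify_backup(sample))
```

A result of `failed-after-restart` separates this bug from ordinary backup failures, which matters when the reproduction is repeated over several iterations.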

Actual results:

Velero restarted during the backup.

The backup failed.

Expected results:

Velero does not restart and the backup completes.

Additional info:

Logs and the DPA config are attached.

Velero log:

panic: reflect: slice index out of range [recovered]
panic: reflect: slice index out of range [recovered]
panic: reflect: slice index out of range

goroutine 2007 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
/remote-source/velero/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:118 +0x1f4
panic({0x20f9ae0, 0x2adc450})
/usr/lib/golang/src/runtime/panic.go:884 +0x212
encoding/json.(*encodeState).marshal.func1()
/usr/lib/golang/src/encoding/json/encode.go:327 +0x6e

level=error msg="fail to recreate VolumeSnapshotContent snapcontent-72bf9226-6b39-4a32-ac6d-65d8add49ef7: fail to retrieve VolumeSnapshotContent snapcontent-72bf9226-6b39-4a32-ac6d-65d8add49ef7 info: timed out waiting for the condition" backup=openshift-adp/csi-backup-rbd-1000pvs-iter3 controller=backup-finalizer logSource="/remote-source/velero/app/pkg/controller/backup_controller.go:1073
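
For anyone triaging the attached log, a small scan for panic lines and error-level messages may help locate the restart. This is a sketch only: it assumes the log follows the two line shapes shown above (`panic: ...` and `level=error msg="..."`), and the helper name `summarize_log` is hypothetical.

```python
import re

# "panic: <message>" with an optional trailing " [recovered]" marker.
PANIC_RE = re.compile(r"^panic: (?P<msg>.+?)(?: \[recovered\])?$")
# level=error msg="<message>", allowing backslash-escaped characters inside.
ERROR_RE = re.compile(r'level=error msg="(?P<msg>(?:[^"\\]|\\.)*)"')


def summarize_log(log_text: str):
    """Return (unique panic messages, error messages) found in the log text."""
    panics, errors = [], []
    for raw in log_text.splitlines():
        line = raw.strip()
        match = PANIC_RE.match(line)
        if match:
            msg = match.group("msg")
            if msg not in panics:  # repeated "[recovered]" lines collapse to one
                panics.append(msg)
            continue
        match = ERROR_RE.search(line)
        if match:
            errors.append(match.group("msg"))
    return panics, errors


# Lines copied from the excerpt in this report (error line shortened at msg end).
sample_log = """\
panic: reflect: slice index out of range [recovered]
panic: reflect: slice index out of range [recovered]
panic: reflect: slice index out of range
panic({0x20f9ae0, 0x2adc450})
level=error msg="fail to recreate VolumeSnapshotContent snapcontent-72bf9226-6b39-4a32-ac6d-65d8add49ef7: timed out waiting for the condition" controller=backup-finalizer
"""

panics, errors = summarize_log(sample_log)
print(panics)
print(len(errors))
```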


velero-restart.tar
27.66 MB
2023/05/01 9:04 AM

Assignee: Shubham Pampattiwar

Reporter: David Vaanunu

QA Contact: David Vaanunu

Votes: 0

Watchers: 4

Created: 2023/05/01 9:04 AM

Updated: 2025/08/08 9:11 PM

Resolved: 2023/05/11 10:09 AM
