Sub-task
Resolution: Done
Description of problem:
When the repository maintenance pod is configured to run on a specific node using node labels in the DPA config, the configuration is not respected and the pod runs only on the node where the application pod is running.
The configured resource requests and limits are also ignored.
Version-Release number of selected component (if applicable):
1.5.0
How reproducible:
Always
Steps to Reproduce:
1. Add the appropriate labels to the nodes (an example command follows these steps).
2. Configure affinity and resource settings in the DPA config under spec.configuration.repositoryMaintenance.global.
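For step 1, the labels used in this report can be applied with a command along these lines (node name and label values are taken from the environment shown under Actual results; adjust for other environments):

oc label node ip-10-0-91-209.us-east-2.compute.internal label.io/location=EU label.io/gpu=no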
Actual results:
The maintenance job runs on the wrong node, and the configured pod resources are not applied.
oc get dpa -o yaml
apiVersion: v1
items:
- apiVersion: oadp.openshift.io/v1alpha1
  kind: DataProtectionApplication
  metadata:
    creationTimestamp: "2025-05-21T08:16:15Z"
    generation: 4
    name: ts-dpa
    namespace: openshift-adp
    resourceVersion: "71587"
    uid: 99ae2b16-d8c6-48ad-bc63-2e5ed89fbb16
  spec:
    backupLocations:
    - velero:
        config:
          region: us-east-2
        credential:
          key: cloud
          name: cloud-credentials
        default: true
        objectStorage:
          bucket: oadp119731dh2k8
          prefix: velero
        provider: aws
    configuration:
      nodeAgent:
        enable: true
        uploaderType: kopia
      repositoryMaintenance:
        global:
          loadAffinity:
          - nodeSelector:
              matchExpressions:
              - key: label.io/location
                operator: In
                values:
                - EU
              matchLabels:
                label.io/gpu: "no"
          podResources:
            cpuLimit: 200m
            cpuRequest: 100m
            memoryLimit: 200Mi
            memoryRequest: 100Mi
      velero:
        defaultPlugins:
        - csi
        - aws
        - openshift
        disableFsBackup: false
        logFormat: text
  status:
    conditions:
    - lastTransitionTime: "2025-05-21T08:16:15Z"
      message: Reconcile complete
      reason: Complete
      status: "True"
      type: Reconciled
kind: List
metadata:
  resourceVersion: ""
This is the only node satisfying both labels:
oc get nodes --show-labels | grep gpu | grep EU
ip-10-0-91-209.us-east-2.compute.internal   Ready   worker   3h58m   v1.32.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m6i.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2c,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-91-209.us-east-2.compute.internal,kubernetes.io/os=linux,label.io/gpu=no,label.io/location=EU,machine.openshift.io/interruptible-instance=,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m6i.xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2c,topology.k8s.aws/zone-id=use2-az3,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2c

oc get pods -o wide
NAME                                                     READY   STATUS      RESTARTS   AGE    IP            NODE                                         NOMINATED NODE   READINESS GATES
mysql-ts-dpa-1-kopia-maintain-job-1747821617594-vcttx    0/1     Completed   0          14m    10.131.0.50   ip-10-0-14-96.us-east-2.compute.internal     <none>           <none>
node-agent-hrb6b                                         1/1     Running     0          118m   10.128.2.19   ip-10-0-91-209.us-east-2.compute.internal    <none>           <none>
node-agent-nxccd                                         1/1     Running     0          118m   10.129.2.16   ip-10-0-60-216.us-east-2.compute.internal    <none>           <none>
node-agent-p6gp7                                         1/1     Running     0          118m   10.131.0.41   ip-10-0-14-96.us-east-2.compute.internal     <none>           <none>
openshift-adp-controller-manager-788c8c458b-bj9s5        1/1     Running     0          140m   10.128.2.18   ip-10-0-91-209.us-east-2.compute.internal    <none>           <none>
velero-56b949b7b4-qpwjg                                  1/1     Running     0          109m   10.129.2.17   ip-10-0-60-216.us-east-2.compute.internal    <none>           <none>
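The maintenance job pod landed on ip-10-0-14-96 rather than the labeled node. To confirm whether the affinity and resources were propagated into its pod spec, the pod can be inspected directly (pod name taken from the listing above):

oc get pod mysql-ts-dpa-1-kopia-maintain-job-1747821617594-vcttx -n openshift-adp -o jsonpath='{.spec.affinity}{"\n"}{.spec.containers[0].resources}{"\n"}'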
BackupRepository:
oc get backuprepository -o yaml
apiVersion: v1
items:
- apiVersion: velero.io/v1
  kind: BackupRepository
  metadata:
    creationTimestamp: "2025-05-21T08:59:16Z"
    generation: 4
    labels:
      velero.io/repository-type: kopia
      velero.io/storage-location: ts-dpa-1
      velero.io/volume-namespace: mysql
    name: mysql-ts-dpa-1-kopia
    namespace: openshift-adp
    resourceVersion: "87330"
    uid: 2fe358bc-717a-4ec2-a831-2f09e2f21e0d
  spec:
    backupStorageLocation: ts-dpa-1
    maintenanceFrequency: 1h0m0s
    repositoryType: kopia
    resticIdentifier: s3:s3-us-east-2.amazonaws.com/oadp119731dh2k8/velero/restic/mysql
    volumeNamespace: mysql
  status:
    lastMaintenanceTime: "2025-05-21T10:00:22Z"
    phase: Ready
    recentMaintenance:
    - completeTimestamp: "2025-05-21T10:00:22Z"
      result: Succeeded
      startTimestamp: "2025-05-21T10:00:17Z"
kind: List
metadata:
  resourceVersion: ""
ConfigMap:
oc get cm repository-maintenance-ts-dpa -o yaml
apiVersion: v1
data:
  repository-maintenance-config: '{"global":{"loadAffinity":[{"nodeSelector":{"matchLabels":{"label.io/gpu":"no"},"matchExpressions":[{"key":"label.io/location","operator":"In","values":["EU"]}]}}],"podResources":{"cpuRequest":"100m","memoryRequest":"100Mi","cpuLimit":"200m","memoryLimit":"200Mi"}}}'
kind: ConfigMap
metadata:
  creationTimestamp: "2025-05-21T08:25:08Z"
  labels:
    app.kubernetes.io/component: repository-maintenance-config
    app.kubernetes.io/instance: ts-dpa
    app.kubernetes.io/managed-by: oadp-operator
    openshift.io/oadp: "True"
  name: repository-maintenance-ts-dpa
  namespace: openshift-adp
  ownerReferences:
  - apiVersion: oadp.openshift.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: DataProtectionApplication
    name: ts-dpa
    uid: 99ae2b16-d8c6-48ad-bc63-2e5ed89fbb16
  resourceVersion: "63042"
  uid: c7e4ad9e-f793-40d8-b163-3b17484ae370
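The ConfigMap content matches the DPA settings. The embedded JSON can be pretty-printed for readability (assuming jq is available):

oc get cm repository-maintenance-ts-dpa -n openshift-adp -o jsonpath='{.data.repository-maintenance-config}' | jq .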
velero args:
spec:
  containers:
  - args:
    - server
    - --features=EnableCSI
    - --uploader-type=kopia
    - --fs-backup-timeout=4h
    - --restore-resource-priorities=securitycontextconstraints,customresourcedefinitions,klusterletconfigs.config.open-cluster-management.io,managedcluster.cluster.open-cluster-management.io,namespaces,roles,rolebindings,clusterrolebindings,klusterletaddonconfig.agent.open-cluster-management.io,managedclusteraddon.addon.open-cluster-management.io,storageclasses,volumesnapshotclass.snapshot.storage.k8s.io,volumesnapshotcontents.snapshot.storage.k8s.io,volumesnapshots.snapshot.storage.k8s.io,datauploads.velero.io,persistentvolumes,persistentvolumeclaims,serviceaccounts,secrets,configmaps,limitranges,pods,replicasets.apps,clusterclasses.cluster.x-k8s.io,endpoints,services,-,clusterbootstraps.run.tanzu.vmware.com,clusters.cluster.x-k8s.io,clusterresourcesets.addons.cluster.x-k8s.io
    - --log-format=text
    - --disable-informer-cache=false
    - --repo-maintenance-job-configmap=repository-maintenance-ts-dpa
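Since --repo-maintenance-job-configmap points at the ConfigMap above, the maintenance Job spec itself can also be checked to see whether Velero applied the configuration when creating it (job name inferred from the pod name in the listing above):

oc get job mysql-ts-dpa-1-kopia-maintain-job-1747821617594 -n openshift-adp -o jsonpath='{.spec.template.spec.affinity}{"\n"}{.spec.template.spec.containers[0].resources}{"\n"}'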
Expected results:
The repository maintenance pod should be scheduled on the node matching the configured load affinity (ip-10-0-91-209 in this environment).
The configured resource requests and limits should also be applied to the maintenance pod.
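For reference, when the configuration is honored, the maintenance job pod spec would be expected to carry fields along these lines (a sketch derived from the DPA settings above; the exact rendering Velero produces may differ, e.g. matchLabels may be converted into additional matchExpressions):

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: label.io/location
          operator: In
          values:
          - EU
        - key: label.io/gpu
          operator: In
          values:
          - "no"
resources:
  requests:
    cpu: 100m
    memory: 100Mi
  limits:
    cpu: 200m
    memory: 200Mi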