-
Bug
-
Resolution: Unresolved
rhos-ops-day1day2-upgrades
Moderate
During RHOSO EDPM deployments, the Ansible jobs created for dataplane services fail to converge when an OpenStackDataPlaneDeployment is configured with multiple compute nodesets that have long names.
The job name is constructed from the service name, deployment name, and nodeset name, and it is truncated when it exceeds the 63-character limit.
Due to this truncation:
- Different nodesets resolve to the same Job name
- A single job is shared across multiple nodesets
- Job metadata (labels and hashes) corresponds to only one nodeset, while others also attempt to reconcile against it.
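The collision can be reproduced outside the cluster with a few lines of shell. This is only a sketch: `job_name` is a hypothetical helper, and it assumes the three parts are joined with hyphens and the result is cut to its first 63 characters. That assumption is consistent with the job name seen in this environment, but it is not taken from the operator source.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the operator's code): join service, deployment and
# nodeset names with hyphens, then keep only the first 63 characters.
job_name() {
  local full="$1-$2-$3"
  printf '%s\n' "${full:0:63}"
}

service="pre-adoption-validation"
deployment="compute-pre-adoption"

# Nodeset names taken from this report; each one is 31 characters long.
for nodeset in \
  openstack-edpm-computer650-set0 \
  openstack-edpm-computer650-set1 \
  openstack-edpm-computer660-set0; do
  echo "$nodeset -> $(job_name "$service" "$deployment" "$nodeset")"
done
```

Under these assumptions every nodeset maps to the same 63-character name, because the part that distinguishes the nodesets falls after the cut.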
As reconciliation proceeds, each nodeset computes a different desired job hash. Because the job name is shared, the controller repeatedly detects a hash change while the job is still running, which triggers continuous retries and job/pod re-creation. The result is an endless cycle of job creation and reconciliation loops.
// all nodesets names have 31 chars
[root@e18-h18-000-r660 ~]# for ns in $(oc get osdpns -n openstack -o name | grep -E "computer6[56]0"); do
  NAME=$(echo $ns | sed "s|.*/||")
  echo "$NAME (${#NAME} chars)"
done
openstack-edpm-computer650-set0 (31 chars)
openstack-edpm-computer650-set1 (31 chars)
openstack-edpm-computer650-set2 (31 chars)
openstack-edpm-computer660-set0 (31 chars)
openstack-edpm-computer660-set1 (31 chars)

// number of jobs created is only one for pre-adoption-validation service for all nodesets due to truncation.
// the actual names should be pre-adoption-validation-compute-pre-adoption-openstack-edpm-computer650-set0, ....computer650-set1.
// suffixes that makes the name unique are being truncated.. resulting in single job creation.
[root@e18-h18-000-r660 ~]# oc get jobs -n openstack | grep pre-adoption-validation-compute
pre-adoption-validation-compute-pre-adoption-openstack-edpm-com   Running   0/1   10s   10s

// logs that report hash changes and recreates the jobs
[root@e18-h18-000-r660 ~]# oc logs -n openstack-operators deployment/openstack-operator-controller-manager --since=120s | grep "hash.*changed" | head -n 5
2026-01-11T15:13:53.126Z INFO Controllers.OpenStackDataPlaneDeployment The hash of the job changed while the job was running, waiting for the previous job to finish before re-run.{"controller": "openstackdataplanedeployment", "controllerGroup": "dataplane.openstack.org", "controllerKind": "OpenStackDataPlaneDeployment", "OpenStackDataPlaneDeployment": {"name":"compute-pre-adoption","namespace":"openstack"}, "namespace": "openstack", "name": "compute-pre-adoption", "reconcileID": "8635ced3-2ad9-4ed7-af58-84aad35cb5c7"}
2026-01-11T15:13:53.126Z INFO Controllers.OpenStackDataPlaneDeployment The hash of the job changed while the job was running, waiting for the previous job to finish before re-run.{"controller": "openstackdataplanedeployment", "controllerGroup": "dataplane.openstack.org", "controllerKind": "OpenStackDataPlaneDeployment", "OpenStackDataPlaneDeployment": {"name":"compute-pre-adoption","namespace":"openstack"}, "namespace": "openstack", "name": "compute-pre-adoption", "reconcileID": "8635ced3-2ad9-4ed7-af58-84aad35cb5c7"}
2026-01-11T15:13:53.126Z INFO Controllers.OpenStackDataPlaneDeployment The hash of the job changed while the job was running, waiting for the previous job to finish before re-run.{"controller": "openstackdataplanedeployment", "controllerGroup": "dataplane.openstack.org", "controllerKind": "OpenStackDataPlaneDeployment", "OpenStackDataPlaneDeployment": {"name":"compute-pre-adoption","namespace":"openstack"}, "namespace": "openstack", "name": "compute-pre-adoption", "reconcileID": "8635ced3-2ad9-4ed7-af58-84aad35cb5c7"}