Bug
Resolution: Not a Bug
Major
contract-priority
Description of problem:
When redeploying two SNO clusters in parallel using ZTP + GitOps with the ClusterInstance workflow, the ClusterInstance resource for one cluster (cluster-A) is successfully created on the hub cluster but does not appear in the Argo CD Application status.operationState.syncResult[].resources[] for an extended period.
Instead, syncResult[] continues to report a previously pruned ClusterInstance (cluster-B) from an earlier commit. This causes deployment automation—which relies on syncResult[]—to incorrectly mark the Argo CD sync as failed, even though the ClusterInstance was created successfully (verified via must-gather, audit logs, and resource creation timestamps).
This behavior was not observed when using the SiteConfig workflow and started only after migrating to ClusterInstance.
Version-Release number of selected component (if applicable):
OpenShift: Single Node OpenShift (SNO)
Red Hat Advanced Cluster Management (RHACM) with ZTP
OpenShift GitOps / Argo CD
Workflow: ClusterInstance (post SiteConfig deprecation)
How reproducible:
Repeatedly reproducible when two SNO clusters are detached and re-added in parallel using the same Argo CD application.
Steps to Reproduce:
Detach spoke cluster cluster-A from the hub (ClusterInstance manifest removed from Git).
Detach spoke cluster cluster-B from the hub (ClusterInstance manifest removed from Git).
Add spoke cluster cluster-A back by committing its ClusterInstance manifest to Git (commit commit-A).
Deployment automation waits for Argo CD sync completion by checking status.operationState.syncResult[].resources[] for an entry matching cluster-A (see the sketch after these steps).
syncResult[] repeatedly reports only a pruned ClusterInstance for cluster-B, not cluster-A, for an extended period.
Automation times out and marks deployment as failed.
Add spoke cluster cluster-B back later via a new commit (commit-B), after which syncResult[] finally updates.
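For reference, the wait logic described in the steps above amounts to something like the following Python sketch. The Application name ("clusters"), its namespace ("openshift-gitops"), and the timeout values are placeholders, not the real automation's settings.

import time
from kubernetes import client, config

def wait_for_cluster_instance(app_name, app_namespace, cluster_name,
                              timeout=1800, interval=30):
    """Poll the Argo CD Application until its last sync reports the ClusterInstance."""
    config.load_kube_config()
    api = client.CustomObjectsApi()
    deadline = time.time() + timeout
    while time.time() < deadline:
        app = api.get_namespaced_custom_object(
            group="argoproj.io", version="v1alpha1",
            namespace=app_namespace, plural="applications", name=app_name)
        # operationState.syncResult holds a "resources" list describing everything
        # the most recent sync operation acted on.
        resources = (app.get("status", {})
                        .get("operationState", {})
                        .get("syncResult", {})
                        .get("resources", []))
        if any(r.get("kind") == "ClusterInstance" and r.get("name") == cluster_name
               for r in resources):
            return True
        time.sleep(interval)
    # In the failure described in this report, the loop ends up here because
    # resources[] keeps showing only the pruned cluster-B entry.
    return False

# Example invocation (placeholder names):
# wait_for_cluster_instance("clusters", "openshift-gitops", "cluster-A")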
Actual results:
ClusterInstance cluster-A is created successfully; its creation timestamp matches the Argo CD sync time for commit-A (see the verification sketch after this list).
status.operationState.syncResult[].resources[] remains stale and continues to show a pruned ClusterInstance for cluster-B.
ClusterInstance cluster-A does not appear in syncResult[] until a later sync occurs.
Deployment automation fails due to missing resource in syncResult[].
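The successful-creation claim above was checked roughly as follows. This is a sketch only; it assumes the ClusterInstance CRD is served as siteconfig.open-cluster-management.io/v1alpha1 with plural "clusterinstances", and the Application/namespace names are placeholders.

from kubernetes import client, config

def creation_within_last_sync(app_name, app_namespace, cluster_name, cluster_namespace):
    """Check that the ClusterInstance was created during the last Argo CD operation."""
    config.load_kube_config()
    api = client.CustomObjectsApi()
    app = api.get_namespaced_custom_object(
        group="argoproj.io", version="v1alpha1",
        namespace=app_namespace, plural="applications", name=app_name)
    op = app["status"]["operationState"]
    started = op["startedAt"]          # RFC 3339, e.g. 2024-05-01T10:00:00Z
    finished = op.get("finishedAt")    # absent while the operation is still running
    ci = api.get_namespaced_custom_object(
        group="siteconfig.open-cluster-management.io",  # assumed CRD group/version
        version="v1alpha1",
        namespace=cluster_namespace, plural="clusterinstances", name=cluster_name)
    created = ci["metadata"]["creationTimestamp"]
    # Identical RFC 3339 formatting (UTC "Z" suffix) lets the strings compare correctly.
    return started <= created and (finished is None or created <= finished)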
Expected results:
After syncing commit-A, status.operationState.syncResult[].resources[] should reflect the resources created or updated by that commit, including ClusterInstance cluster-A (an illustrative entry is sketched after this list).
Previously pruned resources from earlier commits should not persist in syncResult[] for subsequent syncs.
Argo CD application status should reliably represent the outcome of the most recent sync operation.
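For clarity, the kind of syncResult[].resources[] entry the automation expects to see for cluster-A after commit-A is sketched below. Field values are illustrative; the group/version reflect the same assumption as in the earlier sketch, and the namespace follows the common ZTP convention of a namespace named after the cluster.

# Illustrative expected entry only, not observed output.
expected_entry = {
    "group": "siteconfig.open-cluster-management.io",
    "version": "v1alpha1",
    "kind": "ClusterInstance",
    "namespace": "cluster-A",
    "name": "cluster-A",
    "status": "Synced",
    "syncPhase": "Sync",
}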
Additional info:
Audit logs, must-gather data, and cluster resource timestamps confirm that ClusterInstance cluster-A was created successfully during the Argo CD sync for commit-A.
The issue occurs only after migrating from the SiteConfig workflow to ClusterInstance.
Argo CD autosync is enabled; deployment automation does not modify Argo AppProjects during redeployments.
The Argo AppProject does not whitelist InfraEnv or BareMetalHost resources.
Suspected contributing factors include:
status.operationState.syncResult[] retaining stale data after a prune that temporarily leaves the application managing zero resources.
Race conditions when resources are pruned and immediately recreated in subsequent commits.
Impact: frequent deployment automation failures during parallel SNO redeployments.