Red Hat Advanced Cluster Management / ACM-27917

Argo CD status.operationState.syncResult.resources[] reports stale resources after prune, making ClusterInstance creation invisible during parallel SNO redeployments.


    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Major
    • Component: SiteConfig Operator
    • Labels: contract-priority

      Description of problem:

      When redeploying two SNO clusters in parallel using ZTP + GitOps with the ClusterInstance workflow, the ClusterInstance resource for one cluster (cluster-A) is successfully created on the hub but does not appear in the Argo CD Application status.operationState.syncResult.resources[] for an extended period.

      Instead, syncResult.resources[] continues to report a previously pruned ClusterInstance (cluster-B) from an earlier commit. Deployment automation that relies on this field therefore incorrectly marks the Argo CD sync as failed, even though the ClusterInstance was created successfully (verified via must-gather, audit logs, and resource creation timestamps).
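
      For context, a minimal sketch of the kind of wait loop such automation performs, polling the Application object via kubectl. The names (Application "clusters" in namespace "openshift-gitops", ClusterInstance "cluster-a") are hypothetical placeholders, not taken from this report:

        import json
        import subprocess
        import time

        # Hypothetical names for illustration only; substitute the real
        # Application and the ClusterInstance being waited on.
        APP_NAME = "clusters"
        APP_NAMESPACE = "openshift-gitops"
        TARGET_KIND = "ClusterInstance"
        TARGET_NAME = "cluster-a"

        def synced_resources():
            """Return the resource list from the last sync operation's result."""
            out = subprocess.check_output([
                "kubectl", "get", "application", APP_NAME,
                "-n", APP_NAMESPACE, "-o", "json",
            ])
            app = json.loads(out)
            return (app.get("status", {})
                       .get("operationState", {})
                       .get("syncResult", {})
                       .get("resources", []))

        deadline = time.time() + 600  # illustrative ten-minute timeout
        while time.time() < deadline:
            if any(r.get("kind") == TARGET_KIND and r.get("name") == TARGET_NAME
                   for r in synced_resources()):
                print(f"{TARGET_KIND}/{TARGET_NAME} reported in syncResult.resources[]")
                break
            time.sleep(10)
        else:
            # With the stale syncResult described in this report, the loop
            # times out here even though the resource exists on the hub.
            raise SystemExit("timed out waiting for resource in syncResult")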

      This behavior was not observed when using the SiteConfig workflow and started only after migrating to ClusterInstance.

      Version-Release number of selected component (if applicable):

      OpenShift: Single Node OpenShift (SNO)

      Red Hat Advanced Cluster Management (RHACM) with ZTP

      OpenShift GitOps / Argo CD

      Workflow: ClusterInstance (following the deprecation of SiteConfig)

      How reproducible:

      Consistently reproducible when two SNO clusters are detached and re-added in parallel using the same Argo CD Application.

      Steps to Reproduce:

      1. Detach spoke cluster cluster-A from the hub (ClusterInstance manifest removed from Git).
      2. Detach spoke cluster cluster-B from the hub (ClusterInstance manifest removed from Git).
      3. Add spoke cluster cluster-A back by committing its ClusterInstance manifest to Git (commit commit-A).
      4. Deployment automation waits for Argo CD sync completion by checking status.operationState.syncResult.resources[] for cluster-A.
      5. syncResult.resources[] repeatedly reports only a pruned ClusterInstance for cluster-B, not cluster-A, for an extended period (the inspection sketch after these steps shows how to observe this).
      6. Automation times out and marks the deployment as failed.
      7. Add spoke cluster cluster-B back later via a new commit (commit-B), after which syncResult.resources[] finally updates.
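
      To observe the stale list at step 5 directly, one can dump the last operation's result; a hedged sketch using the same hypothetical names as above (each entry in syncResult.resources[] carries kind, name, status, and a message):

        import json
        import subprocess

        # Same hypothetical Application name/namespace as in the earlier sketch.
        APP_NAME = "clusters"
        APP_NAMESPACE = "openshift-gitops"

        out = subprocess.check_output([
            "kubectl", "get", "application", APP_NAME,
            "-n", APP_NAMESPACE, "-o", "json",
        ])
        state = json.loads(out).get("status", {}).get("operationState", {})
        result = state.get("syncResult", {})

        # The revision the last operation synced to, and what it reported touching.
        print("synced revision:", result.get("revision"))
        for r in result.get("resources", []):
            # In the failure mode described here, this prints only the pruned
            # cluster-B ClusterInstance, with no entry for cluster-A.
            print(f'{r.get("kind")}/{r.get("name")}: '
                  f'{r.get("status")} ({r.get("message")})')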

      Actual results:

      ClusterInstance cluster-A is created successfully (creation timestamp matches Argo CD sync time for commit-A).

      status.operationState.syncResult.resources[] remains stale and continues to show a pruned ClusterInstance for cluster-B.

      ClusterInstance cluster-A does not appear in syncResult.resources[] until a later sync occurs.

      Deployment automation fails because the expected resource is missing from syncResult.resources[].

      Expected results:

      After syncing commit-A, status.operationState.syncResult.resources[] should reflect the resources created or updated by that commit, including ClusterInstance cluster-A.

      Previously pruned resources from earlier commits should not persist in syncResult.resources[] across subsequent syncs.

      Argo CD application status should reliably represent the outcome of the most recent sync operation.

      Additional info:

      Audit logs, must-gather data, and cluster resource timestamps confirm that ClusterInstance cluster-A was created successfully during the Argo CD sync for commit-A.

      The issue occurs only after migrating from the SiteConfig workflow to ClusterInstance.

      Argo CD autosync is enabled; deployment automation does not modify Argo AppProjects during redeployments.

      The Argo AppProject does not whitelist InfraEnv or BareMetalHost resources.

      Suspected contributing factors include:

      • status.operationState.syncResult retaining stale data after a prune that temporarily leaves the application managing zero resources.

      • Race conditions when resources are pruned and immediately recreated in subsequent commits.

      Impact: frequent deployment automation failures during parallel SNO redeployments.
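
      A possible mitigation for the automation (an assumption, not a confirmed fix): key the wait on fields that are recomputed on every reconciliation, status.sync.status and status.sync.revision, combined with a direct existence check for the ClusterInstance, instead of on the last operation's syncResult. A minimal sketch, reusing the hypothetical names from the earlier sketches:

        import json
        import subprocess
        import time

        # Hypothetical names, as in the earlier sketches; the commit sha is a
        # placeholder, since the real value is not in this report.
        APP_NAME = "clusters"
        APP_NAMESPACE = "openshift-gitops"
        CI_NAME = "cluster-a"
        CI_NAMESPACE = "cluster-a"
        EXPECTED_REVISION = "<commit-A sha>"

        def app_sync_status():
            out = subprocess.check_output([
                "kubectl", "get", "application", APP_NAME,
                "-n", APP_NAMESPACE, "-o", "json",
            ])
            return json.loads(out).get("status", {}).get("sync", {})

        def clusterinstance_exists():
            # Direct existence check on the hub; independent of Argo CD's
            # operation state, so unaffected by a stale syncResult.
            return subprocess.call(
                ["kubectl", "get", "clusterinstance", CI_NAME, "-n", CI_NAMESPACE],
                stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
            ) == 0

        deadline = time.time() + 600
        while time.time() < deadline:
            sync = app_sync_status()
            if (sync.get("status") == "Synced"
                    and sync.get("revision") == EXPECTED_REVISION
                    and clusterinstance_exists()):
                print("commit-A synced and ClusterInstance present")
                break
            time.sleep(10)
        else:
            raise SystemExit("timed out waiting for sync of commit-A")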

              sakhoury@redhat.com Sharat Akhoury
              rhn-support-mlele Mihir Lele