Red Hat Advanced Cluster Management / ACM-29186

Migration hangs at Registering/Cleaning stages after manager restart due to uninitialized migration status

    • Quality / Stability / Reliability
    • 1
    • False
    • False
    • GH Train-36
    • Moderate
    • None

      Problem

      After the manager or hub is restarted, migration operations hang at the Registering or Cleaning stage. The logs show the warning: "MigrationStatus is nil for migrationId: xxx".

      Root Cause

      migrationStatuses is an in-memory map, so its contents are lost when the manager restarts.

      After the restart, Kafka replays migration status events to the handler.

      The handler calls SetFinished(), but the MigrationStatus entry for that migration is nil.

      SetFinished() fails silently, so GetFinished() always returns false.

      The controller keeps waiting for the stage to finish until it times out, then rolls back, and the migration fails.
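
      The failure sequence above can be reproduced with a minimal Go sketch. The type layout and the function signatures here are assumptions made for illustration; only the names migrationStatuses, SetFinished(), GetFinished() and the warning text come from this issue, and the real code in multicluster-global-hub may differ.

```go
package main

import "fmt"

// MigrationStatus is a hypothetical, simplified per-migration record.
type MigrationStatus struct {
	finished map[string]bool // keyed by stage name, e.g. "Registering"
}

// migrationStatuses lives only in process memory, so it is empty after a restart.
var migrationStatuses = map[string]*MigrationStatus{}

// SetFinished marks a stage as finished. When no record exists it only logs
// a warning and returns, which is the silent failure described above.
func SetFinished(migrationID, stage string) {
	status := migrationStatuses[migrationID]
	if status == nil {
		fmt.Printf("MigrationStatus is nil for migrationId: %s\n", migrationID)
		return
	}
	status.finished[stage] = true
}

// GetFinished reports whether a stage finished; a missing record reads as false.
func GetFinished(migrationID, stage string) bool {
	status := migrationStatuses[migrationID]
	if status == nil {
		return false
	}
	return status.finished[stage]
}

func main() {
	// Simulate a replayed Kafka event arriving after a restart: the map is empty,
	// so the update is dropped and the stage never appears finished.
	SetFinished("migration-1", "Registering")
	fmt.Println(GetFinished("migration-1", "Registering")) // false
}
```

      In this sketch the update is dropped because nothing recreates the map entry after the restart, so a caller polling GetFinished() never sees the stage complete.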

      Solution

      Add lazy initialization of migration status in the status handler by calling AddMigrationStatus() before any status updates.
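
      A minimal Go sketch of the lazy-initialization idea follows. It is not the code from the PR: handleStatusEvent, the stage-keyed map, the mutex, and all signatures are hypothetical, and only the names AddMigrationStatus(), SetFinished() and GetFinished() are taken from this issue.

```go
package main

import (
	"fmt"
	"sync"
)

// MigrationStatus is a hypothetical, simplified per-migration record.
type MigrationStatus struct {
	finished map[string]bool // keyed by stage name
}

var (
	mu                sync.Mutex
	migrationStatuses = map[string]*MigrationStatus{} // in memory only
)

// AddMigrationStatus creates the record only if it is missing, so calling it
// on every event is idempotent and never resets existing progress.
func AddMigrationStatus(migrationID string) {
	mu.Lock()
	defer mu.Unlock()
	if migrationStatuses[migrationID] == nil {
		migrationStatuses[migrationID] = &MigrationStatus{finished: map[string]bool{}}
	}
}

// SetFinished marks a stage as finished if a record exists.
func SetFinished(migrationID, stage string) {
	mu.Lock()
	defer mu.Unlock()
	if status := migrationStatuses[migrationID]; status != nil {
		status.finished[stage] = true
	}
}

// GetFinished reports whether the given stage has been marked finished.
func GetFinished(migrationID, stage string) bool {
	mu.Lock()
	defer mu.Unlock()
	status := migrationStatuses[migrationID]
	return status != nil && status.finished[stage]
}

// handleStatusEvent is a hypothetical handler entry point for a replayed event.
func handleStatusEvent(migrationID, stage string) {
	AddMigrationStatus(migrationID) // lazy initialization before any update
	SetFinished(migrationID, stage)
}

func main() {
	// The map starts empty, as it would after a manager restart.
	handleStatusEvent("migration-1", "Registering")
	fmt.Println(GetFinished("migration-1", "Registering")) // true
}
```

      Making the initialization idempotent matters in this sketch because replayed events can arrive for a migration whose record already exists; creating the entry only when it is missing avoids wiping stages that were already marked finished.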

      Impact

      - Migration operations fail after any manager restart
      - Requires manual intervention to recover
      - Affects production stability

      Fix Details

      PR: https://github.com/stolostron/multicluster-global-hub/pull/2269
      Changes: 4 lines added, 1 file changed
      Files Modified: manager/pkg/status/handlers/clustermigartion/managedclustermigration_handler.go

              DangPeng Liu (daliu@redhat.com)
              Yaheng Liu