  1. Migration Toolkit for Virtualization
  2. MTV-1530

forklift controller crashing with error "fatal error: concurrent map read and map write"

    • Bug
    • Resolution: Done
    • Critical
    • 2.7.0
    • 2.6.7
    • Controller
    • None
    • False
    • None
    • False
    • Important

      Description of problem:

      In a customer's environment, the main container of the forklift controller is crashing with the error below:

      2024-09-24T19:53:48.771988483Z fatal error: concurrent map read and map write
      2024-09-24T19:53:48.775136464Z
      2024-09-24T19:53:48.775136464Z goroutine 207 [running]:
      2024-09-24T19:53:48.775143344Z github.com/konveyor/forklift-controller/pkg/monitoring/metrics/forklift-controller.processMigration({{{0x28229dc, 0x9}, {0xc00618b540, 0x1c}}, {{0xc000f9af30, 0x26}, {0xc000f9af00, 0x21}, {0xc000acb800, 0x7}, ...}, ...}, ...)

       

      I can see two goroutines started by RecordMigrationMetrics:

      2024-09-24T19:53:48.775136464Z goroutine 207 [running]:
      2024-09-24T19:53:48.775143344Z github.com/konveyor/forklift-controller/pkg/monitoring/metrics/forklift-controller.processMigration({{{0x28229dc, 0x9}, {0xc00618b540, 0x1c}}, {{0xc000f9af30, 0x26}, {0xc000f9af00, 0x21}, {0xc000acb800, 0x7}, ...}, ...}, ...)
      2024-09-24T19:53:48.775154761Z  /remote-source/app/pkg/monitoring/metrics/forklift-controller/migration_metrics.go:97 +0x1b5
      2024-09-24T19:53:48.775158743Z github.com/konveyor/forklift-controller/pkg/monitoring/metrics/forklift-controller.RecordMigrationMetrics.func1()
      2024-09-24T19:53:48.775158743Z  /remote-source/app/pkg/monitoring/metrics/forklift-controller/migration_metrics.go:70 +0x3ac
      2024-09-24T19:53:48.775163255Z created by github.com/konveyor/forklift-controller/pkg/monitoring/metrics/forklift-controller.RecordMigrationMetrics in goroutine 1
      2024-09-24T19:53:48.775163255Z  /remote-source/app/pkg/monitoring/metrics/forklift-controller/migration_metrics.go:21 +0x65
      2024-09-24T19:53:48.775171150Z

       

      2024-09-24T19:53:48.775263359Z goroutine 206 [runnable]:
      2024-09-24T19:53:48.775267266Z k8s.io/apimachinery/pkg/apis/meta/v1.(*Time).DeepCopy(...)
      2024-09-24T19:53:48.775270634Z  /remote-source/app/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/zz_generated.deepcopy.go:1099
      2024-09-24T19:53:48.775270634Z github.com/konveyor/forklift-controller/pkg/apis/forklift/v1beta1/plan.(*Timed).DeepCopyInto(0xc00068c620?, 0xc0033fca10)
      2024-09-24T19:53:48.775274176Z  /remote-source/app/pkg/apis/forklift/v1beta1/plan/zz_generated.deepcopy.go:269 +0xed
      2024-09-24T19:53:48.775279094Z github.com/konveyor/forklift-controller/pkg/apis/forklift/v1beta1/plan.(*Task).DeepCopyInto(0xc00068c620, 0xc0033fca10)
      2024-09-24T19:53:48.775279094Z  /remote-source/app/pkg/apis/forklift/v1beta1/plan/zz_generated.deepcopy.go:234 +0x78
      2024-09-24T19:53:48.775282711Z github.com/konveyor/forklift-controller/pkg/apis/forklift/v1beta1/plan.(*Step).DeepCopyInto(0xc000fdae10, 0xc0034045a0)
      2024-09-24T19:53:48.775282711Z  /remote-source/app/pkg/apis/forklift/v1beta1/plan/zz_generated.deepcopy.go:215 +0x179
      2024-09-24T19:53:48.775286409Z github.com/konveyor/forklift-controller/pkg/apis/forklift/v1beta1/plan.(*VMStatus).DeepCopyInto(0xc001002820, 0xc0033f36c0)
      2024-09-24T19:53:48.775290009Z  /remote-source/app/pkg/apis/forklift/v1beta1/plan/zz_generated.deepcopy.go:317 +0x4f6
      2024-09-24T19:53:48.775293532Z github.com/konveyor/forklift-controller/pkg/apis/forklift/v1beta1/plan.(*MigrationStatus).DeepCopyInto(0xc007900b58, 0xc00340a758)
      2024-09-24T19:53:48.775293532Z  /remote-source/app/pkg/apis/forklift/v1beta1/plan/zz_generated.deepcopy.go:96 +0x245
      2024-09-24T19:53:48.775297183Z github.com/konveyor/forklift-controller/pkg/apis/forklift/v1beta1.(*PlanStatus).DeepCopyInto(0xc007900b18, 0xc00340a718)
      2024-09-24T19:53:48.775300715Z  /remote-source/app/pkg/apis/forklift/v1beta1/zz_generated.deepcopy.go:752 +0x73
      2024-09-24T19:53:48.775300715Z github.com/konveyor/forklift-controller/pkg/apis/forklift/v1beta1.(*Plan).DeepCopyInto(0xc007900800, 0xc00340a400)
      2024-09-24T19:53:48.775304322Z  /remote-source/app/pkg/apis/forklift/v1beta1/zz_generated.deepcopy.go:665 +0xf4
      2024-09-24T19:53:48.775307997Z github.com/konveyor/forklift-controller/pkg/apis/forklift/v1beta1.(*Plan).DeepCopy(...)
      2024-09-24T19:53:48.775307997Z  /remote-source/app/pkg/apis/forklift/v1beta1/zz_generated.deepcopy.go:675
      2024-09-24T19:53:48.775311662Z github.com/konveyor/forklift-controller/pkg/apis/forklift/v1beta1.(*Plan).DeepCopyObject(0xc007900800)
      2024-09-24T19:53:48.775311662Z  /remote-source/app/pkg/apis/forklift/v1beta1/zz_generated.deepcopy.go:681 +0x3a
      2024-09-24T19:53:48.775315415Z sigs.k8s.io/controller-runtime/pkg/cache/internal.(*CacheReader).Get(0xc000a28be0, {0x508c360?, 0x321da99?}, {{0xc000d17978?, 0x31f93ae?}, {0xc000bb1cc0?, 0x280549a?}}, {0x36c6d60, 0xc00340a000}, {0x0, ...})
      2024-09-24T19:53:48.775328659Z  /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/cache/internal/cache_reader.go:88 +0x135
      2024-09-24T19:53:48.775332077Z sigs.k8s.io/controller-runtime/pkg/cache.(*informerCache).Get(0xc000918288, {0x36a4ad8, 0x508c360}, {{0xc000d17978?, 0x5087e48?}, {0xc000bb1cc0?, 0x5?}}, {0x36c6d60?, 0xc00340a000?}, {0x0, ...})
      2024-09-24T19:53:48.775345061Z  /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/cache/informer_cache.go:88 +0x1f7
      2024-09-24T19:53:48.775359610Z sigs.k8s.io/controller-runtime/pkg/client.(*client).Get(0xc0008226c0, {0x36a4ad8, 0x508c360}, {{0xc000d17978?, 0x17?}, {0xc000bb1cc0?, 0x7?}}, {0x36c6d60?, 0xc00340a000?}, {0x0, ...})
      2024-09-24T19:53:48.775368330Z  /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/client/client.go:348 +0x491
      2024-09-24T19:53:48.775371748Z github.com/konveyor/forklift-controller/pkg/monitoring/metrics/forklift-controller.RecordMigrationMetrics.func1()
      2024-09-24T19:53:48.775371748Z  /remote-source/app/pkg/monitoring/metrics/forklift-controller/migration_metrics.go:37 +0x1e3
      2024-09-24T19:53:48.775375332Z created by github.com/konveyor/forklift-controller/pkg/monitoring/metrics/forklift-controller.RecordMigrationMetrics in goroutine 1
      2024-09-24T19:53:48.775375332Z  /remote-source/app/pkg/monitoring/metrics/forklift-controller/migration_metrics.go:21 +0x65
      2024-09-24T19:53:48.775379323Z

       

      I can see that both the migration controller and the plan controller call RecordMigrationMetrics, which in turn starts new goroutines that run continuously in parallel without synchronization. If I understand this correctly, a race condition occurs when one goroutine reads from a map while another writes to it, which triggers the "concurrent map read and map write" fatal error.
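
      To illustrate the suspected failure mode, here is a minimal Go sketch of a map shared between two goroutines and guarded by a sync.RWMutex. It is not taken from the forklift code and does not claim to identify which map is affected: the metricsStore type, its counts map, and the Inc and Get methods are hypothetical names used only for this example. Without the lock, the same access pattern can abort the process with the fatal error shown above.

```go
package main

import (
	"fmt"
	"sync"
)

// metricsStore is a hypothetical stand-in for state shared between
// goroutines; it is NOT the forklift implementation.
type metricsStore struct {
	mu     sync.RWMutex
	counts map[string]int
}

func newMetricsStore() *metricsStore {
	return &metricsStore{counts: make(map[string]int)}
}

// Inc takes the write lock, so no other goroutine can read or write
// the map while it is being modified.
func (s *metricsStore) Inc(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.counts[key]++
}

// Get takes the read lock; concurrent readers are allowed, but they
// are excluded while a writer holds the lock.
func (s *metricsStore) Get(key string) int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.counts[key]
}

func main() {
	store := newMetricsStore()
	var wg sync.WaitGroup

	// Two goroutines read and write the same map concurrently,
	// mirroring the two goroutines seen in the stack trace.
	for g := 0; g < 2; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				store.Inc("succeeded")
				_ = store.Get("succeeded")
			}
		}()
	}

	wg.Wait()
	fmt.Println(store.Get("succeeded")) // prints 2000
}
```

      Running a program like this with "go run -race" is one way to confirm that the locked version reports no data race. Starting the metrics goroutine only once, so that there is a single writer, would be another possible direction; which one fits the controller is for the maintainers to decide.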

      Version-Release number of selected component (if applicable):

      mtv-operator.v2.6.7    

      How reproducible:

      Observed in a customer's environment

      Steps to Reproduce:

          1.
          2.
          3.
          

      Actual results:

      The forklift controller crashes with the error "fatal error: concurrent map read and map write"

      Expected results:

          

      Additional info:

          

              Unassigned Unassigned
              rhn-support-nashok Nijin Ashok