Migration Toolkit for Virtualization / MTV-2020

Add Performance Test to Replicate forklift-controller OOM During High-Volume Migration


    • Type: Task
    • Resolution: Unresolved
    • Priority: Minor
    • Scale&Perf-QE

      Intro

      We have observed the forklift-controller running out of memory (OOM) and ultimately entering CrashLoopBackOff when migrating a large number of VMs (e.g., 100+ VMs across 20+ plans) from VMware. We need a new performance test that systematically replicates these conditions so that we can measure resource usage, identify bottlenecks, and verify forklift-controller stability during high-volume migrations.
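As a starting point for the test workload, the "100+ VMs across 20+ plans" shape could be generated along these lines. This is only a minimal sketch: the VM naming scheme (`perf-vm-NNN`), the helper name `split_into_plans`, and the round-robin distribution are all assumptions made for illustration, not part of the existing test tooling.

```python
def split_into_plans(vm_names, plan_count):
    """Distribute VM names round-robin across plan_count buckets."""
    if plan_count < 1:
        raise ValueError("plan_count must be at least 1")
    plans = [[] for _ in range(plan_count)]
    for index, name in enumerate(vm_names):
        plans[index % plan_count].append(name)
    return plans


# Hypothetical VM names; a real test would take these from the VMware inventory.
vm_names = [f"perf-vm-{i:03d}" for i in range(100)]
plans = split_into_plans(vm_names, 20)

# Number of plans and total number of VMs assigned.
print(len(plans), sum(len(p) for p in plans))  # prints: 20 100
```

Each bucket would then become the VM list of one migration plan, which makes the number of plans and the VMs per plan easy to vary between test runs.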

      Background

      Environment

      • OCP+Virt clusters running as guest VMs on top of OCP+Virt deployed on IBM bare metal.
      • Mostly external Ceph storage (external ODF) in use.

      Observed Issues

      • forklift-controller runs out of memory (OOM) with 40–100 concurrent VM migrations.
      • forklift-volume-populator-controller reports PVC references that “no longer exist.”
      • Alerts for ODF external Ceph performance degradation, network saturation caused by retransmissions, and the RBD plugin provisioner struggling to keep up with the provisioning load.
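To let the test detect the controller failure automatically, one option is to inspect the pod status returned by `kubectl get pod <pod-name> -n <namespace> -o json`. The sketch below relies only on standard Kubernetes pod status fields (`containerStatuses`, `restartCount`, `lastState.terminated.reason`, `state.waiting.reason`). The sample document and the container name `controller` are made up for illustration, and the real pod name and namespace depend on the installation.

```python
import json


def summarize_pod(pod):
    """Return restart count and OOM / crash-loop flags for each container."""
    summary = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        last = cs.get("lastState", {}).get("terminated") or {}
        waiting = cs.get("state", {}).get("waiting") or {}
        summary.append({
            "container": cs.get("name"),
            "restarts": cs.get("restartCount", 0),
            "oom_killed": last.get("reason") == "OOMKilled",
            "crash_loop": waiting.get("reason") == "CrashLoopBackOff",
        })
    return summary


# Illustrative input only; in a test this would be the kubectl JSON output.
sample = json.loads("""
{
  "status": {
    "containerStatuses": [
      {
        "name": "controller",
        "restartCount": 7,
        "state": {"waiting": {"reason": "CrashLoopBackOff"}},
        "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
      }
    ]
  }
}
""")

summary = summarize_pod(sample)
for entry in summary:
    print(entry)
```

Polling this during a run and recording the time of the first `oom_killed` result would tie the failure to the number of migrations in flight at that moment.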

              dvaanunu@redhat.com David Vaanunu
              nrozen@redhat.com Nir Rozen