  1. OpenStack as Infra
  2. OSASINFRA-4028

Address openstack-vh-mecha-central-quota-slice resource starvation in CI


    • Type: Spike
    • Resolution: Unresolved
    • Priority: Major
    • CSI
    • 5

      Background

      OpenStack-related CI jobs in openshift/csi-operator (and potentially other repositories) are failing because the openstack-vh-mecha-central-quota-slice Boskos resource pool is exhausted.

      The pool is currently limited to 3 concurrent leases, and multiple jobs compete for them at the same time. A job that cannot acquire a lease within the ~2.5-hour timeout fails with:

      failed to acquire lease for "openstack-vh-mecha-central-quota-slice": resources not found
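
      To make the failure mode concrete, the following is a minimal simulation sketch, not Boskos code. It assumes first-come, first-served leasing, a constant hold time per job, and hypothetical arrival times; only the pool size of 3 and the ~2.5 hour timeout come from this ticket.

```python
import heapq


def simulate_leases(arrivals, hold_time, pool_size=3, timeout=2.5):
    """Return one (arrival, wait, acquired) tuple per job.

    arrivals: job request times in hours, sorted ascending (hypothetical data).
    hold_time: hours each job holds its lease (assumed constant).
    Leases are handed out first-come, first-served, which is a simplification.
    For a job that gives up, `wait` is the wait it would have needed.
    """
    # Min-heap of the times at which each lease becomes free again.
    free_at = [0.0] * pool_size
    heapq.heapify(free_at)
    results = []
    for arrival in arrivals:
        earliest = heapq.heappop(free_at)
        wait = max(0.0, earliest - arrival)
        if wait > timeout:
            # The job times out; the lease is still free from `earliest`.
            heapq.heappush(free_at, earliest)
            results.append((arrival, wait, False))
        else:
            # The job holds the lease from its start until start + hold_time.
            heapq.heappush(free_at, arrival + wait + hold_time)
            results.append((arrival, wait, True))
    return results
```

      Feeding the function a burst of requests that arrive together shows how quickly a pool of 3 pushes later jobs past the timeout once hold times approach the timeout itself.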
      

      Scope

      Investigate and recommend solutions to reduce or eliminate OpenStack CI job failures caused by lease contention.

      Investigation Areas

      1. Resource pool sizing

      • Determine if OpenStack infrastructure can support additional concurrent clusters
      • Evaluate impact of increasing pool size in generate-boskos.py
      • Identify cost/capacity constraints
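
      As a starting point for the sizing question, a rough estimate can be derived from Little's law (average concurrency = arrival rate × average hold time). The sketch below assumes hypothetical inputs; the function name, the headroom factor, and any numbers passed in are illustrative and are not taken from generate-boskos.py. Because the law describes averages, bursts of presubmits may need more than the result.

```python
import math


def required_pool_size(jobs_per_hour, avg_hold_hours, headroom=1.0):
    """Estimate the number of concurrent leases needed.

    Uses Little's law: average leases in use = arrival rate * hold time.
    headroom is a hypothetical multiplier (>= 1.0) to absorb bursts.
    """
    if jobs_per_hour < 0 or avg_hold_hours < 0:
        raise ValueError("rates and durations must be non-negative")
    if headroom < 1.0:
        raise ValueError("headroom must be at least 1.0")
    return math.ceil(jobs_per_hour * avg_hold_hours * headroom)
```

      Comparing the estimate against the current limit of 3 gives a first indication of whether a larger pool alone could remove the contention.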

      2. Test configuration optimization

      • Audit all repositories using openstack-vh-mecha-central cluster profile
      • Identify tests that could use run_if_changed filters to reduce trigger frequency
      • Evaluate which tests could be converted from presubmit to periodic
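
      The audit could begin with a simple text scan of a local checkout of the CI configuration. The sketch below assumes that tests declare their profile on a line of the form "cluster_profile: openstack-vh-mecha-central" with no quoting or trailing comment; the function name and the directory passed to it are hypothetical.

```python
import os
import re

PROFILE = "openstack-vh-mecha-central"
# Matches a whole line such as "    cluster_profile: openstack-vh-mecha-central".
# Quoted values or trailing comments are not handled (assumption).
PATTERN = re.compile(
    r"^[ \t]*cluster_profile:[ \t]*" + re.escape(PROFILE) + r"[ \t]*$",
    re.MULTILINE,
)


def find_profile_users(root):
    """Map each YAML file under root to its number of PROFILE references."""
    hits = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith((".yaml", ".yml")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as handle:
                count = len(PATTERN.findall(handle.read()))
            if count:
                hits[path] = count
    return hits
```

      Sorting the result by count highlights the configuration files that contribute the most lease requests and are therefore the first candidates for run_if_changed filters or conversion to periodics.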

      3. Job duration analysis

      • Profile OpenStack test execution times
      • Identify opportunities to reduce test duration and lease hold time
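
      Once job durations have been collected, a small summary helps separate typical runs from the long tail that holds leases longest. The sketch below assumes the durations are already available as a list of minutes; how they are gathered (for example from job history) is left open, and the function name is hypothetical.

```python
import statistics


def summarize_durations(minutes):
    """Return the median, 90th percentile, and maximum duration in minutes."""
    if len(minutes) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=10) returns nine cut points; the last is the 90th percentile.
    deciles = statistics.quantiles(minutes, n=10, method="inclusive")
    return {
        "median": statistics.median(minutes),
        "p90": deciles[-1],
        "max": max(minutes),
    }
```

      A large gap between the median and the 90th percentile would suggest that a few slow tests, rather than the typical run, dominate lease hold time.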

      4. Alternative approaches

      • Evaluate job prioritization/queuing mechanisms (Boskos is problematic in this respect)
      • Consider test consolidation to reduce total lease requirements
      • Investigate lease timeout configuration options

      References

      • Failed job example: PR #359 build log
      • Boskos config: core-services/prow/02_config/generate-boskos.py
      • CI config: ci-operator/config/openshift/csi-operator/openshift-csi-operator-main.yaml

              Assignee: Unassigned
              Reporter: Ella Shulman (eshulman)