OpenShift Windows Containers / WINC-1536

Stabilize and Migrate WINC Tests to OpenShift Tests Extension (OTE)


    • To Do
    • Quality / Stability / Reliability
    • 100% To Do, 0% In Progress, 0% Done

      Background

      The WINC test suite (49 tests in openshift-tests-private/test/extended/winc/) needs to be migrated to the OpenShift Tests Extension (OTE) framework. However, OTE requires tests to maintain a >=99% pass rate before they can be included in conformance suites, and the suite does not yet meet that bar.

      Current State:

      • 49 Ginkgo-based integration tests
      • Tests do not currently meet the >=99% pass rate requirement
      • PR openshift/release#71726 has merged; it adds periodic CI coverage for more platforms
      • Already using compat_otp library (good foundation for OTE)

      OTE Requirements:
      Per the OTE Integration Guide, tests must pass at >=99% before inclusion in conformance suites.

      Related Work:

      Strategy

      This epic is divided into TWO SEQUENTIAL PHASES:

      Phase 1: Test Stabilization (MUST complete first)

      • Monitor new platform test results from PR #71726
      • Fix flaky tests and platform-specific issues
      • Achieve >=99% pass rate across all platforms
      • Validate sustained reliability for 2+ weeks

      Estimated Duration: 10-12 weeks

      Phase 2: OTE Migration (Blocked by Phase 1)

      • Create OTE binary (cmd/winc-tests/main.go)
      • Add required annotations ([OTP], [Jira:Windows_Containers])
      • Define test suites (parallel, serial, storage)
      • Register in origin's extension registry
      • Validate in CI with maintained >=99% pass rate

      Estimated Duration: 2-3 weeks

      Goals

      1. Achieve and maintain >=99% test pass rate on all platforms
      2. Migrate to OTE framework
      3. Enable component team ownership
      4. Support automatic CI integration
      5. Improve test execution efficiency

      Success Metrics

      • All tests at >=99% pass rate for 14+ consecutive days
      • Successful OTE migration with maintained reliability
      • Tests run in >=3 CI job variants (AWS, Azure, vSphere)
      • Zero regression in pass rate post-migration
      • Tests properly categorized in TestGrid

      Reference Documentation

      Phase 1: Test Stabilization

      Phase 1A: Monitor New Platform Test Results (Weeks 1-2)

      • Identify new platforms/variants from PR #71726
      • Create TestGrid bookmarks and Sippy queries
      • Collect baseline failure data over 2 weeks
      • Create failure matrix (test × platform)
      • Triage and categorize failures
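
      The failure matrix step above can be sketched in Go as follows. This is a minimal illustration, not the team's tooling: the Result type, the function names, and the sample test names are all hypothetical, and real input would come from TestGrid or Sippy data rather than a hard-coded slice.

```go
package main

import (
	"fmt"
	"sort"
)

// Result is one test execution on one platform.
// Hypothetical shape; real data would be exported from TestGrid or Sippy.
type Result struct {
	Test     string
	Platform string
	Passed   bool
}

// Cell accumulates run counts for one (test, platform) pair.
type Cell struct {
	Runs, Passes int
}

// PassRate returns the fraction of passing runs, or 0 when there are no runs.
func (c Cell) PassRate() float64 {
	if c.Runs == 0 {
		return 0
	}
	return float64(c.Passes) / float64(c.Runs)
}

// BuildMatrix groups results into matrix[test][platform].
func BuildMatrix(results []Result) map[string]map[string]Cell {
	m := make(map[string]map[string]Cell)
	for _, r := range results {
		if m[r.Test] == nil {
			m[r.Test] = make(map[string]Cell)
		}
		c := m[r.Test][r.Platform]
		c.Runs++
		if r.Passed {
			c.Passes++
		}
		m[r.Test][r.Platform] = c
	}
	return m
}

// BelowThreshold returns sorted "test/platform" keys whose pass rate
// is under the given threshold.
func BelowThreshold(m map[string]map[string]Cell, threshold float64) []string {
	var out []string
	for test, row := range m {
		for platform, c := range row {
			if c.PassRate() < threshold {
				out = append(out, test+"/"+platform)
			}
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Sample data with made-up test names.
	results := []Result{
		{"winc-example-a", "aws", true},
		{"winc-example-a", "aws", false},
		{"winc-example-a", "azure", true},
		{"winc-example-b", "vsphere", true},
	}
	m := BuildMatrix(results)
	for _, key := range BelowThreshold(m, 0.99) {
		fmt.Println("needs triage:", key)
	}
}
```

      Keeping one cell per (test, platform) pair makes it easy to tell a test that is flaky everywhere from one that fails on a single platform, which is the distinction the triage step needs.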

      Phase 1B: Fix Failures and Achieve 99% (Weeks 3-10)

      • Fix test bugs and quick wins (timing, race conditions, retries)
      • Fix platform-specific issues (AWS, Azure, vSphere, GCP, Nutanix, None)
      • Address product bugs (file Jiras, coordinate with dev team)
      • Monitor until the >=99% pass rate has been sustained for 2 weeks
      • Document baseline metrics
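
      One way to check the two-week sustained target is to count the trailing run of days that meet the threshold. The sketch below assumes the threshold is applied per day and that a day with no runs breaks the streak; the epic does not say whether the 99% is measured per day or over the whole window, so treat both choices, and the DayStats type, as assumptions.

```go
package main

import "fmt"

// DayStats holds aggregated run counts for one calendar day.
// Hypothetical type for illustration.
type DayStats struct {
	Runs, Passes int
}

// TrailingStreak counts how many of the most recent days in a row meet
// the threshold. days must be ordered oldest to newest. A day with no
// runs ends the streak, because it provides no evidence of reliability.
func TrailingStreak(days []DayStats, threshold float64) int {
	streak := 0
	for i := len(days) - 1; i >= 0; i-- {
		d := days[i]
		if d.Runs == 0 || float64(d.Passes)/float64(d.Runs) < threshold {
			break
		}
		streak++
	}
	return streak
}

func main() {
	// Made-up history: one bad day followed by three clean days.
	history := []DayStats{
		{Runs: 100, Passes: 90},
		{Runs: 100, Passes: 100},
		{Runs: 120, Passes: 120},
		{Runs: 80, Passes: 80},
	}
	streak := TrailingStreak(history, 0.99)
	fmt.Printf("streak=%d days, sustained for 14 days: %v\n", streak, streak >= 14)
}
```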

      Phase 2: OTE Migration

      Infrastructure Setup

      • Vendor github.com/openshift-eng/openshift-tests-extension
      • Create cmd/winc-tests/main.go CLI binary
      • Initialize extension and build test specs
      • Register OTE subcommands

      Test Compliance

      • Add [OTP] tracking tags to all 49 tests
      • Add [Jira:Windows_Containers] ownership tags
      • Add [Level0] tags to conformance tests
      • Verify test name stability
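
      The tagging requirements above lend themselves to a simple automated check. The sketch below only does substring matching on test names, using the two tags named in this epic; the MissingTags helper and the sample test names are hypothetical and are not part of OTE.

```go
package main

import (
	"fmt"
	"strings"
)

// requiredTags are the tags this epic says every WINC test must carry.
var requiredTags = []string{"[OTP]", "[Jira:Windows_Containers]"}

// MissingTags returns the required tags that do not appear in the test name.
func MissingTags(name string) []string {
	var missing []string
	for _, tag := range requiredTags {
		if !strings.Contains(name, tag) {
			missing = append(missing, tag)
		}
	}
	return missing
}

func main() {
	// Made-up test names for illustration.
	names := []string{
		"[OTP][Jira:Windows_Containers][Level0] hypothetical compliant test",
		"[OTP] hypothetical test without an ownership tag",
	}
	for _, n := range names {
		if missing := MissingTags(n); len(missing) > 0 {
			fmt.Printf("%q is missing %v\n", n, missing)
		}
	}
}
```

      A check like this could run over the full list of 49 test names before migration, so that missing tags are caught locally rather than in CI.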

      Suite Organization

      • Define windows/all suite (all 49 tests)
      • Define windows/conformance/parallel suite
      • Define windows/conformance/serial suite
      • Define windows/storage suite
      • Add platform restrictions
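
      To make the suite layout concrete, here is a sketch of tag-based suite membership using the four suite names listed above. The selection rules are assumptions for illustration only: that [Level0] marks conformance tests (as stated under Test Compliance), that a [Serial] tag separates serial from parallel tests, and that storage tests carry a hypothetical [Storage] marker. This is plain Go, not the OTE API, which has its own mechanism for defining suites.

```go
package main

import (
	"fmt"
	"strings"
)

// Suite pairs a suite name with a membership predicate over test names.
type Suite struct {
	Name  string
	Match func(testName string) bool
}

// has reports whether a test name contains the given tag.
func has(name, tag string) bool {
	return strings.Contains(name, tag)
}

// suites lists the four suites from this epic with assumed selection rules.
var suites = []Suite{
	{"windows/all", func(string) bool { return true }},
	{"windows/conformance/parallel", func(n string) bool {
		return has(n, "[Level0]") && !has(n, "[Serial]")
	}},
	{"windows/conformance/serial", func(n string) bool {
		return has(n, "[Level0]") && has(n, "[Serial]")
	}},
	// "[Storage]" is a hypothetical marker chosen for this sketch.
	{"windows/storage", func(n string) bool { return has(n, "[Storage]") }},
}

// SuitesFor returns the names of all suites a test belongs to,
// in the order the suites are declared.
func SuitesFor(testName string) []string {
	var out []string
	for _, s := range suites {
		if s.Match(testName) {
			out = append(out, s.Name)
		}
	}
	return out
}

func main() {
	name := "[OTP][Jira:Windows_Containers][Level0][Serial] hypothetical test"
	fmt.Println(SuitesFor(name))
}
```

      Because the parallel and serial predicates are mutually exclusive, no conformance test can land in both suites, while every test still appears in windows/all.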

      Build & Distribution

      • Configure Makefile for OTE binary build
      • Update Dockerfile to include binary
      • Generate bindata.go for test resources

      CI Integration

      • Register binary in origin's extension registry
      • Verify automatic CI job inclusion
      • Monitor TestGrid results
      • Update ci-test-mapping

      Documentation

      • Update README with OTE usage
      • Document suite structure
      • Document platform requirements

      Dependencies

      • Phase 2 is BLOCKED by Phase 1 completion
      • Related to WINC-1508 (parallel execution optimization)

      Risks & Mitigation

      Risk                                         Impact   Mitigation
      New platform failures delay stabilization    High     Early monitoring, rapid triage
      Product bugs block progress                  High     Mark tests as informing if needed
      Tests break during OTE migration             Medium   Thorough local testing first
      CI jobs don't pick up new binary             Medium   Use multi-PR testing

      Total Estimated Effort

      • Phase 1: 10-12 weeks
      • Phase 2: 2-3 weeks
      • Total: 12-15 weeks (3-4 months)

              Unassigned Unassigned
              rrasouli Aharon Rasouli