CNTRLPLANE-1856

Configure Konflux Release Gating via Periodic Test Results

    • Type: Story
    • Resolution: Unresolved
    • Priority: Normal
    • HyperShift

      User Story

      As a HyperShift release engineer, I want the Konflux release pipeline to release only HyperShift
      Operator images that have passed all periodic compatibility tests, so that we further reduce the chance of Managed OpenShift breakages resulting from HyperShift Operator upgrades.

      Acceptance Criteria

      • Konflux ReleasePlan configured with the label release.appstudio.openshift.io/auto-release: 'false'
      • Periodic jobs from CNTRLPLANE-1848 and CNTRLPLANE-1854 use the latest Konflux snapshot (not the latest release)
      • Jobs retrieve the snapshot using: oc get snapshot --sort-by=.metadata.creationTimestamp -l
        pac.test.appstudio.openshift.io/original-prname=hypershift-operator-main-on-push (requires read-only credentials for the Konflux cluster)
      • When all periodic jobs pass, automation explicitly creates a Release object for the tested snapshot
      • If any required periodic job fails, no Konflux Release is created
      • Release process documented in team contrib/konflux documentation
      • Notifications sent when releases are created

      Technical Details

      Konflux ReleasePlan Configuration

      Update the HyperShift Operator releaseplan to disable auto-release:

        apiVersion: appstudio.redhat.com/v1alpha1
        kind: ReleasePlan
        metadata:
          name: hypershift-operator
          namespace: crt-redhat-acm-tenant
          labels: 
            release.appstudio.openshift.io/auto-release: 'false'
        spec: 
          target: rhtap-releng-tenant
          application: hypershift-operator
          # ... rest of releaseplan config
        

      This prevents automatic releases when builds complete.
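
      One way to confirm the label took effect is to read it back from the cluster. The sketch below assumes `oc` is logged in to the Konflux cluster with read access and uses the ReleasePlan name and namespace from the example above; the function names are illustrative and not part of any existing tooling.

```shell
#!/usr/bin/env bash
# Sketch: read back the auto-release label from a ReleasePlan.
# Assumption: `oc` is already logged in to the Konflux cluster.

# auto_release_value NAME NAMESPACE
# Prints the value of the release.appstudio.openshift.io/auto-release label
# (empty output if the label is not set).
auto_release_value() {
  local name="$1" namespace="$2"
  oc get releaseplan "$name" -n "$namespace" \
    -o jsonpath='{.metadata.labels.release\.appstudio\.openshift\.io/auto-release}'
}

# auto_release_disabled NAME NAMESPACE
# Succeeds only when the label is explicitly set to "false".
auto_release_disabled() {
  [ "$(auto_release_value "$1" "$2")" = "false" ]
}
```

      For example, `auto_release_disabled hypershift-operator crt-redhat-acm-tenant && echo "auto-release is off"` could run as a pre-flight check in the release automation.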

      Periodic Job Snapshot Selection

      Update all periodic jobs to use the latest snapshot instead of the latest release:

        # Get latest snapshot for hypershift-operator-main-on-push
        SNAPSHOT=$(oc get snapshot \
          --sort-by=.metadata.creationTimestamp \
          -l pac.test.appstudio.openshift.io/original-prname=hypershift-operator-main-on-push \
          -o jsonpath='{.items[-1].metadata.name}')
      
        # Extract the HyperShift Operator (HO) image from the snapshot
        HO_IMAGE=$(oc get snapshot "$SNAPSHOT" \
          -o jsonpath='{.spec.components[?(@.name=="hypershift-operator-main")].containerImage}')
      
        # Use HO_IMAGE in test execution
        

      Release Creation Workflow

      Automated Release Creation (when all tests pass):

      1. Monitor periodic job results for latest snapshot
      2. When all required jobs pass (baseline, .0→latest, latest-1→latest for AWS and Azure):
        1. Retrieve tested snapshot name
        2. Create Release object referencing snapshot
      3. Konflux processes Release and publishes images
      4. Notify the team of the successful release. This can use the final pipeline (https://konflux.pages.redhat.com/docs/users/releasing/tenant-release-pipelines.html#final-pipeline) with something like the update-jira-issues pipeline (https://github.com/konflux-ci/community-catalog/tree/development/pipelines/update-jira-issues-pipeline).

      Example Release Object:

        apiVersion: appstudio.redhat.com/v1alpha1
        kind: Release
        metadata: 
          generateName: hypershift-operator-
          namespace: crt-redhat-acm-tenant
        spec: 
          snapshot: <snapshot-name-that-passed-tests>
          releasePlan: hypershift-operator-progressive
          gracePeriodDays: 7
        

      Reference: https://konflux.pages.redhat.com/docs/users/releasing/create-release.html#creating-a-release-object 
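
      The workflow and Release object above could be wired together roughly as follows. This is a minimal sketch, not the actual automation: the input format (one "<job-name> <state>" pair per line, with "success" marking a passing run) and all function names are assumptions, and the namespace and releasePlan values are copied from the example Release object.

```shell
#!/usr/bin/env bash
# Sketch of the release gate: create a Release only if every required job passed.
# Assumed input on stdin: one "<job-name> <state>" pair per line.

# all_jobs_passed: succeeds only if at least one job is listed
# and every listed job has the state "success".
all_jobs_passed() {
  local job state seen=0
  while read -r job state || [ -n "$job" ]; do
    if [ -z "$job" ]; then continue; fi   # skip blank lines
    seen=1
    if [ "$state" != "success" ]; then return 1; fi
  done
  [ "$seen" -eq 1 ]
}

# release_manifest SNAPSHOT: print a Release manifest for the tested snapshot.
release_manifest() {
  local snapshot="$1"
  cat <<EOF
apiVersion: appstudio.redhat.com/v1alpha1
kind: Release
metadata:
  generateName: hypershift-operator-
  namespace: crt-redhat-acm-tenant
spec:
  snapshot: ${snapshot}
  releasePlan: hypershift-operator-progressive
  gracePeriodDays: 7
EOF
}

# gate_and_release SNAPSHOT: read job results on stdin and create the
# Release only when the gate passes; otherwise report and return non-zero.
gate_and_release() {
  local snapshot="$1"
  if all_jobs_passed; then
    release_manifest "$snapshot" | oc create -f -
  else
    echo "Snapshot ${snapshot} not released: a required job failed or no results were found" >&2
    return 1
  fi
}
```

      Treating an empty result set as a failure is a deliberate choice in this sketch: if the job results cannot be found, nothing is released.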

      Required Periodic Jobs for Release Gate

      AWS Jobs (from CNTRLPLANE-1848):

      • Baseline: periodic-ci-openshift-hypershift-release-4.Y-periodics-hcm-e2e-aws-ovn
      • Upgrade .0→latest:
        periodic-ci-openshift-hypershift-release-4.Y-periodics-hcm-upgrade-dot-zero-to-latest-aws-ovn
      • Upgrade latest-1→latest:
        periodic-ci-openshift-hypershift-release-4.Y-periodics-hcm-upgrade-latest-1-to-latest-aws-ovn

      Azure Jobs (from CNTRLPLANE-1854):

      • Baseline: periodic-ci-openshift-hypershift-release-4.Y-periodics-hcm-e2e-azure
      • Upgrade .0→latest:
        periodic-ci-openshift-hypershift-release-4.Y-periodics-hcm-upgrade-dot-zero-to-latest-azure
      • Upgrade latest-1→latest:
        periodic-ci-openshift-hypershift-release-4.Y-periodics-hcm-upgrade-latest-1-to-latest-azure

      All jobs for all supported OCP releases must pass before a release is created; only jobs for pre-release (non-GA) OCP versions are allowed to fail.
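
      The job names above follow a single pattern, so the gate could generate its list of required jobs instead of hard-coding it. The sketch below assumes that "4.Y" is replaced by the release branch version; the function name is hypothetical and any version passed in is only an example, not a statement of which releases are supported.

```shell
#!/usr/bin/env bash
# Sketch: expand the six gate job names for one OCP release branch.

# required_jobs VERSION
# VERSION replaces "4.Y" in the job names listed above (for example "4.17").
required_jobs() {
  local version="$1"
  local prefix="periodic-ci-openshift-hypershift-release-${version}-periodics-hcm"
  local platform
  # "aws-ovn" and "azure" are the platform suffixes used by the AWS and Azure jobs.
  for platform in aws-ovn azure; do
    printf '%s\n' \
      "${prefix}-e2e-${platform}" \
      "${prefix}-upgrade-dot-zero-to-latest-${platform}" \
      "${prefix}-upgrade-latest-1-to-latest-${platform}"
  done
}
```

      Calling `required_jobs` once per GA release yields the full blocking set; jobs for pre-release versions can be generated the same way but reported as informational only.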

      Snapshot Tracking in Jobs

      Each periodic job needs to:

      1. Record which snapshot it tested
      2. Report results tied to snapshot
      3. Make results queryable for release automation

      Add to job configuration:

        env: 
          - name: KONFLUX_SNAPSHOT
            value: "<snapshot-from-query>"
          - name: REPORT_SNAPSHOT_RESULTS
            value: "true"
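
      How a job turns these variables into a queryable record is not specified here. As one possible shape, the sketch below writes a small JSON record per job run; the record fields, the file name snapshot-result.json, and the function names are all assumptions, and the values are assumed to contain no characters that need JSON escaping.

```shell
#!/usr/bin/env bash
# Sketch: tie a job result to the snapshot it tested.
# Reads KONFLUX_SNAPSHOT and REPORT_SNAPSHOT_RESULTS from the job environment.

# snapshot_result_json JOB_NAME STATE
# Prints one JSON record; aborts if KONFLUX_SNAPSHOT is unset or empty.
snapshot_result_json() {
  local job="$1" state="$2"
  printf '{"snapshot":"%s","job":"%s","state":"%s"}\n' \
    "${KONFLUX_SNAPSHOT:?KONFLUX_SNAPSHOT must be set}" "$job" "$state"
}

# report_snapshot_result JOB_NAME STATE OUTPUT_DIR
# Writes the record to OUTPUT_DIR only when REPORT_SNAPSHOT_RESULTS is "true".
report_snapshot_result() {
  local job="$1" state="$2" dir="$3"
  if [ "${REPORT_SNAPSHOT_RESULTS:-false}" != "true" ]; then
    return 0
  fi
  snapshot_result_json "$job" "$state" > "${dir}/snapshot-result.json"
}
```

      Writing the record into the job's artifact directory would let the release automation collect results per snapshot without parsing job logs.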
        

      Benefits of This Approach

      Quality Assurance:

      • Images are tested on the target platforms before deployment to the integration environment
      • Every release has passed comprehensive compatibility tests
      • Automatic blocking of broken builds

      Traceability:

      • Clear link between test results and releases
      • Easy to identify which tests a release passed
      • Audit trail of what was tested before release

      Risk Reduction:

      • Prevents shipping regressions to customers
      • Catches compatibility issues before they reach production
      • Aligns with managed service quality requirements

      Implementation Phases

      Phase 1: Update Periodic Jobs (After CNTRLPLANE-1851-1854)

      • Modify jobs to use latest snapshot
      • Add snapshot tracking to job results
      • Validate jobs work with snapshots

      Phase 2: Implement Release Automation 

      • Create automation to monitor job results
      • Implement Release object creation
      • Set up notifications
      • Create runbooks for troubleshooting
      • Update releaseplan with auto-release: false
      • Verify no automatic releases occur

      Manual Override Process

      In emergency situations, allow manual releases without waiting for the full test suite to pass:

        # Get latest snapshot
        SNAPSHOT=$(oc get snapshot --sort-by=.metadata.creationTimestamp \
          -l pac.test.appstudio.openshift.io/original-prname=hypershift-operator-main-on-push \
          -o jsonpath='{.items[-1].metadata.name}')
      
        # Create release manually (requires approval/justification)
        cat <<EOF | oc create -f -
        apiVersion: appstudio.redhat.com/v1alpha1
        kind: Release
        metadata:
          generateName: hypershift-operator-emergency-
          namespace: <workspace-namespace>
          annotations:
            release.hypershift.io/emergency: "true"
            release.hypershift.io/justification: "Critical security fix"
        spec:
          snapshot: $SNAPSHOT
          releasePlan: hypershift-operator-release
        EOF
        

      Monitoring and Alerts

      Metrics to Track:

      • Time between snapshot creation and release
      • Percentage of snapshots that pass all tests
      • Most common test failures blocking releases
      • Emergency releases vs. normal releases

      Alerts to Configure:

      • Snapshot blocked for >24 hours (investigate test failures)
      • No releases in >3 days (may indicate systemic issues)
      • Emergency release created (requires review)
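
      The two time-based alerts reduce to comparing a timestamp against a threshold. The sketch below shows that comparison only; it assumes GNU date (for the -d option) and timestamps in the RFC 3339 form found in .metadata.creationTimestamp. The function names are illustrative, and how the alert is delivered is left open.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a timestamp is older than a threshold in hours.
# Assumption: GNU date is available (date -d).

# age_seconds TIMESTAMP NOW_EPOCH
# Prints the seconds elapsed between TIMESTAMP and NOW_EPOCH (seconds since epoch).
age_seconds() {
  local created_epoch
  created_epoch="$(date -u -d "$1" +%s)" || return 2
  echo $(( $2 - created_epoch ))
}

# exceeds_hours TIMESTAMP NOW_EPOCH LIMIT_HOURS
# Succeeds when TIMESTAMP is strictly more than LIMIT_HOURS before NOW_EPOCH.
exceeds_hours() {
  local age
  age="$(age_seconds "$1" "$2")" || return 2
  [ "$age" -gt $(( $3 * 3600 )) ]
}
```

      With the thresholds listed above, `exceeds_hours "$created" "$(date +%s)" 24` would cover the blocked-snapshot alert, and a limit of 72 applied to the most recent Release would cover the no-releases alert.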

      Dependencies

      Success Metrics

      • Zero automatic releases after auto-release disabled
      • 100% of releases have passed compatibility tests
      • <24 hour latency from successful tests to release
      • <5% emergency override rate
      • Clear audit trail for all releases

      Documentation Requirements

      Create documentation for:

      • How release gating works (architecture diagram)
      • How to check which snapshot is being tested
      • How to view test results for a snapshot
      • How to manually trigger release (emergency process)
      • How to investigate blocked releases
      • How to add/remove required tests from gate

      Future Enhancements

      • Progressive rollout based on test results (canary releases)
      • Automatic rollback if post-release issues detected
      • Integration with CVE scanning and security gates
      • Cross-repository coordination (hypershift + operators)
      • Customer-specific testing gates for managed services

              Assignee: Unassigned
              Reporter: Antoni Segura Puimedon (asegurap1@redhat.com)