Story
Resolution: Unresolved
Major
Quality / Stability / Reliability
5
Problem Statement
WMCO upgrade testing on Jenkins (https://jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/winc/job/winc-upgrade/) has proper CSV version verification, but the official OpenShift Prow CI does not.
The cucushift-winc-upgrade step in the openshift/release repository (used by Prow periodic jobs) checks only health status, not whether the CSV version changed:
    # Current verification (ci-operator/step-registry/cucushift/winc/upgrade/cucushift-winc-upgrade-commands.sh)
    oc wait csv --all --for=jsonpath='{.status.phase}'=Succeeded
    oc wait deployment windows-machine-config-operator --for condition=Available=True
    oc wait nodes -l kubernetes.io/os=windows --for condition=Ready=True
This creates a gap where:
- Jenkins testing (non-official) catches upgrade failures ✅
- Prow CI testing (official) misses upgrade failures ❌
Risk
Prow periodic jobs could pass even if WMCO did not upgrade:
- Old CSV (10.20) still running in Succeeded state
- Operator never upgraded to new version (10.21)
- Silent upgrade failures (InstallPlan creation failed, CSV upgrade failed, etc.)
Context
- Step was created by jfrancoa (Dec 2023), who left the company 2 years ago
- OTA team confirmed they don't maintain this step (owned by cucushift/winc team)
- OTA team uses CSV version verification for their own operator upgrades
- Jenkins job already has proper verification
Evidence of Best Practice
- OCP-43832 test has proper verification:
  - https://github.com/openshift/openshift-tests-private/blob/master/test/extended/winc/winc.go#L2156-L2215
  - Captures old CSV before upgrade
  - Waits for new CSV (different from old)
  - Verifies all Windows nodes have matching version annotations
- optional-operators-ci-upgrade step in openshift/release verifies CSV version:
  - ci-operator/step-registry/optional-operators/ci/upgrade/optional-operators-ci-upgrade-commands.sh
  - Line 22: if [[ "$CSV" == "${OO_LATEST_CSV}" ]]; then
- Jenkins WMCO upgrade job has proper verification (non-official but proven pattern)
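The "wait for a new CSV that differs from the old one" pattern listed above could look roughly like the sketch below. It is not taken from any of the referenced jobs. The subscription name, the namespace, the function name, and the default timeout are assumptions for illustration; the only cluster call is a read of the subscription's .status.installedCSV field.

```shell
# Sketch only: SUB_NAME, SUB_NS and wait_for_new_csv are illustrative
# assumptions, not names used by the existing step.
SUB_NAME="${SUB_NAME:-windows-machine-config-operator}"
SUB_NS="${SUB_NS:-openshift-windows-machine-config-operator}"

# Poll the subscription until its installed CSV differs from the pre-upgrade one.
# Usage: wait_for_new_csv OLD_CSV [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Prints the new CSV name on success; returns 1 on timeout.
wait_for_new_csv() {
  local old_csv="$1" timeout="${2:-1800}" interval="${3:-30}"
  local elapsed=0 current=""
  while [ "${elapsed}" -lt "${timeout}" ]; do
    # A failed query is treated as "not upgraded yet" and retried.
    current="$(oc get subscriptions.operators.coreos.com "${SUB_NAME}" -n "${SUB_NS}" \
      -o jsonpath='{.status.installedCSV}' 2>/dev/null || true)"
    if [ -n "${current}" ] && [ "${current}" != "${old_csv}" ]; then
      echo "${current}"
      return 0
    fi
    sleep "${interval}"
    elapsed=$((elapsed + interval))
  done
  echo "ERROR: CSV is still '${current}' after ${timeout}s (pre-upgrade CSV: '${old_csv}')" >&2
  return 1
}
```

A caller would pass the CSV name captured before the upgrade, for example wait_for_new_csv "$OLD_CSV" 1800 30, and treat a non-zero return as an upgrade failure.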
Scope
This enhancement is in the openshift/release repository and is independent from:
- ✅ WINC-1484 (Skyler's 4.19→4.20 variant config using operator-sdk)
- ✅ PR #73920 (BYOH provisioning support)
This benefits all Prow QE upgrade periodic jobs across all platforms/versions that use the openshift-upgrade-qe-test-winc chain.
Goal
Bring Prow CI WMCO upgrade verification up to the same standard as Jenkins.
Acceptance Criteria
- Step captures WMCO CSV name/version before cluster upgrade
- Step verifies CSV name changed after cluster upgrade (old ≠ new)
- Step fails with clear error if CSV did not upgrade
- Step verifies all Windows nodes have version annotations matching the new CSV version
- Step fails with clear error listing any nodes with incorrect versions
- On failure, step dumps subscription, InstallPlan, and CSV resources for troubleshooting
- Existing periodic upgrade jobs work without modification
- Works across all platforms (AWS, Azure, GCP, vSphere, Nutanix)
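The two node-related criteria could be met with something like the following sketch. It rests on two assumptions that should be confirmed against the OCP-43832 test before use: that WMCO records its version in the windowsmachineconfig.openshift.io/version node annotation, and that the annotation value begins with the CSV version (it may carry a build suffix, so the sketch uses a prefix match rather than equality). The function names are hypothetical.

```shell
# Sketch only. Assumptions: the annotation key used below, and that each
# node's annotation value starts with the CSV version.

# Print "<node> <version>" for every Windows node, one per line.
list_windows_node_versions() {
  oc get nodes -l kubernetes.io/os=windows \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.metadata.annotations.windowsmachineconfig\.openshift\.io/version}{"\n"}{end}'
}

# Read "<node> <version>" lines on stdin and report every node whose version
# does not start with the expected CSV version.
# Returns 1 if any node mismatches, 2 if no expected version was given.
check_node_versions() {
  local expected="$1" node version bad=0
  if [ -z "${expected}" ]; then
    echo "ERROR: expected version is empty" >&2
    return 2
  fi
  while read -r node version; do
    [ -n "${node}" ] || continue
    case "${version}" in
      "${expected}"*) ;;  # annotation matches the new CSV version
      *)
        echo "ERROR: node ${node} has version '${version:-<none>}', expected '${expected}'" >&2
        bad=1
        ;;
    esac
  done
  return "${bad}"
}
```

The two functions are meant to be combined as list_windows_node_versions | check_node_versions "$NEW_VERSION". Keeping the comparison separate from the cluster query means it can be exercised with canned input.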
Implementation Approach
Repository: openshift/release
Required Workflow:
- Pre-upgrade phase (before cluster upgrade starts):
  - Query WMCO subscription to get current CSV name
  - Query current CSV to get version
  - Save both to SHARED_DIR for later verification
  - Log pre-upgrade state for debugging
- Post-upgrade phase (after cluster upgrade completes):
  - Wait for CSV to reach Succeeded state (preserve existing behavior)
  - Wait for deployment to be Available (preserve existing behavior)
  - Query subscription to get new CSV name
  - Compare new CSV name with saved old CSV name
  - If CSV names are identical, fail with detailed error
  - Extract version from new CSV
  - Query all Windows nodes for version annotations
  - Verify all node annotations match new CSV version
  - If any node has wrong version, fail with node-specific error
  - Wait for nodes to be Ready (preserve existing behavior)
- Error reporting:
  - On CSV verification failure: dump subscription, InstallPlan, and CSV resources
  - On node verification failure: dump node details with version mismatches
  - Include clear error messages and troubleshooting hints
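The CSV part of the workflow above might be sketched as follows. This is a starting point under stated assumptions, not the step's current code: the subscription name, the namespace, the file names under SHARED_DIR, and all function names are hypothetical. The sketch covers saving the pre-upgrade state, the post-upgrade comparison, and the resource dump on failure. The existing oc wait checks and the node annotation check are left out.

```shell
# Sketch only. SUB_NAME, SUB_NS, the file names under SHARED_DIR and the
# function names are illustrative assumptions.
SUB_NAME="${SUB_NAME:-windows-machine-config-operator}"
SUB_NS="${SUB_NS:-openshift-windows-machine-config-operator}"

# Name of the CSV currently installed by the WMCO subscription.
get_installed_csv() {
  oc get subscriptions.operators.coreos.com "${SUB_NAME}" -n "${SUB_NS}" \
    -o jsonpath='{.status.installedCSV}'
}

# Version recorded in the given CSV.
get_csv_version() {
  oc get clusterserviceversions.operators.coreos.com "$1" -n "${SUB_NS}" \
    -o jsonpath='{.spec.version}'
}

# Pre-upgrade: record CSV name and version for the post-upgrade comparison.
save_pre_upgrade_state() {
  local csv version
  csv="$(get_installed_csv)" || return 1
  [ -n "${csv}" ] || { echo "ERROR: subscription has no installed CSV" >&2; return 1; }
  version="$(get_csv_version "${csv}")" || return 1
  echo "${csv}" > "${SHARED_DIR}/wmco-pre-upgrade-csv"
  echo "${version}" > "${SHARED_DIR}/wmco-pre-upgrade-version"
  echo "Pre-upgrade WMCO CSV: ${csv} (version ${version})"
}

# On failure: dump the OLM resources needed for troubleshooting.
dump_olm_state() {
  oc get subscriptions.operators.coreos.com,installplans.operators.coreos.com,clusterserviceversions.operators.coreos.com \
    -n "${SUB_NS}" -o yaml || true
}

# Post-upgrade: fail if the installed CSV is missing or unchanged.
verify_csv_upgraded() {
  local old_csv new_csv
  old_csv="$(cat "${SHARED_DIR}/wmco-pre-upgrade-csv")" || return 1
  new_csv="$(get_installed_csv)" || return 1
  if [ -z "${new_csv}" ] || [ "${new_csv}" = "${old_csv}" ]; then
    echo "ERROR: WMCO CSV did not change after upgrade (old='${old_csv}', new='${new_csv}')" >&2
    dump_olm_state >&2
    return 1
  fi
  echo "WMCO CSV upgraded: ${old_csv} -> ${new_csv}"
}
```

If the pre-upgrade capture has to run as a separate step in the chain, save_pre_upgrade_state would belong in that step and verify_csv_upgraded in the existing post-upgrade script, with SHARED_DIR carrying the state between them.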
Files Expected to Change:
- ci-operator/step-registry/cucushift/winc/upgrade/cucushift-winc-upgrade-commands.sh (required)
- Potentially: openshift-upgrade-qe-test-winc chain if using separate pre-step
- Generated metadata files (updated via make update)
OWNERS
Current owners of ci-operator/step-registry/cucushift/winc/upgrade/:
- Approvers: jianlinliu, gpei, yunjiang29
- Reviewers: jfrancoa (left company), rrasouli