Story
Resolution: Done
Future Sustainability
User Story
As a managed service SRE, I want periodic jobs that validate clusters can upgrade from latest-1 Z-stream to latest Z-stream without triggering NodePool rollouts, so that I can perform rolling Z-stream upgrades for customers safely.
Acceptance Criteria
- Periodic jobs created for each supported OCP minor version (4.16+)
- Jobs run existing TestUpgradeControlPlane test with latest-1 → latest parameters
- Jobs run daily
- Tests validate same invariants as CNTRLPLANE-1852 (no NodePool rollouts)
- Tests validate control plane successfully upgrades
- Tests validate cluster remains functional post-upgrade
- Job configuration tracks "latest-1" as new Z-streams are released
Technical Details
Same Test as CNTRLPLANE-1852
This story uses the exact same TestUpgradeControlPlane test, just with different version parameters:
- Story 1.2: PREVIOUS_RELEASE_IMAGE=4.Y.0, LATEST_RELEASE_IMAGE=4.Y.latest
- Story 1.3: PREVIOUS_RELEASE_IMAGE=4.Y.latest-1, LATEST_RELEASE_IMAGE=4.Y.latest
All validation logic is identical.
Version Selection Strategy
Static Configuration (MVP):
- Manually specify latest-1 version in job config
- Update when new Z-streams are released
- Example: When 4.20.16 is released:
  - PREVIOUS_RELEASE_IMAGE: 4.20.15 (was 4.20.14)
  - LATEST_RELEASE_IMAGE: 4.20.16 (was 4.20.15)
Follow-up: the latest-1 version should be resolved automatically rather than maintained by hand.
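As a starting point for automating this, the selection logic could look like the sketch below. It assumes a list of released version strings for the minor is already available from some source (how that list is fetched is not specified in this story). The function name `select_z_stream_pair` is hypothetical.

```python
import re

# Matches GA versions only (e.g. "4.20.15"); pre-releases such as
# "4.20.0-rc.1" are ignored because of the trailing anchor.
_GA_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def select_z_stream_pair(versions, minor):
    """Return (latest-1, latest) for a minor such as "4.20".

    `versions` is an iterable of version strings; where it comes from is
    an assumption left to the caller.
    """
    z_streams = set()
    for version in versions:
        match = _GA_VERSION.match(version)
        if match and f"{match.group(1)}.{match.group(2)}" == minor:
            z_streams.add(int(match.group(3)))

    ordered = sorted(z_streams)  # numeric sort, so .10 comes after .9
    if len(ordered) < 2:
        raise ValueError(f"need at least two GA Z-streams for {minor}")

    previous = f"{minor}.{ordered[-2]}"  # candidate PREVIOUS_RELEASE_IMAGE
    latest = f"{minor}.{ordered[-1]}"    # candidate LATEST_RELEASE_IMAGE
    return previous, latest
```

Sorting on the integer Z component rather than on the raw string avoids ranking 4.20.9 above 4.20.10, and picking the second-highest released Z (instead of subtracting one) stays correct if a Z-stream number was skipped.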
CI Operator Config
- Path: ci-operator/config/openshift/hypershift/openshift-hypershift-release-4.Y__periodics-hcm-upgrade.yaml
- Or separate file: openshift-hypershift-release-4.Y__periodics-hcm-upgrade-latest-1.yaml
- Same workflow as Story 1.2, different version parameters
Periodic Job
- Name: periodic-ci-openshift-hypershift-release-4.Y-periodics-hcm-upgrade-latest-1-to-latest-aws-ovn
- Interval: Daily
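Since one job is needed per supported minor (4.16+), the expected job names can be enumerated from the naming pattern above. This is an illustrative sketch only: `job_names` is a hypothetical helper, and the newest supported minor is an input the caller must supply.

```python
# Naming pattern taken from the Periodic Job section; {minor} stands in for 4.Y.
JOB_TEMPLATE = (
    "periodic-ci-openshift-hypershift-release-{minor}"
    "-periodics-hcm-upgrade-latest-1-to-latest-aws-ovn"
)


def job_names(first_minor, last_minor, major=4):
    """List expected periodic job names for minors first_minor..last_minor."""
    return [
        JOB_TEMPLATE.format(minor=f"{major}.{y}")
        for y in range(first_minor, last_minor + 1)
    ]
```

For example, `job_names(16, 20)` yields five names, one per minor from 4.16 through 4.20, which can be compared against the jobs actually present in openshift/release.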
Version Update Process (manual; should be automated)
When new Z-stream is released (e.g., 4.20.16):
1. Update PREVIOUS_RELEASE_IMAGE to previous latest (4.20.15)
2. Update LATEST_RELEASE_IMAGE to new latest (4.20.16)
3. Submit PR to openshift/release
4. Coordinate with CNTRLPLANE-1852 job updates (can be same PR)
5. Validate job runs successfully
Comparison with .0 → Latest Upgrades
Useful to compare results between jobs:
- .0 → latest (Story 1.2): Large upgrade, many Z-stream changes
- latest-1 → latest (Story 1.3): Small upgrade, single Z-stream increment
Failure Analysis:
- If .0 → latest passes but latest-1 → latest fails:
  - Indicates regression in latest Z-stream
  - Immediate investigation required
- If .0 → latest fails but latest-1 → latest passes:
  - Indicates issue specific to older .0 version
  - May be expected if .0 is very old
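The two cases above can be encoded as a small triage helper, for example for a dashboard or alert annotation. This is a sketch under assumptions: `interpret_results` is a hypothetical name, and the both-pass and both-fail outcomes are not covered by this story, so their messages are placeholders.

```python
def interpret_results(dot_zero_passed, latest_minus_one_passed):
    """Map the two job outcomes to the interpretation from Failure Analysis."""
    if dot_zero_passed and not latest_minus_one_passed:
        return "regression in latest Z-stream; immediate investigation required"
    if not dot_zero_passed and latest_minus_one_passed:
        return "issue specific to older .0 version; may be expected if .0 is very old"
    if dot_zero_passed and latest_minus_one_passed:
        # Assumption: not stated in the story, treated as healthy.
        return "both upgrade paths passed"
    # Assumption: not stated in the story; no diagnosis is implied here.
    return "both upgrade paths failed; not covered by this analysis"
```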
Success Metrics
- <5% false failure rate
- Upgrade completes within 20 minutes
- All validation checks pass (built into TestUpgradeControlPlane)
- Clear diagnostics available on failure
- Results tracked separately from .0 → latest tests
Coordination
- Align version updates with Story 1.2 (can update both jobs in single PR)
- Ensure both jobs test the same "latest" version
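A lightweight check along these lines could run in review or presubmit to catch the two jobs drifting apart. It is a sketch only: it assumes each job's environment has already been loaded into a dict keyed by PREVIOUS_RELEASE_IMAGE and LATEST_RELEASE_IMAGE, and `check_jobs_aligned` is a hypothetical helper.

```python
def check_jobs_aligned(dot_zero_env, latest_minus_one_env):
    """Return a list of problems; an empty list means the jobs are aligned.

    Both arguments are dicts with PREVIOUS_RELEASE_IMAGE and
    LATEST_RELEASE_IMAGE keys (loading them from the job config is
    left to the caller).
    """
    problems = []

    dot_zero_latest = dot_zero_env["LATEST_RELEASE_IMAGE"]
    z_stream_latest = latest_minus_one_env["LATEST_RELEASE_IMAGE"]
    if dot_zero_latest != z_stream_latest:
        problems.append(
            f"jobs target different latest versions: "
            f"{dot_zero_latest} vs {z_stream_latest}"
        )

    # A latest-1 job whose start and target match would upgrade nothing.
    if latest_minus_one_env["PREVIOUS_RELEASE_IMAGE"] == z_stream_latest:
        problems.append("latest-1 job has identical previous and latest versions")

    return problems
```

For example, with the .0 job at 4.20.0 → 4.20.16 and the latest-1 job still at 4.20.14 → 4.20.15, the function reports one mismatch on the latest version.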