Bug
Resolution: Unresolved
4.19, 4.20, 4.21
Description of problem
For the past few days, multi-arch nightlies have been rejected for 4.19 and later. For example, 4.19.0-0.nightly-multi-2025-11-24-173202 failed its hypershift-e2e-aks-multi-x-ax blocking job on:
: TestNodePool/HostedCluster0/Main/TestNodePoolPrevReleaseN4 10m0s
{Failed === RUN TestNodePool/HostedCluster0/Main/TestNodePoolPrevReleaseN4
=== PAUSE TestNodePool/HostedCluster0/Main/TestNodePoolPrevReleaseN4
=== CONT TestNodePool/HostedCluster0/Main/TestNodePoolPrevReleaseN4
nodepool_prev_release_test.go:34: Starting NodePoolPrevReleaseCreateTest.
nodepool_test.go:349: NodePool version is outside supported skew, validating condition only (skipping node readiness check)
eventually.go:104: Failed to get *v1beta1.NodePool: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline
nodepool_test.go:396: Failed to wait for NodePool e2e-clusters-w9xx6/node-pool-9xmt4-bw48s to have correct status in 10m0s: context deadline exceeded
eventually.go:224: observed *v1beta1.NodePool e2e-clusters-w9xx6/node-pool-9xmt4-bw48s invalid at RV 36483 after 10m0s: incorrect condition: wanted SupportedVersionSkew=False, got SupportedVersionSkew=True: AsExpected(Release image version is valid)
--- FAIL: TestNodePool/HostedCluster0/Main/TestNodePoolPrevReleaseN4 (600.04s)
Version-Release number of selected component
4.19, 4.20, and 4.21 multi nightlies are all failing.
How reproducible
Every time.
Steps to Reproduce
- Build a new multi nightly for 4.19 or later.
- Watch the blocking periodic-ci-openshift-hypershift-release-4.*-periodics-e2e-aks-multi-x-ax job fail.
Actual results
The TestNodePoolPrevReleaseN4 test-case fails because the NodePool reports SupportedVersionSkew=True instead of the expected SupportedVersionSkew=False.
Expected results
Passing CI and accepted nightlies.
Additional info
Poking at the test run, the issue seems to be release-controller image configuration getting wired to the test step:
- --e2e.latest-release-image is getting set to 4.19.0-0.nightly-multi-2025-11-24-173202 (quay.io/openshift-release-dev/ocp-release-nightly@sha256:871beb48e85e7c01096b51a9a94eaf9fd57b700f4cb1c96f09606643de1c7489), which is good.
- --e2e.previous-release-image is getting set to 4.19.0-0.nightly-multi-2025-11-20-065238 (quay.io/openshift-release-dev/ocp-release-nightly@sha256:0f91151c1e63d1efe0bb8f874c81a2ca883392e4cbdb006675150a8116be67fe), which is good.
- All of the --e2e.n*-minor-release-image flags are getting set to the multi-arch 4.19.0-0.ci-2025-11-22-121352 (registry.ci.openshift.org/ocp/release@sha256:ff45c3946a04f54dfe24f6758d52a595410914e753a0e7ca439971d6f5021af3), instead of releases that are actually from the target 4.(y-n) streams.
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-hypershift-release-4.19-periodics-e2e-aks-multi-x-ax/1993032723186323456/artifacts/e2e-aks-multi-x-ax/hypershift-azure-run-e2e/build-log.txt | grep -o '\-\-e2e[.][^ ]*' | grep sha256 | sort | uniq
--e2e.latest-release-image=registry.build11.ci.openshift.org/ci-op-76577yvz/release@sha256:871beb48e85e7c01096b51a9a94eaf9fd57b700f4cb1c96f09606643de1c7489
--e2e.n1-minor-release-image=registry.build11.ci.openshift.org/ci-op-76577yvz/release@sha256:ff45c3946a04f54dfe24f6758d52a595410914e753a0e7ca439971d6f5021af3
--e2e.n2-minor-release-image=registry.build11.ci.openshift.org/ci-op-76577yvz/release@sha256:ff45c3946a04f54dfe24f6758d52a595410914e753a0e7ca439971d6f5021af3
--e2e.n3-minor-release-image=registry.build11.ci.openshift.org/ci-op-76577yvz/release@sha256:ff45c3946a04f54dfe24f6758d52a595410914e753a0e7ca439971d6f5021af3
--e2e.n4-minor-release-image=registry.build11.ci.openshift.org/ci-op-76577yvz/release@sha256:ff45c3946a04f54dfe24f6758d52a595410914e753a0e7ca439971d6f5021af3
--e2e.previous-release-image=registry.build11.ci.openshift.org/ci-op-76577yvz/release@sha256:0f91151c1e63d1efe0bb8f874c81a2ca883392e4cbdb006675150a8116be67fe
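For anyone who wants to spot this pattern in other runs, here is a rough sketch that scans build-log text for --e2e.n*-minor-release-image flags and reports any digest shared by more than one of them. The function name shared_minor_digests is made up for illustration and is not part of the HyperShift test suite or the release tooling; the flag format is assumed to match the output above.

```python
import re
from collections import defaultdict

# Assumption: flags look like
#   --e2e.n1-minor-release-image=<pullspec>@sha256:<hex digest>
# as in the build log quoted above.
MINOR_FLAG_RE = re.compile(
    r"--e2e\.(n\d+)-minor-release-image=\S+@(sha256:[0-9a-f]+)"
)


def shared_minor_digests(log_text):
    """Return {digest: [n1, n2, ...]} for digests used by 2+ minor flags.

    Hypothetical helper: each n*-minor flag should point at a different
    4.(y-n) stream, so a digest shared by several flags is suspicious.
    """
    flags_by_digest = defaultdict(set)
    for n_label, digest in MINOR_FLAG_RE.findall(log_text):
        flags_by_digest[digest].add(n_label)
    return {
        digest: sorted(labels)
        for digest, labels in flags_by_digest.items()
        if len(labels) > 1
    }


if __name__ == "__main__":
    import sys

    # Usage: curl -s <build-log URL> | python check_minor_flags.py
    for digest, labels in shared_minor_digests(sys.stdin.read()).items():
        print(f"{digest} is shared by: {', '.join(labels)}")
```

Run against the log above, this should list the ff45c39... digest as shared by n1 through n4.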
Using a 4.19 release when --e2e.n4-minor-release-image expects a 4.(19-4) = 4.15 release breaks the test-case's expectation of a SupportedVersionSkew=False condition.
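To make the 4.(y-n) arithmetic concrete, here is a small sketch that computes the stream each --e2e.n*-minor-release-image flag should come from and checks a candidate release version against it. The helper names (parse_minor, expected_stream, is_wired_correctly) are hypothetical and are not taken from the HyperShift e2e code; the only assumption about version strings is that they start with major.minor followed by a dot.

```python
import re


def parse_minor(version):
    """Extract (major, minor) from e.g. '4.19.0-0.ci-2025-11-22-121352'."""
    match = re.match(r"(\d+)\.(\d+)\.", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def expected_stream(target_version, n):
    """Return the (major, minor) stream that the n-minor flag should use."""
    major, minor = parse_minor(target_version)
    return major, minor - n


def is_wired_correctly(target_version, n, candidate_version):
    """True if candidate_version belongs to the target's 4.(y-n) stream."""
    return parse_minor(candidate_version) == expected_stream(target_version, n)


if __name__ == "__main__":
    target = "4.19.0-0.nightly-multi-2025-11-24-173202"
    wired = "4.19.0-0.ci-2025-11-22-121352"
    for n in range(1, 5):
        major, minor = expected_stream(target, n)
        ok = is_wired_correctly(target, n, wired)
        print(f"n{n}: expected {major}.{minor}, got {wired}, ok={ok}")
```

For the failing run, every n1 through n4 check comes back false, because the wired release is itself a 4.19 build.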
Unclear to me if the release-controller has a way to correctly populate 4.(y-4) releases. You might need to clear these defaults:
release $ git grep -A1 OCP_IMAGE_N ci-operator/step-registry/hypershift | grep -- -ref.yaml
ci-operator/step-registry/hypershift/aws/run-e2e/external/hypershift-aws-run-e2e-external-ref.yaml: - env: OCP_IMAGE_N1
ci-operator/step-registry/hypershift/aws/run-e2e/external/hypershift-aws-run-e2e-external-ref.yaml- name: release:latest
ci-operator/step-registry/hypershift/aws/run-e2e/external/hypershift-aws-run-e2e-external-ref.yaml: - env: OCP_IMAGE_N2
ci-operator/step-registry/hypershift/aws/run-e2e/external/hypershift-aws-run-e2e-external-ref.yaml- name: release:latest
ci-operator/step-registry/hypershift/aws/run-e2e/external/hypershift-aws-run-e2e-external-ref.yaml: - env: OCP_IMAGE_N3
ci-operator/step-registry/hypershift/aws/run-e2e/external/hypershift-aws-run-e2e-external-ref.yaml- name: release:latest
ci-operator/step-registry/hypershift/aws/run-e2e/external/hypershift-aws-run-e2e-external-ref.yaml: - env: OCP_IMAGE_N4
ci-operator/step-registry/hypershift/aws/run-e2e/external/hypershift-aws-run-e2e-external-ref.yaml- name: release:latest
ci-operator/step-registry/hypershift/azure/run-e2e/hypershift-azure-run-e2e-ref.yaml: - env: OCP_IMAGE_N1
ci-operator/step-registry/hypershift/azure/run-e2e/hypershift-azure-run-e2e-ref.yaml- name: release:latest
ci-operator/step-registry/hypershift/azure/run-e2e/hypershift-azure-run-e2e-ref.yaml: - env: OCP_IMAGE_N2
ci-operator/step-registry/hypershift/azure/run-e2e/hypershift-azure-run-e2e-ref.yaml- name: release:latest
ci-operator/step-registry/hypershift/azure/run-e2e/hypershift-azure-run-e2e-ref.yaml: - env: OCP_IMAGE_N3
ci-operator/step-registry/hypershift/azure/run-e2e/hypershift-azure-run-e2e-ref.yaml- name: release:latest
ci-operator/step-registry/hypershift/azure/run-e2e/hypershift-azure-run-e2e-ref.yaml: - env: OCP_IMAGE_N4
ci-operator/step-registry/hypershift/azure/run-e2e/hypershift-azure-run-e2e-ref.yaml- name: release:latest
and then adjust the code that consumes them to leave them unset (or something that skips those skew tests), until folks figure out a way to get them wired to appropriate releases in the release controller?