Bug
Resolution: Unresolved
Major
4.22
Critical
Approved
OCP Node Core Sprint 284
Initially this was found as an install failure where the network operator complains that multus is still awaiting 1-2 nodes on GCP. However, the situation has since evolved: we can now see the underlying issue across many tests, jobs, and platforms:
Error: reading image "92c3c9025fe657a9160372a805bfd1624fb087262f2e3db326f937c298d47d70": locating image with ID "92c3c9025fe657a9160372a805bfd1624fb087262f2e3db326f937c298d47d70": image not known
Comments below show job runs where this is being seen.
Scanning BigQuery for tests reporting this in their output:
SELECT modified_time, prowjob_build_id, test_name, success, test_id, branch, prowjob_name, failure_content
FROM `openshift-gce-devel.ci_analysis_us.junit`
WHERE success = false
  AND modified_time BETWEEN DATETIME("2026-02-10") AND DATETIME("2026-02-25")
  AND failure_content LIKE '%image not known%'
ORDER BY modified_time ASC
LIMIT 1000
The results again show an explosion of this problem on Feb 19. It is happening only in 4.22 jobs and a few presubmits, which rules out infrastructure or registry issues. (I think?) I will attach the results of the above query in JSON format.
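To spot the spike in the attached results, a minimal sketch that buckets the exported rows by calendar day. Field names follow the SELECT list in the query above; the sample rows here are fabricated purely for illustration:

```python
from collections import Counter

def failures_per_day(rows):
    """Count failing rows per calendar day.

    Each row is expected to carry a `modified_time` string like
    "2026-02-19T14:03:22"; the first 10 characters are the date.
    """
    return Counter(row["modified_time"][:10] for row in rows)

# The real attachment would be loaded with json.load(); tiny inline sample here:
sample = [
    {"modified_time": "2026-02-18T05:10:00", "prowjob_name": "periodic-...-gcp"},
    {"modified_time": "2026-02-19T01:00:00", "prowjob_name": "periodic-...-gcp"},
    {"modified_time": "2026-02-19T02:30:00", "prowjob_name": "periodic-...-metal"},
]
for day, count in sorted(failures_per_day(sample).items()):
    print(day, count)
```

Sorting the counts by day should make the Feb 19 jump obvious at a glance.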
Sample job runs:
Or see the test details report below.
Search.ci can be used to find job runs impacted. It doesn't always load, but if you wait 5-10 minutes and try again it usually will.
Original Report
Component Readiness has found a potential regression in the following test:
install should succeed: overall
Extreme regression detected.
Fisher's exact probability of a regression: 100.00%.
Test pass rate dropped from 98.81% to 69.93%.
Sample (being evaluated) Release: 4.22
Start Time: 2026-02-16T00:00:00Z
End Time: 2026-02-23T12:00:00Z
Success Rate: 69.93%
Successes: 200
Failures: 86
Flakes: 0
Base (historical) Release: 4.20
Start Time: 2025-09-21T00:00:00Z
End Time: 2025-10-21T00:00:00Z
Success Rate: 98.81%
Successes: 1000
Failures: 12
Flakes: 0
View the test details report for additional context.
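The regression probability above can be reproduced from the raw counts. A sketch of a one-sided Fisher's exact test using only the Python standard library (the 2x2 table is built from the sample and base success/failure counts reported above; the function name is my own):

```python
from math import comb

def fisher_exact_one_sided(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Returns the probability, under the null hypothesis of equal failure
    rates, of seeing at least b failures in the first row given fixed
    margins (the hypergeometric upper tail).
    """
    n_sample = a + b          # sample row total
    n_total = a + b + c + d   # grand total
    k_fail = b + d            # total failures across both rows
    denom = comb(n_total, n_sample)
    # Sum P(sample failures = k) for k from the observed value upward.
    return sum(
        comb(k_fail, k) * comb(n_total - k_fail, n_sample - k)
        for k in range(b, min(k_fail, n_sample) + 1)
    ) / denom

# Counts from the Component Readiness report:
# sample (4.22): 200 successes, 86 failures; base (4.20): 1000 successes, 12 failures.
p = fisher_exact_one_sided(200, 86, 1000, 12)
print(f"p-value: {p:.3e}")
print(f"regression probability: {(1 - p) * 100:.2f}%")  # matches the reported 100.00%
```

With a p-value this small, the rounded "100.00%" in the report is expected.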
Several other GCP install variant combos are showing problems; this is just the most visible one. See the triage record link that will be added to this card in a comment shortly for the full list of regressions, but the report linked above should contain more than enough job runs to investigate.
I think I see multiple causes here; it's unclear whether there is one unifying underlying cause. I see problems with storage volume mounting, the network operator awaiting one node, and possibly more.
Consider this card a bug to get GCP install success rates back up from their current 70%.
Filed by: dgoodwin@redhat.com
From what I can see, almost all of these runs also show a failure for:
verify operator conditions network { Operator Progressing=True (Deploying): DaemonSet "/openshift-multus/multus-additional-cni-plugins" is not available (awaiting 2 nodes)}
Sometimes it's just one node, yet all nodes appear healthy.
The test seems to be regressed in other areas too; here's a metal regression that looks similar.
Global test analysis shows a sharp decline, probably starting around Feb 19.
Mostly hitting GCP, but some metal in the mix as well.
Relates to OCPBUGS-77244: runc jobs seem to be failing on bump to 1.4 (moving from 1.2.5 to 1.4) [status: POST]