-
Bug
-
Resolution: Done
-
Normal
-
None
-
4.14.0
-
Important
-
No
-
MCO Sprint 250, MCO Sprint 251, MCO Sprint 252
-
3
-
Rejected
-
False
-
Description of problem:
In MCPs that use on-cluster builds, if we create an MC and then create another MC before the resulting rendered configuration has been built (the build pod is still running), the machine-os-builder pod gets stuck reporting this error:
{noformat}
I0830 15:17:38.094296 1 image_build_controller.go:129] Started syncing pod openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066
I0830 15:17:38.113944 1 build_controller.go:370] Build (build-rendered-worker-c8522984c224de584cbca2a95d584066) is Complete
I0830 15:17:38.113974 1 build_controller.go:652] Build succeeded for MachineConfigPool worker, config rendered-worker-516e5f95aeb41fce8231b00878350eaa
I0830 15:17:38.216381 1 request.go:628] Waited for 102.315143ms due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/build.openshift.io/v1/namespaces/openshift-machine-config-operator/builds/build-rendered-worker-516e5f95aeb41fce8231b00878350eaa
I0830 15:17:38.223007 1 image_build_controller.go:127] Finished syncing pod openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066: 128.712599ms
E0830 15:17:38.223038 1 image_build_controller.go:305] unable to update with build status: could not get final image pullspec for pool worker: could not get build build-rendered-worker-516e5f95aeb41fce8231b00878350eaa for pool worker: builds.build.openshift.io "build-rendered-worker-516e5f95aeb41fce8231b00878350eaa" not found
I0830 15:17:38.223047 1 image_build_controller.go:306] Dropping build "openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066" out of the queue: unable to update with build status: could not get final image pullspec for pool worker: could not get build build-rendered-worker-516e5f95aeb41fce8231b00878350eaa for pool worker: builds.build.openshift.io "build-rendered-worker-516e5f95aeb41fce8231b00878350eaa" not found
{noformat}
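The log shows that the build that completed and the build the controller then looks up carry different hashes. A minimal Python sketch to pull both names out of such a log follows. It assumes only the message formats shown in this report; `find_build_mismatch` is a hypothetical helper name and not part of the MCO code base.

```python
import re

# Two lines copied from the machine-os-builder log in this report.
LOG = """\
I0830 15:17:38.113944 1 build_controller.go:370] Build (build-rendered-worker-c8522984c224de584cbca2a95d584066) is Complete
E0830 15:17:38.223038 1 image_build_controller.go:305] unable to update with build status: could not get final image pullspec for pool worker: could not get build build-rendered-worker-516e5f95aeb41fce8231b00878350eaa for pool worker: builds.build.openshift.io "build-rendered-worker-516e5f95aeb41fce8231b00878350eaa" not found
"""

# Patterns are based only on the message formats visible above (an assumption).
COMPLETE_RE = re.compile(r"Build \((build-rendered-[\w-]+)\) is Complete")
MISSING_RE = re.compile(r"could not get build (build-rendered-[\w-]+) for pool")


def find_build_mismatch(log):
    """Return (completed builds, builds looked up but never seen completing)."""
    completed = set(COMPLETE_RE.findall(log))
    looked_up = set(MISSING_RE.findall(log))
    return sorted(completed), sorted(looked_up - completed)


if __name__ == "__main__":
    completed, missing = find_build_mismatch(LOG)
    print("completed:", completed)
    print("looked up but not found:", missing)
```

Running this against a full machine-os-builder log would list every build name that the controller tried to fetch without a matching "is Complete" message, which is the pattern reported here.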
Version-Release number of selected component (if applicable):
4.14
How reproducible:
Always
Steps to Reproduce:
We reproduce it by executing the automated test case "[sig-mco] MCO Author:sregidor-NonHyperShiftHOST-NonPreRelease-Longduration-Medium-63477-Deploy files using all available ignition configs. Default 3.4.0[Disruptive] [Serial]", but it can probably also be reproduced manually:

1. Enable on-cluster build in the worker pool:

   $ oc label mcp/worker machineconfiguration.openshift.io/layering-enabled=

2. Create an MC to deploy a new test file:

   apiVersion: machineconfiguration.openshift.io/v1
   kind: MachineConfig
   metadata:
     labels:
       machineconfiguration.openshift.io/role: worker
     name: mc-tc-63477-2-2-0-svgix23t
   spec:
     config:
       ignition:
         version: 2.2.0
       passwd:
         users: []
       storage:
         files:
         - contents:
             source: data:text/plain;charset=utf-8;base64,Mi4yLjAgdGVzdCBmaWxl
           filesystem: root
           mode: 420
           path: /etc/2-2-0.test

3. After 3 to 5 seconds, create a new MC to deploy another file:

   apiVersion: machineconfiguration.openshift.io/v1
   kind: MachineConfig
   metadata:
     labels:
       machineconfiguration.openshift.io/role: worker
     name: mc-tc-63477-3-0-0-17oe4k8a
   spec:
     config:
       ignition:
         version: 3.0.0
       passwd:
         users: []
       storage:
         files:
         - contents:
             source: data:text/plain;charset=utf-8;base64,My4wLjAgdGVzdCBmaWxl
           mode: 420
           path: /etc/3-0-0.test

4. Continue creating new MCs until you see this error in the machine-os-builder pod:

   I0830 15:17:38.223047 1 image_build_controller.go:306] Dropping build "openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066" out of the queue: unable to update with build status: could not get final image pullspec for pool worker: could not get build build-rendered-worker-516e5f95aeb41fce8231b00878350eaa for pool worker: builds.build.openshift.io "build-rendered-worker-516e5f95aeb41fce8231b00878350eaa" not found
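For step 4, additional MachineConfigs with the same shape can be generated rather than written by hand. The following Python sketch is one way to do it. The helper names `data_url` and `machine_config` are hypothetical, and the rule that `filesystem: root` is emitted only for Ignition 2.x is an assumption drawn from the two manifests in this report. The output is JSON, which `oc create -f` accepts in place of YAML.

```python
import base64
import json


def data_url(text):
    """Encode text as the base64 data URL form used in the manifests above."""
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return "data:text/plain;charset=utf-8;base64," + encoded


def machine_config(name, ignition_version, path, text, role="worker"):
    """Build a MachineConfig dict that deploys one test file (hypothetical helper)."""
    file_entry = {
        "contents": {"source": data_url(text)},
        "mode": 420,
        "path": path,
    }
    # Assumption: only the 2.x manifest in this report sets "filesystem".
    if ignition_version.startswith("2."):
        file_entry["filesystem"] = "root"
    return {
        "apiVersion": "machineconfiguration.openshift.io/v1",
        "kind": "MachineConfig",
        "metadata": {
            "labels": {"machineconfiguration.openshift.io/role": role},
            "name": name,
        },
        "spec": {
            "config": {
                "ignition": {"version": ignition_version},
                "passwd": {"users": []},
                "storage": {"files": [file_entry]},
            }
        },
    }


if __name__ == "__main__":
    mc = machine_config(
        "mc-tc-63477-3-0-0-17oe4k8a", "3.0.0", "/etc/3-0-0.test", "3.0.0 test file"
    )
    print(json.dumps(mc, indent=2))
```

Saving the printed JSON to a file and passing it to `oc create -f` a few seconds after the previous MC should reproduce the timing described in the steps.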
Actual results:
The machine-os-builder pod reports an error and it gets stuck.
Expected results:
The machine-os-builder pod should not get stuck or report an error.
Additional info:
A link to the must gather file is provided in the first comment of this issue.
- is blocked by
-
MCO-1165 [Regression] BuildController should have a rebuild function
- Code Review