Bug
Resolution: Done
Normal
4.14.0
Quality / Stability / Reliability
False
Important
No
Rejected
MCO Sprint 250, MCO Sprint 251, MCO Sprint 252
3
Description of problem:
In MCPs that use on-cluster builds, if we create a MC and then create another MC before the resulting rendered configuration has been built (the build pod is still running), the machine-os-builder pod gets stuck reporting this error:
{noformat}
I0830 15:17:38.094296 1 image_build_controller.go:129] Started syncing pod openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066
I0830 15:17:38.113944 1 build_controller.go:370] Build (build-rendered-worker-c8522984c224de584cbca2a95d584066) is Complete
I0830 15:17:38.113974 1 build_controller.go:652] Build succeeded for MachineConfigPool worker, config rendered-worker-516e5f95aeb41fce8231b00878350eaa
I0830 15:17:38.216381 1 request.go:628] Waited for 102.315143ms due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/build.openshift.io/v1/namespaces/openshift-machine-config-operator/builds/build-rendered-worker-516e5f95aeb41fce8231b00878350eaa
I0830 15:17:38.223007 1 image_build_controller.go:127] Finished syncing pod openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066: 128.712599ms
E0830 15:17:38.223038 1 image_build_controller.go:305] unable to update with build status: could not get final image pullspec for pool worker: could not get build build-rendered-worker-516e5f95aeb41fce8231b00878350eaa for pool worker: builds.build.openshift.io "build-rendered-worker-516e5f95aeb41fce8231b00878350eaa" not found
I0830 15:17:38.223047 1 image_build_controller.go:306] Dropping build "openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066" out of the queue: unable to update with build status: could not get final image pullspec for pool worker: could not get build build-rendered-worker-516e5f95aeb41fce8231b00878350eaa for pool worker: builds.build.openshift.io "build-rendered-worker-516e5f95aeb41fce8231b00878350eaa" not found
{noformat}
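In the log above, the build that completes is named after rendered config c8522984..., while the build the controller then tries to fetch is named after rendered config 516e5f95.... The following is a minimal Python sketch, not part of machine-os-builder, that extracts both names from the error line to make the mismatch explicit. The function name build_name_mismatch and the regular expressions are illustrative assumptions based only on the log format shown here.

```python
import re

# Error line copied from the machine-os-builder log in this report.
LOG_LINE = (
    'I0830 15:17:38.223047 1 image_build_controller.go:306] Dropping build '
    '"openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066" '
    'out of the queue: unable to update with build status: could not get final image '
    'pullspec for pool worker: could not get build '
    'build-rendered-worker-516e5f95aeb41fce8231b00878350eaa for pool worker: '
    'builds.build.openshift.io "build-rendered-worker-516e5f95aeb41fce8231b00878350eaa" not found'
)

# Name of the build that was dropped from the queue (namespace prefix skipped).
DROPPED_RE = re.compile(r'Dropping build "[^/"]+/(build-rendered-[^"]+)"')
# Name of the build the controller tried, and failed, to fetch.
LOOKED_UP_RE = re.compile(r'could not get build (build-rendered-\S+)')


def build_name_mismatch(line):
    """Return (dropped, looked_up) if the two build names differ, else None."""
    dropped = DROPPED_RE.search(line)
    looked_up = LOOKED_UP_RE.search(line)
    if not dropped or not looked_up:
        return None  # line does not have the expected shape
    names = (dropped.group(1), looked_up.group(1))
    return names if names[0] != names[1] else None


if __name__ == "__main__":
    print(build_name_mismatch(LOG_LINE))
```

Running this over the machine-os-builder log is one way to confirm that a stuck pod is hitting this specific case, where the looked-up build belongs to a newer rendered config than the one that was actually built.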
Version-Release number of selected component (if applicable):
4.14
How reproducible:
Always
Steps to Reproduce:
We reproduce it by executing the automated test case
"[sig-mco] MCO Author:sregidor-NonHyperShiftHOST-NonPreRelease-Longduration-Medium-63477-Deploy files using all available ignition configs. Default 3.4.0[Disruptive] [Serial]"
It can probably also be reproduced manually:
1. Enable on-cluster build in worker pool
$ oc label mcp/worker machineconfiguration.openshift.io/layering-enabled=
2. Create a MC to deploy a new test file
{noformat}
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: mc-tc-63477-2-2-0-svgix23t
spec:
  config:
    ignition:
      version: 2.2.0
    passwd:
      users: []
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,Mi4yLjAgdGVzdCBmaWxl
        filesystem: root
        mode: 420
        path: /etc/2-2-0.test
{noformat}
3. After 3 to 5 seconds, create a new MC to deploy another file
{noformat}
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: mc-tc-63477-3-0-0-17oe4k8a
spec:
  config:
    ignition:
      version: 3.0.0
    passwd:
      users: []
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,My4wLjAgdGVzdCBmaWxl
        mode: 420
        path: /etc/3-0-0.test
{noformat}
4. Continue creating new MCs until you see this error in the machine-os-builder pod
I0830 15:17:38.223047 1 image_build_controller.go:306] Dropping build "openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066" out of the queue: unable to update with build status: could not get final image pullspec for pool worker: could not get build build-rendered-worker-516e5f95aeb41fce8231b00878350eaa for pool worker: builds.build.openshift.io "build-rendered-worker-516e5f95aeb41fce8231b00878350eaa" not found
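The source fields in the MachineConfigs from steps 2 and 3 are base64 data URLs. When creating additional test MCs for step 4, a small helper like the Python sketch below can generate or decode them. The helper names data_url and decode_data_url are hypothetical and not part of any MCO tooling.

```python
import base64

# Prefix used by the "source" fields in the MachineConfigs of this report.
PREFIX = "data:text/plain;charset=utf-8;base64,"


def data_url(text):
    """Encode plain text as a base64 data URL for an Ignition file source."""
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return PREFIX + encoded


def decode_data_url(url):
    """Decode a data URL produced by data_url back to plain text."""
    if not url.startswith(PREFIX):
        raise ValueError("unsupported data URL prefix")
    return base64.b64decode(url[len(PREFIX):]).decode("utf-8")


if __name__ == "__main__":
    # Reproduces the source value used in step 2.
    print(data_url("2.2.0 test file"))
```

Decoding the two values used above gives "2.2.0 test file" and "3.0.0 test file", so each MC deploys a distinct file and therefore forces a new rendered configuration.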
Actual results:
The machine-os-builder pod reports an error and it gets stuck.
Expected results:
The machine-os-builder pod should not get stuck on this error.
Additional info:
A link to the must gather file is provided in the first comment of this issue.
Is blocked by: MCO-1165 [Regression] BuildController should have a rebuild function