-
Bug
-
Resolution: Done-Errata
-
Normal
-
4.14.0
-
Quality / Stability / Reliability
-
False
-
-
3
-
Moderate
-
No
-
None
-
MCO Sprint 251, MCO Sprint 252, MCO Sprint 255, MCO Sprint 256, MCO Sprint 257
-
5
Description of problem:
When we activate the on-cluster-build (OCB) functionality in a pool that contains yum-based RHEL nodes, the pool becomes degraded and reports this error:
- lastTransitionTime: "2023-09-20T15:14:44Z"
  message: 'Node ip-10-0-57-169.us-east-2.compute.internal is reporting: "error
    running rpm-ostree --version: exec: \"rpm-ostree\": executable file not found
    in $PATH"'
  reason: 1 nodes are reporting degraded status on sync
  status: "True"
  type: NodeDegraded
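The condition above can be read straight from the pool status. The following is a minimal sketch, assuming a logged-in `oc` session; the helper name `node_degraded_message` is hypothetical and not part of any tool.

```shell
# Hypothetical helper: print the message of the NodeDegraded condition
# for a MachineConfigPool (defaults to the "worker" pool).
node_degraded_message() {
  local pool="${1:-worker}"
  # The JSONPath filter selects only the condition whose type is NodeDegraded.
  oc get mcp "$pool" \
    -o jsonpath='{.status.conditions[?(@.type=="NodeDegraded")].message}'
}
```

Calling `node_degraded_message worker` on the affected cluster should print the "executable file not found in $PATH" message quoted above.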
Version-Release number of selected component (if applicable):
4.14.0-0.nightly-2023-09-15-233408
How reproducible:
Always
Steps to Reproduce:
1. Create a cluster and add a yum-based RHEL node to the worker pool (we used RHEL8).
2. Create the resources needed to enable the OCB functionality: the pull and push secrets and the on-cluster-build-config ConfigMap.
For example, to use the internal registry:
cat << EOF | oc create -f -
apiVersion: v1
data:
  baseImagePullSecretName: $(oc get secret -n openshift-config pull-secret -o json | jq "del(.metadata.namespace, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.uid, .metadata.name)" | jq '.metadata.name="pull-copy"' | oc -n openshift-machine-config-operator create -f - &> /dev/null; echo -n "pull-copy")
  finalImagePushSecretName: $(oc get -n openshift-machine-config-operator sa builder -ojsonpath='{.secrets[0].name}')
  finalImagePullspec: "image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image"
  imageBuilderType: ""
kind: ConfigMap
metadata:
  name: on-cluster-build-config
  namespace: openshift-machine-config-operator
EOF
The exact configuration doesn't matter as long as the OCB functionality works.
3. Label the worker pool so that the OCB functionality is enabled
$ oc label mcp/worker machineconfiguration.openshift.io/layering-enabled=
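To confirm that the label from step 3 was applied, the pools carrying it can be listed with a key-only label selector, which matches any resource where the label exists regardless of its value. This is a small sketch assuming a logged-in `oc` session; the function name `layered_pools` is hypothetical.

```shell
# Hypothetical helper: list every MachineConfigPool that carries the
# layering-enabled label, whatever its value (here it is empty).
layered_pools() {
  oc get mcp -l machineconfiguration.openshift.io/layering-enabled -o name
}
```

After step 3, the worker pool should appear in the output of `layered_pools`.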
Actual results:
The RHEL node shows this log:
I0920 15:14:42.852742 1979 daemon.go:760] Preflight config drift check successful (took 17.527225ms)
I0920 15:14:42.852763 1979 daemon.go:2150] Performing layered OS update
I0920 15:14:42.868723 1979 update.go:1970] Starting transition to "image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/tc-67566@sha256:24ea4b12acf93095732ba457fc3e8c7f1287b669f2aceec65a33a41f7e8ceb01"
I0920 15:14:42.871625 1979 update.go:1970] drain is already completed on this node
I0920 15:14:42.874305 1979 rpm-ostree.go:307] Running captured: rpm-ostree --version
E0920 15:14:42.874388 1979 writer.go:226] Marking Degraded due to: error running rpm-ostree --version: exec: "rpm-ostree": executable file not found in $PATH
I0920 15:15:37.570503 1979 daemon.go:670] Transitioned from state: Working -> Degraded
I0920 15:15:37.570529 1979 daemon.go:673] Transitioned from degraded/unreconcilable reason -> error running rpm-ostree --version: exec: "rpm-ostree": executable file not found in $PATH
I0920 15:15:37.574942 1979 daemon.go:2300] Not booted into a CoreOS variant, ignoring target OSImageURL quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e3128a8e42fb70ab6fc276f7005e3c0839795e4455823c8ff3eca9b1050798b9
I0920 15:15:37.591529 1979 daemon.go:760] Preflight config drift check successful (took 16.588912ms)
I0920 15:15:37.591549 1979 daemon.go:2150] Performing layered OS update
I0920 15:15:37.591562 1979 update.go:1970] Starting transition to "image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/tc-67566@sha256:24ea4b12acf93095732ba457fc3e8c7f1287b669f2aceec65a33a41f7e8ceb01"
I0920 15:15:37.594534 1979 update.go:1970] drain is already completed on this node
I0920 15:15:37.597261 1979 rpm-ostree.go:307] Running captured: rpm-ostree --version
E0920 15:15:37.597315 1979 writer.go:226] Marking Degraded due to: error running rpm-ostree --version: exec: "rpm-ostree": executable file not found in $PATH
I0920 15:16:37.613270 1979 daemon.go:2300] Not booted into a CoreOS variant, ignoring target OSImageURL quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e3128a8e42fb70ab6fc276f7005e3c0839795e4455823c8ff3eca9b1050798b9
And the worker pool is degraded with this error:
- lastTransitionTime: "2023-09-20T15:14:44Z"
  message: 'Node ip-10-0-57-169.us-east-2.compute.internal is reporting: "error
    running rpm-ostree --version: exec: \"rpm-ostree\": executable file not found
    in $PATH"'
  reason: 1 nodes are reporting degraded status on sync
  status: "True"
  type: NodeDegraded
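Because the daemon retries and logs the same failure repeatedly, it can help to reduce the log to its distinct degradation reasons. The filter below is a minimal sketch that relies only on the "Marking Degraded due to:" prefix visible in the log above; the function name `degraded_reasons` is hypothetical.

```shell
# Hypothetical helper: read machine-config-daemon log lines on stdin and
# print each distinct reason that follows "Marking Degraded due to: ".
degraded_reasons() {
  # sed prints only matching lines, with everything up to the prefix removed;
  # sort -u collapses the repeats produced by each retry.
  sed -n 's/.*Marking Degraded due to: //p' | sort -u
}
```

One possible use, where the pod name is a placeholder for the daemon pod running on the RHEL node: `oc logs -n openshift-machine-config-operator <machine-config-daemon-pod> -c machine-config-daemon | degraded_reasons`.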
Expected results:
The pool should not be degraded.
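The expected result can be turned into a scriptable check. This is a sketch under the assumption that a healthy pool reports its NodeDegraded condition with status "False"; the function name `pool_not_degraded` is hypothetical.

```shell
# Hypothetical helper: succeed (exit 0) only when the pool's NodeDegraded
# condition has status "False". Defaults to the "worker" pool.
pool_not_degraded() {
  local pool="${1:-worker}" status
  status="$(oc get mcp "$pool" \
    -o jsonpath='{.status.conditions[?(@.type=="NodeDegraded")].status}')"
  [ "$status" = "False" ]
}
```

For example, `pool_not_degraded worker && echo "worker pool is healthy"` prints the message only when the condition status is "False".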
Additional info:
Links to: RHEA-2024:3718 (OpenShift Container Platform 4.17.z bug fix update)