Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: 4.19
Component/s: Installer / Assisted installer
Labels:

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
None
Regression:
None
Architecture:

x86_64

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

We have a SNO hub cluster deployed with the Assisted Installer. On this cluster, we installed the MCE operator and enabled CAPI/M3. As part of this, we were told that the image used to deploy spoke clusters must include cloud-init so that it supports a config drive, so we used the RHCOS OpenStack image as the base image. The deployment worked and we successfully deployed a 4.19.10 spoke cluster. After deployment, we attempted to upgrade from 4.19.10 to 4.19.13. The upgrade failed, with the machine-config operator reporting this error:

Unable to apply 4.19.13: error during syncRequiredMachineConfigPools: [context deadline exceeded, error MachineConfigPool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 1, reason: Node node1.example.com is reporting: "unexpected on-disk state validating against rendered-master-7342b6e6f4da65354b75fb4695d9c0e0: expected target osImageURL \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cdc2f0b00851e31b0f34ea7601b8550ad143c1f21aab6fd70b6082cf5f4076fe\", have \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:deb731c3ffed587df534e53a83420ac00540677259d79d84d66c2b1c4422041b\"; possible root cause: error: Installing kernel: regfile copy: No space left on device")]
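When triaging several nodes, it can help to pull the expected and actual osImageURL and the root-cause text out of this message programmatically. The following is a minimal sketch; the function name `parse_mco_error` and the regular expressions are illustrative assumptions based only on the message layout quoted above, not an MCO API, and other versions may format the message differently.

```python
import re

# Assumption: the message looks like the one quoted in this report, including
# the optionally backslash-escaped quotes around each image reference.
OS_IMAGE_RE = re.compile(
    r'expected target osImageURL \\?"(?P<expected>[^"\\]+)\\?", '
    r'have \\?"(?P<actual>[^"\\]+)\\?"'
)
ROOT_CAUSE_RE = re.compile(r'possible root cause: (?P<cause>[^"]+)')


def parse_mco_error(message: str) -> dict:
    """Extract image references and root cause from an MCO degraded message.

    Returns a dict containing only the keys that were found in the message.
    """
    result = {}
    image_match = OS_IMAGE_RE.search(message)
    if image_match:
        result["expected"] = image_match.group("expected")
        result["actual"] = image_match.group("actual")
    cause_match = ROOT_CAUSE_RE.search(message)
    if cause_match:
        cause = cause_match.group("cause").strip()
        result["root_cause"] = cause
        # Flag the specific condition described in this report.
        result["no_space"] = "No space left on device" in cause
    return result
```

A caller could feed it the `message` field of the degraded MachineConfigPool condition and check `no_space` to decide whether /boot needs to be inspected on that node.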

Inspecting the node showed that /boot/ostree contained three directories: two install-* directories and one rhcos-* directory.
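To see how much of /boot each of those directories occupies, a small read-only helper along these lines could be run on the node. This is a sketch: the function names are hypothetical, the default path simply mirrors the one named above, and it only reads file sizes without modifying anything.

```python
import os
from pathlib import Path


def dir_size_bytes(path: Path) -> int:
    """Sum the sizes of regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = Path(root) / name
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def summarize_boot_ostree(base="/boot/ostree"):
    """Return (directory name, size in bytes) for each directory under base."""
    rows = []
    for entry in sorted(Path(base).iterdir()):
        # Only real directories are reported; loose files and symlinks are ignored.
        if entry.is_dir() and not entry.is_symlink():
            rows.append((entry.name, dir_size_bytes(entry)))
    return rows
```

Comparing the per-directory totals with the free space reported for /boot (for example via `shutil.disk_usage("/boot")`) shows whether a third set of kernel and initramfs files can fit.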

As a workaround, we remounted /boot read-write, deleted /boot/ostree/rhcos-*, and touched /run/machine-config-daemon-force. The node then rebooted and the upgrade proceeded. We had to repeat this on every bare-metal node in the spoke cluster.
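Because the workaround has to be repeated per node, the sequence can be written down as a dry-run helper that only builds the command strings for review. This is a sketch of what we did by hand, not a recommended or supported fix: the function name `workaround_commands` is hypothetical, the `mount -o remount,rw` form is our assumption of how the remount was done, and nothing is executed or deleted by this code.

```python
import glob
import os
import shlex


def workaround_commands(boot="/boot", pattern="ostree/rhcos-*"):
    """Build (but do not run) the manual workaround commands for one node.

    The paths mirror the ones in this report; review the output before
    running anything, since removing the wrong directory can break booting.
    """
    commands = [["mount", "-o", "remount,rw", boot]]
    # Expand the glob here so the exact directories are visible for review.
    for path in sorted(glob.glob(os.path.join(boot, pattern))):
        commands.append(["rm", "-rf", "--", path])
    # Marker file touched in this report before the node rebooted.
    commands.append(["touch", "/run/machine-config-daemon-force"])
    return [shlex.join(cmd) for cmd in commands]
```

Printing the returned list on each node gives the exact commands to review before running them manually in a root shell.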

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

    1. Deploy Spoke cluster using our documentation:
https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html/machine_management/managing-machines-with-the-cluster-api
    2. Upgrade Spoke Cluster
    3.

Actual results:

Expected results:

Additional info:

links to

https://docs.google.com/document/d/1nxLv77YOpfoP_kILJ2rjoVWQzNepIiELrbc1QzDhfZ0/edit?tab=t.0#heading=h.xjvs1bwazipz

Assignee: Crystal Chun

Reporter: Darin Sorrentino

QA Contact: Jad Haj Yahya

Need Info From: None

Votes: 1

Watchers: 15

Created: 2025/10/17 10:16 PM

Updated: 2026/02/27 7:30 PM
