-
Bug
-
Resolution: Duplicate
-
Major
-
None
-
4.12.z
-
No
-
Rejected
-
False
-
Description of problem:
The infra MCP is degraded because one of the infra nodes is unable to upgrade, due to the following issue:
2023-07-20T05:06:55.045058094Z I0720 05:06:55.045011 2786 update.go:2118] Disk currentConfig rendered-infra-c6d6928bfcd10ab1b440f6a2505bd5d1 overrides node's currentConfig annotation rendered-infra-76583762333a6685c3d4d1b75e14c28b
2023-07-20T05:06:55.048306566Z I0720 05:06:55.048269 2786 daemon.go:1564] Validating against pending config rendered-infra-c6d6928bfcd10ab1b440f6a2505bd5d1
2023-07-20T05:06:57.733681234Z E0720 05:06:57.733641 2786 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-infra-c6d6928bfcd10ab1b440f6a2505bd5d1: expected target osImageURL "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:5ef4276442c5174d31f6b62a83aa40e64c719275dd731e5ccb0dc98911f7e57e", have "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:fb065c8d91453ce4a3f5518189b34bce94406c01f43957abde01f08165b3a085" ("1ad911e70b7befaad4f3eac5ee14510bbaaecbedb9fb464ffbe3cb38e133576f")
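To make the mismatch in the Degraded message easier to see, the two image digests can be pulled out of it with sed. This is only a sketch: the variable names are illustrative, and the message text is pasted in from the log above rather than read from a live node.

```shell
# Degraded message copied from the machine-config-daemon log above.
msg=$(cat <<'EOF'
unexpected on-disk state validating against rendered-infra-c6d6928bfcd10ab1b440f6a2505bd5d1: expected target osImageURL "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:5ef4276442c5174d31f6b62a83aa40e64c719275dd731e5ccb0dc98911f7e57e", have "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:fb065c8d91453ce4a3f5518189b34bce94406c01f43957abde01f08165b3a085"
EOF
)

# Digest the pending rendered config expects the node to be running.
expected=$(printf '%s\n' "$msg" | sed -n 's/.*expected target osImageURL "[^"]*@\(sha256:[0-9a-f]*\)".*/\1/p')
# Digest the node is actually running.
have=$(printf '%s\n' "$msg" | sed -n 's/.*have "[^"]*@\(sha256:[0-9a-f]*\)".*/\1/p')

echo "expected: $expected"
echo "on disk:  $have"
if [ "$expected" != "$have" ]; then
  echo "node is still on the old OS image"
fi
```

The "have" digest is the same one that rpm-ostree status reports for the booted deployment, which suggests the new deployment was never finalized.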
Below are the ostree-finalize-staged.service logs. They show a timeout after 20 minutes of copying:
journalctl_--no-pager_--unit_ostree-finalize-staged
Jul 19 15:22:23 SOINR01CAL0101.raiffeisen.org ostree[372060]: Copying /etc changes: 19 modified, 0 removed, 212 added
Jul 19 15:42:21 SOINR01CAL0101.raiffeisen.org systemd[1]: ostree-finalize-staged.service: Stopping timed out. Terminating.
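As a quick sanity check, the gap between the two journal entries can be computed as follows. This sketch assumes GNU date (for the -d option) and assumes the year is 2023, since the journal lines do not include a year.

```shell
# Timestamps taken from the two journal lines above; the year is an assumption.
start=$(date -u -d '2023-07-19 15:22:23' +%s)
stop=$(date -u -d '2023-07-19 15:42:21' +%s)

# Seconds between "Copying /etc changes" and "Stopping timed out".
elapsed=$((stop - start))
printf '%dm%02ds\n' $((elapsed / 60)) $((elapsed % 60))
```

The result is just under 20 minutes, which is consistent with the service being terminated when the 20 minute stop timeout expired while the /etc copy was still running.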
The ostree-finalize-staged.service timeout is already set to 20 minutes on the RHCOS node:
$ cat etc/systemd/system/ostree-finalize-staged.service.d/override.conf
[Service]
TimeoutStopSec=20m
$ cat rpm-ostree_status_-v
State: idle
Warning: failed to finalize previous deployment
         check `journalctl -b -1 -u ostree-finalize-staged.service`
AutomaticUpdates: disabled
Deployments:
● ostree-unverified-registry:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:fb065c8d91453ce4a3f5518189b34bce94406c01f43957abde01f08165b3a085 (index: 0)
    Digest: sha256:fb065c8d91453ce4a3f5518189b34bce94406c01f43957abde01f08165b3a085
    Version: 412.86.202306271602-0 (2023-07-14T15:33:47Z)
    Commit: 1ad911e70b7befaad4f3eac5ee14510bbaaecbedb9fb464ffbe3cb38e133576f
    Staged: no
    StateRoot: rhcos
Additional info:
Every time a minor upgrade is triggered (for example, from 4.12.20 to 4.12.21, 4.12.21 to 4.12.22, or 4.12.23 to 4.12.24), only the infra nodes go into the degraded state. A simple MCP update, such as a machine config change for NTP, does not bring the node to a degraded state.