-
Bug
-
Resolution: Unresolved
-
Normal
-
4.17
-
Quality / Stability / Reliability
-
False
-
3
-
Important
-
In Progress
-
Release Note Not Required
Description
While updating the node, specifically while updating the systemd units, the machine-config-daemon receives a SIGTERM, removes its SIGTERM protection, and terminates.
After the node has rebooted, the systemd units are still disabled, which leaves the node in an inconsistent state.
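To confirm the post-reboot state on an affected node, a check along these lines may help. It is a minimal sketch, not part of the machine-config-daemon: the helper names `query_is_enabled` and `find_not_enabled` are hypothetical, the unit names are taken from the log in this report, and it assumes those units are expected to be enabled by the rendered MachineConfig. It relies only on `systemctl is-enabled`, run on the node itself.

```python
import subprocess
from typing import Callable, Dict, Iterable

# Unit names as they appear in the machine-config-daemon log of this report.
UNITS = ["restart-host.timer", "tpm-lockout.service", "enable-usbguard.service"]


def query_is_enabled(unit: str) -> str:
    """Return the state printed by `systemctl is-enabled <unit>`."""
    try:
        result = subprocess.run(
            ["systemctl", "is-enabled", unit],
            capture_output=True,
            text=True,
            check=False,  # is-enabled exits non-zero for units that are not enabled
        )
    except OSError:
        # systemctl is not available where this script runs
        return "unknown"
    return result.stdout.strip() or result.stderr.strip()


def find_not_enabled(
    units: Iterable[str],
    query: Callable[[str], str] = query_is_enabled,
) -> Dict[str, str]:
    """Map every unit whose reported state is not 'enabled' to that state."""
    states = {unit: query(unit) for unit in units}
    return {unit: state for unit, state in states.items() if state != "enabled"}
```

Calling `find_not_enabled(UNITS)` on the node returns a dictionary of the units that are not reported as `enabled`, together with the state that `systemctl` printed for each.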
Environment
RHOCP 4.17.33, Single node cluster + 3 workers
The customer is using a custom OS image (osImageURL).
Logs
Based on the information shared by the customer so far, the issue occurs randomly on different clusters while the machine-config-daemon is updating the node.
machine-config-daemon log
2025-06-18T20:53:30.623044384+00:00 stderr F I0618 20:53:30.622990 160840 file_writers.go:294] Writing systemd unit "restart-host.timer"
2025-06-18T20:53:30.745661116+00:00 stderr F I0618 20:53:30.745619 160840 file_writers.go:307] Disabling systemd unit restart-host.timer before re-writing it
2025-06-18T20:53:34.230978037+00:00 stderr F I0618 20:53:34.230670 160840 file_writers.go:294] Writing systemd unit "tpm-lockout.service"
2025-06-18T20:53:34.298968775+00:00 stderr F I0618 20:53:34.298886 160840 file_writers.go:307] Disabling systemd unit tpm-lockout.service before re-writing it
2025-06-18T20:53:37.538554080+00:00 stderr F I0618 20:53:37.538502 160840 file_writers.go:294] Writing systemd unit "enable-usbguard.service"
2025-06-18T20:53:37.776100348+00:00 stderr F I0618 20:53:37.776046 160840 file_writers.go:307] Disabling systemd unit enable-usbguard.service before re-writing it
2025-06-18T20:53:38.370326227+00:00 stderr F I0618 20:53:38.370262 160840 daemon.go:1323] Got SIGTERM, but actively updating
2025-06-18T20:53:38.414833961+00:00 stderr F I0618 20:53:38.414769 160840 update.go:2689] Removing SIGTERM protection
2025-06-18T20:53:38.414833961+00:00 stderr F E0618 20:53:38.414821 160840 writer.go:226] Marking Degraded due to: "daemon could not write systemd unit: disabling enable-usbguard.service failed: signal: terminated (output: )"
2025-06-18T20:53:39.640953621+00:00 stderr F W0618 20:53:39.640907 160840 daemon.go:1398] Got an error from auxiliary tools: kubelet health check has failed 1 times: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused
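Because the issue occurs randomly across clusters, it may be useful to scan collected machine-config-daemon logs for the same pattern. The following is a rough triage sketch, not an official tool: the function name `summarize_mcd_log` is hypothetical, and the regular expressions are based only on the message wording seen in the excerpt above, which may differ in other versions.

```python
import re
from typing import Dict, List, Optional, Union

# Patterns derived from the message text in the log excerpt of this report.
DISABLE_RE = re.compile(r"Disabling systemd unit (\S+) before re-writing it")
SIGTERM_RE = re.compile(r"Got SIGTERM, but actively updating")
INTERRUPTED_RE = re.compile(r"disabling (\S+) failed: signal: terminated")

Summary = Dict[str, Union[List[str], bool, Optional[str]]]


def summarize_mcd_log(log_text: str) -> Summary:
    """Extract the units disabled for rewrite and any SIGTERM interruption."""
    # Searching the whole text works whether entries are one per line or run together.
    disabled_units = DISABLE_RE.findall(log_text)
    interrupted = INTERRUPTED_RE.search(log_text)
    return {
        # Units the daemon disabled before rewriting them, in log order
        "disabled_units": disabled_units,
        # True if a SIGTERM arrived while an update was in progress
        "sigterm_during_update": SIGTERM_RE.search(log_text) is not None,
        # Unit whose disable command was killed by the signal, if any
        "interrupted_unit": interrupted.group(1) if interrupted else None,
    }
```

A log in which `sigterm_during_update` is true and `interrupted_unit` is set matches the failure described in this report. The units listed in `disabled_units` are the candidates to check for still being disabled after the reboot.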