[kni@cert-rhosp-02 ~]$ oc get csv
NAME                                DISPLAY                          VERSION   REPLACES                           PHASE
node-healthcheck-operator.v0.10.0   Node Health Check Operator       0.10.0    node-healthcheck-operator.v0.9.1   Succeeded
self-node-remediation.v0.10.0       Self Node Remediation Operator   0.10.0    self-node-remediation.v0.9.0       Succeeded
[kni@cert-rhosp-02 ~]$ oc get node/worker-0-2
NAME         STATUS   ROLES    AGE   VERSION
worker-0-2   Ready    worker   34h   v1.32.7
[kni@cert-rhosp-02 ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.19.0-0.nightly-2025-09-02-192040   True        False         34h     Cluster version is 4.19.0-0.nightly-2025-09-02-192040
[kni@cert-rhosp-02 ~]$ oc debug node/worker-0-2 -- chroot /host bash -c "uptime -s"
Temporary namespace openshift-debug-s6wlb is created for debugging node...
Starting pod/worker-0-2-debug-gg4th ...
To use host binaries, run `chroot /host`
2025-09-04 19:26:32
Removing debug pod ...
Temporary namespace openshift-debug-s6wlb was removed.
[kni@cert-rhosp-02 ~]$ oc debug node/worker-0-2 -- chroot /host bash -c "systemctl stop kubelet"
Temporary namespace openshift-debug-wqxqg is created for debugging node...
Starting pod/worker-0-2-debug-cmlrs ...
To use host binaries, run `chroot /host`
Removing debug pod ...
Temporary namespace openshift-debug-wqxqg was removed.
error: unable to create the debug pod "worker-0-2-debug-cmlrs"

apiVersion: machine.openshift.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: mhc-snr-worker
  namespace: openshift-machine-api
spec:
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-machine-role: "worker"
      machine.openshift.io/cluster-api-machine-type: "worker"
  remediationTemplate:
    kind: SelfNodeRemediationTemplate
    apiVersion: self-node-remediation.medik8s.io/v1alpha1
    name: selfnoderemediationtemplate-sample
    namespace: openshift-machine-api
  unhealthyConditions:
  - type: Ready
    status: "False"
    timeout: 90s
  - type: Ready
    status: "Unknown"
    timeout: 90s
  - type: MemoryPressure
    status: "True"
    timeout: 90s
  - type: DiskPressure
    status: "True"
    timeout: 90s
  maxUnhealthy: 100%
  nodeStartupTimeout: 10m

apiVersion: self-node-remediation.medik8s.io/v1alpha1
kind: SelfNodeRemediationTemplate
metadata:
  namespace: openshift-machine-api
  name: selfnoderemediationtemplate-sample
spec:
  template:
    spec:
      remediationStrategy: Automatic

apiVersion: self-node-remediation.medik8s.io/v1alpha1
kind: SelfNodeRemediation
metadata:
  annotations:
    machine.openshift.io/cloned-from-groupkind: SelfNodeRemediationTemplate.self-node-remediation.medik8s.io
    machine.openshift.io/cloned-from-name: selfnoderemediationtemplate-sample
  resourceVersion: '700097'
  name: ocp-edge-cluster-0-qgzbr-worker-0-8vng4
  uid: bb3ff43b-b1e0-4d18-90ae-aa2c47104b7c
  creationTimestamp: '2025-09-04T19:59:12Z'
  generation: 1
  managedFields:
    - apiVersion: self-node-remediation.medik8s.io/v1alpha1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:finalizers':
            .: {}
            'v:"self-node-remediation.medik8s.io/snr-finalizer"': {}
      manager: Go-http-client
      operation: Update
      time: '2025-09-04T19:59:12Z'
    - apiVersion: self-node-remediation.medik8s.io/v1alpha1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            .: {}
            'f:machine.openshift.io/cloned-from-groupkind': {}
            'f:machine.openshift.io/cloned-from-name': {}
          'f:ownerReferences':
            .: {}
            'k:{"uid":"2e7962b4-d2b1-4b24-a372-b08e173bf31e"}': {}
        'f:spec':
          .: {}
          'f:remediationStrategy': {}
      manager: machine-healthcheck
      operation: Update
      time: '2025-09-04T19:59:12Z'
    - apiVersion: self-node-remediation.medik8s.io/v1alpha1
      fieldsType: FieldsV1
      fieldsV1:
        'f:status':
          .: {}
          'f:conditions':
            .: {}
            'k:{"type":"Processing"}':
              .: {}
              'f:lastTransitionTime': {}
              'f:message': {}
              'f:reason': {}
              'f:status': {}
              'f:type': {}
            'k:{"type":"Succeeded"}':
              .: {}
              'f:lastTransitionTime': {}
              'f:message': {}
              'f:reason': {}
              'f:status': {}
              'f:type': {}
          'f:phase': {}
          'f:timeAssumedRebooted': {}
      manager: Go-http-client
      operation: Update
      subresource: status
      time: '2025-09-04T19:59:13Z'
  namespace: openshift-machine-api
  ownerReferences:
    - apiVersion: machine.openshift.io/v1beta1
      kind: Machine
      name: ocp-edge-cluster-0-qgzbr-worker-0-8vng4
      uid: 2e7962b4-d2b1-4b24-a372-b08e173bf31e
  finalizers:
    - self-node-remediation.medik8s.io/snr-finalizer
spec:
  remediationStrategy: Automatic
status:
  conditions:
    - lastTransitionTime: '2025-09-04T19:59:12Z'
      message: ''
      reason: RemediationStarted
      status: 'True'
      type: Processing
    - lastTransitionTime: '2025-09-04T19:59:12Z'
      message: ''
      reason: RemediationStarted
      status: Unknown
      type: Succeeded
  phase: Pre-Reboot-Completed
  timeAssumedRebooted: '2025-09-04T20:01:13Z'

2025-09-04T19:27:58.189651438Z DEBUG events [remediation] Remediation finished {"type": "Normal", "object": {"kind":"SelfNodeRemediation","namespace":"openshift-machine-api","name":"worker-0-2-l2kw8","uid":"e7abf912-ecbf-4556-ba5b-b794b122d010","apiVersion":"self-node-remediation.medik8s.io/v1alpha1","resourceVersion":"690110"}, "reason": "RemediationFinished"}
2025-09-04T19:27:58.194962461Z INFO controllers.SelfNodeRemediation SNR already deleted {"pod": "manager", "selfnoderemediation": {"name":"worker-0-2-l2kw8","namespace":"openshift-machine-api"}}
2025-09-04T19:27:59.195902931Z INFO controllers.SelfNodeRemediation SNR already deleted {"pod": "manager", "selfnoderemediation": {"name":"worker-0-2-l2kw8","namespace":"openshift-machine-api"}}
2025-09-04T19:59:12.022038705Z INFO selfnoderemediation-resource validate create {"name": "ocp-edge-cluster-0-qgzbr-worker-0-8vng4"}
2025-09-04T19:59:12.127210443Z INFO controllers.SelfNodeRemediation Remediating with OutOfServiceTaint Remediation strategy (auto-selected) {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T19:59:12.127236204Z INFO controllers.SelfNodeRemediation pre-reboot not completed yet, prepare for rebooting {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T19:59:12.127241739Z DEBUG events [remediation] Remediation started by SNR manager {"type": "Normal", "object": {"kind":"SelfNodeRemediation","namespace":"openshift-machine-api","name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","uid":"bb3ff43b-b1e0-4d18-90ae-aa2c47104b7c","apiVersion":"self-node-remediation.medik8s.io/v1alpha1","resourceVersion":"700053"}, "reason": "RemediationStarted"}
2025-09-04T19:59:12.132858955Z INFO selfnoderemediation-resource validate update {"name": "ocp-edge-cluster-0-qgzbr-worker-0-8vng4"}
2025-09-04T19:59:12.135258718Z INFO controllers.SelfNodeRemediation finalizer added {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T19:59:12.135363713Z DEBUG events [remediation] Remediation process - successful adding finalizer {"type": "Normal", "object": {"kind":"SelfNodeRemediation","namespace":"openshift-machine-api","name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","uid":"bb3ff43b-b1e0-4d18-90ae-aa2c47104b7c","apiVersion":"self-node-remediation.medik8s.io/v1alpha1","resourceVersion":"700055"}, "reason": "AddFinalizer"}
2025-09-04T19:59:12.139996263Z INFO controllers.SelfNodeRemediation Remediating with OutOfServiceTaint Remediation strategy (auto-selected) {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T19:59:12.140008934Z INFO controllers.SelfNodeRemediation pre-reboot not completed yet, prepare for rebooting {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T19:59:12.148178578Z INFO controllers.SelfNodeRemediation NoExecute taint added {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}, "new taints": [{"key":"node.kubernetes.io/unreachable","effect":"NoSchedule","timeAdded":"2025-09-04T19:57:41Z"},{"key":"node.kubernetes.io/unreachable","effect":"NoExecute","timeAdded":"2025-09-04T19:57:47Z"},{"key":"medik8s.io/remediation","value":"self-node-remediation","effect":"NoExecute","timeAdded":"2025-09-04T19:59:12Z"}]}
2025-09-04T19:59:12.148223887Z INFO controllers.SelfNodeRemediation Marking node as unschedulable {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}, "node name": "worker-0-2"}
2025-09-04T19:59:12.148415659Z DEBUG events [remediation] Remediation process - NoExecute taint added to the unhealthy node {"type": "Normal", "object": {"kind":"Node","name":"worker-0-2","uid":"28702782-f67d-4bc9-9792-a97600872477","apiVersion":"v1","resourceVersion":"700058"}, "reason": "AddNoExecute"}
2025-09-04T19:59:12.16082349Z DEBUG events [remediation] Remediation process - unhealthy node marked as unschedulable {"type": "Normal", "object": {"kind":"Node","name":"worker-0-2","uid":"28702782-f67d-4bc9-9792-a97600872477","apiVersion":"v1","resourceVersion":"700060"}, "reason": "MarkUnschedulable"}
2025-09-04T19:59:12.169157864Z INFO controllers.SelfNodeRemediation Remediating with OutOfServiceTaint Remediation strategy (auto-selected) {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T19:59:12.169180036Z INFO controllers.SelfNodeRemediation pre-reboot not completed yet, prepare for rebooting {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T19:59:12.169406208Z INFO controllers.SelfNodeRemediation waiting for unschedulable taint to appear {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}, "node name": "worker-0-2"}
2025-09-04T19:59:12.181614317Z INFO controllers.SelfNodeRemediation Remediating with OutOfServiceTaint Remediation strategy (auto-selected) {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T19:59:12.181646924Z INFO controllers.SelfNodeRemediation pre-reboot not completed yet, prepare for rebooting {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T19:59:12.182083201Z INFO controllers.SelfNodeRemediation waiting for unschedulable taint to appear {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}, "node name": "worker-0-2"}
2025-09-04T19:59:13.169529602Z INFO controllers.SelfNodeRemediation Remediating with OutOfServiceTaint Remediation strategy (auto-selected) {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T19:59:13.169547444Z INFO controllers.SelfNodeRemediation pre-reboot not completed yet, prepare for rebooting {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T19:59:13.169815959Z INFO rebootDurationCalculator No SafeTimeToAssumeNodeRebootedSeconds specified, using calculated minimum safe reboot time {"calculated minimum time in seconds": 120}
2025-09-04T19:59:13.169826964Z INFO controllers.SelfNodeRemediation setting SNR's time to assume node has been rebooted {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}, "node name": "worker-0-2", "time": "2025-09-04 20:01:13.169826462 +0000 UTC m=+7157.450359348"}
2025-09-04T19:59:13.169941211Z DEBUG events [remediation] Remediation process - about to update required fencing time on snr {"type": "Normal", "object": {"kind":"SelfNodeRemediation","namespace":"openshift-machine-api","name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","uid":"bb3ff43b-b1e0-4d18-90ae-aa2c47104b7c","apiVersion":"self-node-remediation.medik8s.io/v1alpha1","resourceVersion":"700065"}, "reason": "UpdateTimeAssumedRebooted"}
2025-09-04T19:59:13.174418927Z INFO controllers.SelfNodeRemediation Remediating with OutOfServiceTaint Remediation strategy (auto-selected) {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T19:59:13.174448705Z INFO controllers.SelfNodeRemediation Node didn't reboot yet, waiting for it to reboot {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}, "node name": "worker-0-2", "time left": "2m0.825552207s"}
2025-09-04T20:00:59.692678378Z INFO controllers.SelfNodeRemediation Remediating with OutOfServiceTaint Remediation strategy (auto-selected) {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T20:00:59.692700774Z INFO controllers.SelfNodeRemediation Node didn't reboot yet, waiting for it to reboot {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}, "node name": "worker-0-2", "time left": "14.307300248s"}
2025-09-04T20:01:14.004664801Z INFO controllers.SelfNodeRemediation Remediating with OutOfServiceTaint Remediation strategy (auto-selected) {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T20:01:14.004791454Z INFO controllers.SelfNodeRemediation TimeAssumedRebooted is old. The unhealthy node assumed to been rebooted {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}, "node name": "worker-0-2"}
2025-09-04T20:01:14.012486191Z INFO controllers.SelfNodeRemediation Remediating with OutOfServiceTaint Remediation strategy (auto-selected) {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T20:01:14.022759683Z INFO controllers.SelfNodeRemediation out-of-service taint added {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}, "new taints": [{"key":"medik8s.io/remediation","value":"self-node-remediation","effect":"NoExecute","timeAdded":"2025-09-04T19:59:12Z"},{"key":"node.kubernetes.io/unschedulable","effect":"NoSchedule","timeAdded":"2025-09-04T19:59:12Z"},{"key":"node.kubernetes.io/out-of-service","value":"nodeshutdown","effect":"NoExecute","timeAdded":"2025-09-04T20:01:14Z"}]}
2025-09-04T20:01:14.02297146Z DEBUG events [remediation] Remediation process - add out-of-service taint to unhealthy node {"type": "Normal", "object": {"kind":"Node","name":"worker-0-2","uid":"28702782-f67d-4bc9-9792-a97600872477","apiVersion":"v1","resourceVersion":"700976"}, "reason": "AddOutOfService"}
2025-09-04T20:01:14.04030999Z INFO controllers.SelfNodeRemediation out-of-service taint removed {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}, "new taints": [{"key":"medik8s.io/remediation","value":"self-node-remediation","effect":"NoExecute","timeAdded":"2025-09-04T19:59:12Z"},{"key":"node.kubernetes.io/unschedulable","effect":"NoSchedule","timeAdded":"2025-09-04T19:59:12Z"}]}
2025-09-04T20:01:14.040411818Z DEBUG events [remediation] Remediation process - remove out-of-service taint from node {"type": "Normal", "object": {"kind":"Node","name":"worker-0-2","uid":"28702782-f67d-4bc9-9792-a97600872477","apiVersion":"v1","resourceVersion":"700985"}, "reason": "RemoveOutOfService"}
2025-09-04T20:01:14.040460552Z DEBUG events [remediation] Remediation process - finished deleting unhealthy node resources {"type": "Normal", "object": {"kind":"Node","name":"worker-0-2","uid":"28702782-f67d-4bc9-9792-a97600872477","apiVersion":"v1","resourceVersion":"700985"}, "reason": "DeleteResources"}
2025-09-04T20:01:14.045754368Z INFO controllers.SelfNodeRemediation Remediating with OutOfServiceTaint Remediation strategy (auto-selected) {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T20:01:14.045777048Z INFO controllers.SelfNodeRemediation fencing completed, cleaning up {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T20:01:14.052656726Z DEBUG events [remediation] Remediation process - mark healthy remediated node as schedulable {"type": "Normal", "object": {"kind":"Node","name":"worker-0-2","uid":"28702782-f67d-4bc9-9792-a97600872477","apiVersion":"v1","resourceVersion":"700988"}, "reason": "MarkNodeSchedulable"}
2025-09-04T20:01:15.062157139Z INFO controllers.SelfNodeRemediation Remediating with OutOfServiceTaint Remediation strategy (auto-selected) {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T20:01:15.062252235Z INFO controllers.SelfNodeRemediation fencing completed, cleaning up {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T20:01:15.070401224Z INFO controllers.SelfNodeRemediation NoExecute taint removed {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}, "new taints": null}
2025-09-04T20:01:15.070844054Z DEBUG events [remediation] Remediation process - remove NoExecute taint from healthy remediated node {"type": "Normal", "object": {"kind":"Node","name":"worker-0-2","uid":"28702782-f67d-4bc9-9792-a97600872477","apiVersion":"v1","resourceVersion":"701024"}, "reason": "RemoveNoExecuteTaint"}
2025-09-04T20:01:15.076677645Z INFO selfnoderemediation-resource validate update {"name": "ocp-edge-cluster-0-qgzbr-worker-0-8vng4"}
2025-09-04T20:01:15.081817628Z INFO controllers.SelfNodeRemediation finalizer removed {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T20:01:15.082170229Z DEBUG events [remediation] Remediation process - remove finalizer from snr {"type": "Normal", "object": {"kind":"SelfNodeRemediation","namespace":"openshift-machine-api","name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","uid":"bb3ff43b-b1e0-4d18-90ae-aa2c47104b7c","apiVersion":"self-node-remediation.medik8s.io/v1alpha1","resourceVersion":"700986"}, "reason": "RemoveFinalizer"}
2025-09-04T20:01:15.082198571Z DEBUG events [remediation] Remediation finished {"type": "Normal", "object": {"kind":"SelfNodeRemediation","namespace":"openshift-machine-api","name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","uid":"bb3ff43b-b1e0-4d18-90ae-aa2c47104b7c","apiVersion":"self-node-remediation.medik8s.io/v1alpha1","resourceVersion":"700986"}, "reason": "RemediationFinished"}
2025-09-04T20:01:15.084791327Z INFO controllers.SelfNodeRemediation SNR already deleted {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
2025-09-04T20:01:16.085519635Z INFO controllers.SelfNodeRemediation SNR already deleted {"pod": "manager", "selfnoderemediation": {"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4","namespace":"openshift-machine-api"}}
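
To check the timings in the manager log above without reading every line, a small script can pull the timestamps out and measure the gap between two messages. This is only a sketch: the helper names (parse_log_line, seconds_between) are made up for illustration and are not part of the operator or of oc, and it assumes each log record sits on its own line and starts with a UTC timestamp followed by the level, as in the excerpt. The two sample lines are copied from the log above.

```python
import re
from datetime import datetime, timezone

# Assumed record layout: "<RFC 3339 UTC timestamp> <LEVEL> <rest of line>".
LOG_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(?P<frac>\d+))?Z\s+"
    r"(?P<level>[A-Z]+)\s+(?P<rest>.*)$"
)


def parse_log_line(line):
    """Return (timestamp, level, rest) for one log record, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match["date"], "%Y-%m-%dT%H:%M:%S")
    # The log carries nanoseconds; datetime stores microseconds, so truncate.
    micros = int((match["frac"] or "0")[:6].ljust(6, "0"))
    stamp = stamp.replace(microsecond=micros, tzinfo=timezone.utc)
    return stamp, match["level"], match["rest"]


def seconds_between(lines, first_marker, second_marker):
    """Seconds from the first record containing first_marker to the next record containing second_marker."""
    start = None
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is None:
            continue
        stamp, _level, rest = parsed
        if start is None and first_marker in rest:
            start = stamp
        elif start is not None and second_marker in rest:
            return (stamp - start).total_seconds()
    return None


# Two records copied verbatim from the manager log in this report.
SAMPLE = [
    '2025-09-04T19:59:13.169815959Z INFO rebootDurationCalculator No '
    'SafeTimeToAssumeNodeRebootedSeconds specified, using calculated minimum '
    'safe reboot time {"calculated minimum time in seconds": 120}',
    '2025-09-04T20:01:14.004791454Z INFO controllers.SelfNodeRemediation '
    'TimeAssumedRebooted is old. The unhealthy node assumed to been rebooted '
    '{"pod": "manager", "selfnoderemediation": '
    '{"name":"ocp-edge-cluster-0-qgzbr-worker-0-8vng4",'
    '"namespace":"openshift-machine-api"}, "node name": "worker-0-2"}',
]

waited = seconds_between(
    SAMPLE, "calculated minimum safe reboot time", "TimeAssumedRebooted is old"
)
print(f"waited {waited:.3f}s before assuming the node rebooted")
```

For these two records the gap comes out just over the 120 seconds that the log reports as the calculated minimum safe reboot time. Feeding the script the full manager log (one record per line) and other marker pairs, for example "Remediation started" and "Remediation finished", gives the corresponding durations.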