[kni@cert-rhosp-02 ~]$ oc get nodes/worker-0-2 -o json | jq .spec.taints
null
[kni@cert-rhosp-02 ~]$ oc get nodes -l 'node-role.kubernetes.io/worker'
NAME         STATUS   ROLES    AGE   VERSION
worker-0-0   Ready    worker   11h   v1.33.3
worker-0-1   Ready    worker   11h   v1.33.3
worker-0-2   Ready    worker   11h   v1.33.3
[kni@cert-rhosp-02 ~]$ oc debug node/worker-0-2 -- chroot /host bash -c "uptime -s"
Temporary namespace openshift-debug-qjbsm is created for debugging node...
Starting pod/worker-0-2-debug-f5qzh ...
To use host binaries, run `chroot /host`
2025-09-05 19:28:00
Removing debug pod ...
Temporary namespace openshift-debug-qjbsm was removed.
[kni@cert-rhosp-02 ~]$ oc get far -o yaml
apiVersion: v1
items: []
kind: List
metadata:
  resourceVersion: ""
[kni@cert-rhosp-02 ~]$ oc get fartemplate -o yaml
apiVersion: v1
items: []
kind: List
metadata:
  resourceVersion: ""
[kni@cert-rhosp-02 ~]$ oc get nhc -o yaml
apiVersion: v1
items: []
kind: List
metadata:
  resourceVersion: ""
[kni@cert-rhosp-02 ~]$ cat test.yaml
apiVersion: fence-agents-remediation.medik8s.io/v1alpha1
kind: FenceAgentsRemediation
metadata:
  name: worker-0-2
  namespace: openshift-workload-availability
spec:
  agent: fence_ipmilan
  retrycount: 5
  retryinterval: 10s
  timeout: 300s
  nodeparameters:
    '--ipport':
      master-0-0: '6230'
      master-0-1: '6231'
      master-0-2: '6232'
      worker-0-0: '6233'
      worker-0-1: '6234'
      worker-0-2: '6235'
  sharedparameters:
    '--action': reboot
    '--ip': 192.168.123.1
    '--lanplus': ''
    '--username': admin
  nodeSecretNames:
    worker-0-0: worker-0-cred
    worker-0-1: worker-1-cred
    worker-0-2: worker-2-cred
[kni@cert-rhosp-02 ~]$ oc get secret | grep test-far
test-far-shared   Opaque   2   140m
[kni@cert-rhosp-02 ~]$ oc debug node/worker-0-2 -- chroot /host bash -c "systemctl stop kubelet"
Temporary namespace openshift-debug-4c8st is created for debugging node...
Starting pod/worker-0-2-debug-t4fhx ...
To use host binaries, run `chroot /host`
[kni@cert-rhosp-02 ~]$ oc get nodes -l 'node-role.kubernetes.io/worker'
NAME         STATUS     ROLES    AGE   VERSION
worker-0-0   Ready      worker   11h   v1.33.3
worker-0-1   Ready      worker   11h   v1.33.3
worker-0-2   NotReady   worker   11h   v1.33.3
[kni@cert-rhosp-02 ~]$ oc apply -f test.yaml
fenceagentsremediation.fence-agents-remediation.medik8s.io/worker-0-2 created
[kni@cert-rhosp-02 ~]$ oc get far -o yaml
apiVersion: v1
items:
- apiVersion: fence-agents-remediation.medik8s.io/v1alpha1
  kind: FenceAgentsRemediation
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"fence-agents-remediation.medik8s.io/v1alpha1","kind":"FenceAgentsRemediation","metadata":{"annotations":{},"name":"worker-0-2","namespace":"openshift-workload-availability"},"spec":{"agent":"fence_ipmilan","nodeSecretNames":{"worker-0-0":"worker-0-cred","worker-0-1":"worker-1-cred","worker-0-2":"worker-2-cred"},"nodeparameters":{"--ipport":{"master-0-0":"6230","master-0-1":"6231","master-0-2":"6232","worker-0-0":"6233","worker-0-1":"6234","worker-0-2":"6235"}},"retrycount":5,"retryinterval":"10s","sharedparameters":{"--action":"reboot","--ip":"192.168.123.1","--lanplus":"","--username":"admin"},"timeout":"300s"}}
    creationTimestamp: "2025-09-05T19:39:16Z"
    finalizers:
    - fence-agents-remediation.medik8s.io/far-finalizer
    generation: 2
    name: worker-0-2
    namespace: openshift-workload-availability
    resourceVersion: "248879"
    uid: 911f68ef-2ef5-4c51-9d72-d52bb1889e25
  spec:
    agent: fence_ipmilan
    nodeSecretNames:
      worker-0-0: worker-0-cred
      worker-0-1: worker-1-cred
      worker-0-2: worker-2-cred
    nodeparameters:
      --ipport:
        master-0-0: "6230"
        master-0-1: "6231"
        master-0-2: "6232"
        worker-0-0: "6233"
        worker-0-1: "6234"
        worker-0-2: "6235"
    remediationStrategy: ResourceDeletion
    retrycount: 5
    retryinterval: 10s
    sharedSecretName: fence-agents-credentials-shared
    sharedparameters:
      --action: reboot
      --ip: 192.168.123.1
      --lanplus: ""
      --username: admin
    timeout: 5m0s
  status:
    conditions:
    - lastTransitionTime: "2025-09-05T19:39:16Z"
      message: FAR CR was found, its name matches one of the cluster nodes, and a finalizer was set to the CR
      reason: RemediationStarted
      status: "True"
      type: Processing
    - lastTransitionTime: "2025-09-05T19:39:21Z"
      message: FAR taint was added and the fence agent command has been created and executed successfully
      reason: FenceAgentSucceeded
      status: "True"
      type: FenceAgentActionSucceeded
    - lastTransitionTime: "2025-09-05T19:39:16Z"
      message: FAR CR was found, its name matches one of the cluster nodes, and a finalizer was set to the CR
      reason: RemediationStarted
      status: Unknown
      type: Succeeded
    lastUpdateTime: "2025-09-05T19:39:21Z"
kind: List
metadata:
  resourceVersion: ""
[kni@cert-rhosp-02 ~]$ oc get nodes -l 'node-role.kubernetes.io/worker'
NAME         STATUS   ROLES    AGE   VERSION
worker-0-0   Ready    worker   11h   v1.33.3
worker-0-1   Ready    worker   11h   v1.33.3
worker-0-2   Ready    worker   11h   v1.33.3
[kni@cert-rhosp-02 ~]$ oc get nodes/worker-0-2 -o json | jq .spec.taints
[
  {
    "effect": "NoExecute",
    "key": "medik8s.io/fence-agents-remediation",
    "timeAdded": "2025-09-05T19:39:16Z"
  }
]
[kni@cert-rhosp-02 ~]$ oc debug node/worker-0-2 -- chroot /host bash -c "uptime -s"
Temporary namespace openshift-debug-pgcgk is created for debugging node...
Starting pod/worker-0-2-debug-4ffq8 ...
To use host binaries, run `chroot /host`
Removing debug pod ...
Temporary namespace openshift-debug-pgcgk was removed.
error: unable to create the debug pod "worker-0-2-debug-4ffq8"

Far Logs:
================================================================================
2025-09-05T19:29:10.263030352Z INFO controllers.FenceAgentsRemediation Finalizer was removed {"CR Name": "worker-0-2-4v9st"}
2025-09-05T19:29:10.263083757Z INFO controllers.FenceAgentsRemediation Finish FenceAgentsRemediation Reconcile
2025-09-05T19:29:10.263302548Z DEBUG events [remediation] Finalizer was removed {"type": "Normal", "object": {"kind":"FenceAgentsRemediation","namespace":"openshift-workload-availability","name":"worker-0-2-4v9st","uid":"4029d989-ff40-44f3-806f-41d3976dfb5b","apiVersion":"fence-agents-remediation.medik8s.io/v1alpha1","resourceVersion":"244681"}, "reason": "RemoveFinalizer"}
2025-09-05T19:29:10.263410548Z INFO controllers.FenceAgentsRemediation Begin FenceAgentsRemediation Reconcile
2025-09-05T19:29:10.26344755Z INFO controllers.FenceAgentsRemediation FenceAgentsRemediation CR was not found {"CR Name": "worker-0-2-4v9st", "CR Namespace": "openshift-workload-availability"}
2025-09-05T19:29:10.263453429Z INFO controllers.FenceAgentsRemediation Finish FenceAgentsRemediation Reconcile
2025-09-05T19:39:16.3081271Z INFO fenceagentsremediation-resource validate create {"name": "worker-0-2"}
2025-09-05T19:39:16.311472162Z INFO controllers.FenceAgentsRemediation Begin FenceAgentsRemediation Reconcile
2025-09-05T19:39:16.311516953Z INFO controllers.FenceAgentsRemediation Check FAR CR's name
2025-09-05T19:39:16.314624281Z INFO fenceagentsremediation-resource validate update {"name": "worker-0-2"}
2025-09-05T19:39:16.317800193Z INFO controllers.FenceAgentsRemediation Finalizer was added {"CR Name": "worker-0-2"}
2025-09-05T19:39:16.317822388Z INFO controllers.FenceAgentsRemediation Updating Status Condition {"processingConditionStatus": "True", "fenceAgentActionSucceededConditionStatus": "Unknown", "succeededConditionStatus": "Unknown", "reason": "RemediationStarted", "LastUpdateTime": "2025-09-05T19:39:16.317821674Z"}
2025-09-05T19:39:16.318019474Z DEBUG events [remediation] Remediation started {"type": "Normal", "object": {"kind":"FenceAgentsRemediation","namespace":"openshift-workload-availability","name":"worker-0-2","uid":"911f68ef-2ef5-4c51-9d72-d52bb1889e25","apiVersion":"fence-agents-remediation.medik8s.io/v1alpha1","resourceVersion":"248803"}, "reason": "RemediationStarted"}
2025-09-05T19:39:16.31804927Z DEBUG events [remediation] Finalizer was added {"type": "Normal", "object": {"kind":"FenceAgentsRemediation","namespace":"openshift-workload-availability","name":"worker-0-2","uid":"911f68ef-2ef5-4c51-9d72-d52bb1889e25","apiVersion":"fence-agents-remediation.medik8s.io/v1alpha1","resourceVersion":"248803"}, "reason": "AddFinalizer"}
2025-09-05T19:39:16.523048545Z INFO controllers.FenceAgentsRemediation Finish FenceAgentsRemediation Reconcile
2025-09-05T19:39:16.52315528Z INFO controllers.FenceAgentsRemediation Begin FenceAgentsRemediation Reconcile
2025-09-05T19:39:16.523169091Z INFO controllers.FenceAgentsRemediation Check FAR CR's name
2025-09-05T19:39:16.530992116Z INFO taints Taint was added {"taint effect": "NoExecute", "taint list": [{"key":"node.kubernetes.io/unreachable","effect":"NoSchedule","timeAdded":"2025-09-05T19:39:05Z"},{"key":"node.kubernetes.io/unreachable","effect":"NoExecute","timeAdded":"2025-09-05T19:39:11Z"},{"key":"medik8s.io/fence-agents-remediation","effect":"NoExecute","timeAdded":"2025-09-05T19:39:16Z"}]}
2025-09-05T19:39:16.531051862Z INFO controllers.FenceAgentsRemediation FAR remediation taint was added {"Node Name": "worker-0-2"}
2025-09-05T19:39:16.531091924Z INFO controllers.FenceAgentsRemediation Build fence agent command line {"Fence Agent": "fence_ipmilan", "Node Name": "worker-0-2"}
2025-09-05T19:39:16.53112384Z INFO controllers.FenceAgentsRemediation found a value from secret {"secret name": "worker-2-cred", "parameter name": "--password"}
2025-09-05T19:39:16.531166433Z INFO controllers.FenceAgentsRemediation Execute the fence agent {"Fence Agent": "fence_ipmilan", "Node Name": "worker-0-2", "FAR uid": "911f68ef-2ef5-4c51-9d72-d52bb1889e25", "ParametersError": "json: unsupported type: iter.Seq[github.com/medik8s/fence-agents-remediation/api/v1alpha1.ParameterName]"}
2025-09-05T19:39:16.531444833Z DEBUG events [remediation] Remediation taint was added {"type": "Normal", "object": {"kind":"Node","name":"worker-0-2","uid":"681d8f9a-ac99-4792-bf19-b7824aee5f6c","apiVersion":"v1","resourceVersion":"248780"}, "reason": "AddRemediationTaint"}
2025-09-05T19:39:16.531481093Z DEBUG events [remediation] Fence agent was executed {"type": "Normal", "object": {"kind":"FenceAgentsRemediation","namespace":"openshift-workload-availability","name":"worker-0-2","uid":"911f68ef-2ef5-4c51-9d72-d52bb1889e25","apiVersion":"fence-agents-remediation.medik8s.io/v1alpha1","resourceVersion":"248805"}, "reason": "FenceAgentExecuted"}
2025-09-05T19:39:16.531472118Z INFO executer fence agent start {"uid": "911f68ef-2ef5-4c51-9d72-d52bb1889e25", "fence_agent": "fence_ipmilan", "retryCount": 5, "retryInterval": "10s", "timeout": "5m0s"}
2025-09-05T19:39:16.538736345Z INFO controllers.FenceAgentsRemediation Finish FenceAgentsRemediation Reconcile
2025-09-05T19:39:16.538796469Z INFO controllers.FenceAgentsRemediation Begin FenceAgentsRemediation Reconcile
2025-09-05T19:39:16.538826318Z INFO controllers.FenceAgentsRemediation Check FAR CR's name
2025-09-05T19:39:16.53888344Z INFO controllers.FenceAgentsRemediation A Fence Agent is already running {"Fence Agent": "fence_ipmilan", "Node Name": "worker-0-2", "FAR uid": "911f68ef-2ef5-4c51-9d72-d52bb1889e25"}
2025-09-05T19:39:16.55156206Z INFO controllers.FenceAgentsRemediation Finish FenceAgentsRemediation Reconcile
2025-09-05T19:39:21.474980466Z INFO executer command completed {"uid": "911f68ef-2ef5-4c51-9d72-d52bb1889e25", "response": "Success: Rebooted\n", "errMessage": "", "err": null}
2025-09-05T19:39:21.475026255Z INFO executer fence agent done {"uid": "911f68ef-2ef5-4c51-9d72-d52bb1889e25", "fence_agent": "fence_ipmilan", "stdout": "Success: Rebooted\n", "stderr": "", "err": null}
2025-09-05T19:39:21.475031763Z INFO executer updating status {"FAR uid": "911f68ef-2ef5-4c51-9d72-d52bb1889e25"}
2025-09-05T19:39:21.475124868Z INFO executer Updating Status Condition {"processingConditionStatus": "", "fenceAgentActionSucceededConditionStatus": "True", "succeededConditionStatus": "", "reason": "FenceAgentSucceeded", "LastUpdateTime": "2025-09-05T19:39:21.475124012Z"}
2025-09-05T19:39:21.475153138Z DEBUG events [remediation] Fence agent was succeeded {"type": "Normal", "object": {"kind":"FenceAgentsRemediation","namespace":"openshift-workload-availability","name":"worker-0-2","uid":"911f68ef-2ef5-4c51-9d72-d52bb1889e25","apiVersion":"fence-agents-remediation.medik8s.io/v1alpha1","resourceVersion":"248805"}, "reason": "FenceAgentSucceeded"}
2025-09-05T19:39:21.48219566Z INFO executer status updated {"FAR uid": "911f68ef-2ef5-4c51-9d72-d52bb1889e25"}
2025-09-05T19:39:21.482355023Z INFO controllers.FenceAgentsRemediation Begin FenceAgentsRemediation Reconcile
2025-09-05T19:39:21.482390836Z INFO controllers.FenceAgentsRemediation Check FAR CR's name
2025-09-05T19:39:21.48245521Z INFO controllers.FenceAgentsRemediation Remediation strategy is ResourceDeletion which explicitly deletes resources - manually deleting workload {"Node Name": "worker-0-2"}
2025-09-05T19:39:21.482596806Z DEBUG events [remediation] Manually delete pods from the unhealthy node {"type": "Normal", "object": {"kind":"Node","name":"worker-0-2","uid":"681d8f9a-ac99-4792-bf19-b7824aee5f6c","apiVersion":"v1","resourceVersion":"248808"}, "reason": "DeleteResources"}
2025-09-05T19:39:21.48329328Z INFO commons-resource starting to delete pods {"node name": "worker-0-2"}
2025-09-05T19:39:23.855533947Z INFO commons-resource done deleting pods {"node name": "worker-0-2"}
2025-09-05T19:39:23.855566494Z INFO controllers.FenceAgentsRemediation Updating Status Condition {"processingConditionStatus": "False", "fenceAgentActionSucceededConditionStatus": "", "succeededConditionStatus": "True", "reason": "RemediationFinishedSuccessfully", "LastUpdateTime": "2025-09-05T19:39:23.855565567Z"}
2025-09-05T19:39:23.855587981Z INFO executer cancelling fence agent routine {"uid": "911f68ef-2ef5-4c51-9d72-d52bb1889e25"}
2025-09-05T19:39:23.855594572Z INFO controllers.FenceAgentsRemediation FenceAgentsRemediation CR has completed to remediate the node {"Node Name": "worker-0-2"}
2025-09-05T19:39:23.855708445Z DEBUG events [remediation] Unhealthy node remediation was completed {"type": "Normal", "object": {"kind":"Node","name":"worker-0-2","uid":"681d8f9a-ac99-4792-bf19-b7824aee5f6c","apiVersion":"v1","resourceVersion":"248808"}, "reason": "NodeRemediationCompleted"}
2025-09-05T19:39:23.855767309Z DEBUG events [remediation] Remediation finished {"type": "Normal", "object": {"kind":"FenceAgentsRemediation","namespace":"openshift-workload-availability","name":"worker-0-2","uid":"911f68ef-2ef5-4c51-9d72-d52bb1889e25","apiVersion":"fence-agents-remediation.medik8s.io/v1alpha1","resourceVersion":"248879"}, "reason": "RemediationFinished"}
2025-09-05T19:39:24.062558742Z INFO controllers.FenceAgentsRemediation Finish FenceAgentsRemediation Reconcile
2025-09-05T19:39:24.062653158Z INFO controllers.FenceAgentsRemediation Begin FenceAgentsRemediation Reconcile
2025-09-05T19:39:24.062673269Z INFO controllers.FenceAgentsRemediation Check FAR CR's name
2025-09-05T19:39:24.068171672Z INFO controllers.FenceAgentsRemediation Finish FenceAgentsRemediation Reconcile
2025-09-05T19:45:29.892347371Z INFO controllers.FenceAgentsRemediation Begin FenceAgentsRemediation Reconcile
2025-09-05T19:45:29.892410334Z INFO controllers.FenceAgentsRemediation Check FAR CR's name
2025-09-05T19:45:29.892444218Z INFO controllers.FenceAgentsRemediation CR's deletion timestamp is not zero, and FAR finalizer exists {"CR Name": "worker-0-2"}
2025-09-05T19:45:29.899237922Z INFO taints Taint was removed {"taint effect": "NoExecute", "taint list": null}
2025-09-05T19:45:29.899267149Z INFO controllers.FenceAgentsRemediation FAR remediation taint was removed {"Node Name": "worker-0-2"}
2025-09-05T19:45:29.899562596Z DEBUG events [remediation] Remediation taint was removed {"type": "Normal", "object": {"kind":"Node","name":"worker-0-2","uid":"681d8f9a-ac99-4792-bf19-b7824aee5f6c","apiVersion":"v1","resourceVersion":"251194"}, "reason": "RemoveRemediationTaint"}
2025-09-05T19:45:29.907691678Z INFO fenceagentsremediation-resource validate update {"name": "worker-0-2"}
2025-09-05T19:45:29.913349294Z INFO controllers.FenceAgentsRemediation Finalizer was removed {"CR Name": "worker-0-2"}
2025-09-05T19:45:29.913397045Z INFO controllers.FenceAgentsRemediation Finish FenceAgentsRemediation Reconcile
2025-09-05T19:45:29.913442799Z INFO controllers.FenceAgentsRemediation Begin FenceAgentsRemediation Reconcile
2025-09-05T19:45:29.913459199Z INFO controllers.FenceAgentsRemediation FenceAgentsRemediation CR was not found {"CR Name": "worker-0-2", "CR Namespace": "openshift-workload-availability"}
2025-09-05T19:45:29.913462878Z INFO controllers.FenceAgentsRemediation Finish FenceAgentsRemediation Reconcile
2025-09-05T19:45:29.913472618Z DEBUG events [remediation] Finalizer was removed {"type": "Normal", "object": {"kind":"FenceAgentsRemediation","namespace":"openshift-workload-availability","name":"worker-0-2","uid":"911f68ef-2ef5-4c51-9d72-d52bb1889e25","apiVersion":"fence-agents-remediation.medik8s.io/v1alpha1","resourceVersion":"251198"}, "reason": "RemoveFinalizer"}