Task
Resolution: Unresolved
Moderate
1. Automatic node-to-IP mapping
Allow the operator to automatically extract the IP (or OOB management IP) from node labels (e.g. something like fencing.redhat.com/oob-ip=10.14.0.9).
We have added a label to all Node objects that contains the IP address or hostname of the iDRAC interface, e.g.:
~~~
labels:
itup.redhat.com/idrac: control02-mgmt.itup-002.prod.tlv2.dc.redhat.com
~~~
If we were able to reference such a label in the FARtemplate, then we wouldn't have to hardcode the IP address of each node in the FARtemplate itself.
Similarly, there could be another label which would contain the name of the secret with the credentials, e.g.:
~~~
labels:
itup.redhat.com/idrac: control02-mgmt.itup-002.prod.tlv2.dc.redhat.com
itup.redhat.com/idrac-credentials-secret: control02-idrac-secret
~~~
Then there would have to be a way to reference both those labels in the FARtemplate.
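To illustrate the request, here is a rough sketch of what such a FARtemplate could look like. The `{{ index .NodeLabels "..." }}` expression and the `nodeSecretNameFromLabel` field are hypothetical and are not part of the current FAR API; the template name, the fence_ipmilan agent and the remaining parameters are only example values.
~~~
apiVersion: fence-agents-remediation.medik8s.io/v1alpha1
kind: FenceAgentsRemediationTemplate
metadata:
  name: idrac-from-node-labels        # example name
spec:
  template:
    spec:
      agent: fence_ipmilan            # example agent
      sharedparameters:
        '--action': reboot
        '--lanplus': ''
        # HYPOTHETICAL syntax: resolve the address from a label on the Node being fenced
        '--ip': '{{ index .NodeLabels "itup.redhat.com/idrac" }}'
      # HYPOTHETICAL field: take the credentials secret name from a Node label
      nodeSecretNameFromLabel: itup.redhat.com/idrac-credentials-secret
~~~
With something along these lines, a single template would cover all nodes, and adding a node would only require labelling it and creating its secret.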
2. sharedSecretName does not fit our needs.
3. https://issues.redhat.com//browse/RHWA-97
It doesn't seem to allow us to specify something like this:
~~~
sharedSecretName: ".NodeName-idrac"
~~~
If this were possible, we could have the credentials along with the IP address stored in the secret, and nothing would have to be hardcoded in the FARtemplate.
This would be another way to solve our problem, but I would still prefer the first option (extract it from the labels on the Node objects).
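As a sketch of this second option: a per-node secret could carry both the address and the credentials, and the template would only name the secret. The templated value of sharedSecretName below is hypothetical (it is exactly what does not seem to be possible today), and the assumption that the secret keys are fence agent parameter names, as well as all of the values, is for illustration only.
~~~
# Per-node secret, one for each node (values are placeholders)
apiVersion: v1
kind: Secret
metadata:
  name: control02-idrac
stringData:
  '--ip': control02-mgmt.itup-002.prod.tlv2.dc.redhat.com
  '--username': example-user
  '--password': example-password
---
# Fragment of the FARtemplate spec
# HYPOTHETICAL: the secret name is resolved per node from the node name
sharedSecretName: '{{.NodeName}}-idrac'
~~~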
4. Configuration testing / dry-run mode. FAR should support a safe test mode:
- I think this would have to be solved in such a way that it can't be used in the NHC. I mean that a successful test operation should not result in workloads restarting on other nodes. Maybe this could be done with a new field in the FAR object indicating the successful test.
- If the instructions for the user were to get into the pod and manually execute the fence agent, providing all the parameters on the command line, then I don't think that solves the problem, because it would not test the configuration of the FARtemplate itself.
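One possible shape for such a test mode, purely as a sketch: a FAR object created by hand for a node with a test flag set, where the operator renders the parameters from the FARtemplate, runs the agent with the non-destructive "status" action, and records the outcome in the status. The `dryRun` field and the `DryRunSucceeded` condition are hypothetical names and do not exist in the current API.
~~~
apiVersion: fence-agents-remediation.medik8s.io/v1alpha1
kind: FenceAgentsRemediation
metadata:
  name: control02                     # example: the node to test
spec:
  agent: fence_ipmilan                # example agent
  # HYPOTHETICAL field: run the agent with the "status" action only,
  # never power-cycle the node and never trigger workload rescheduling
  dryRun: true
status:
  conditions:
  # HYPOTHETICAL condition reporting the result of the test
  - type: DryRunSucceeded
    status: "True"
~~~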
5. Automated validation / health checks:
https://issues.redhat.com//browse/RHWA-65
is related, but I think it solves the problem only partially. It won't be able to verify the config with the "status" action, because the template itself doesn't define which nodes it should be applied to; that is defined in the NHC instead.
I meant this point to work together with the proposed "4. Configuration testing / dry-run mode" above.
Related discussion at https://redhat-internal.slack.com/archives/CR8HZL4P3/p1764850940560529