-
Story
-
Resolution: Unresolved
CLOUD Ready for Development
User Story
As a cluster admin, I want the ControlPlaneMachineSet controller to tell me which nodes it thinks need updating, so that I can decide if I agree before enabling the controller.
Background
Enabling the CPMS controller should be low-stress. But when folks are coming out of an incident, it's nice to have a clear view of what the CPMS controller is planning to do and why. We currently get NeedsUpdateReplicas Progressing status like:
Observed 2 replica(s) in need of update
That's helpful, but it would be nice to know which replicas are affected (by Machine name) and, ideally, what the controller didn't like about the state of those Machines. That would let the admin notice things like tag order (OCPBUGS-10394) and adjust them without needing to replace any Machines.
Steps
Update the CPMS internal state to track not only the number of outdated Machines, but also which Machines are outdated and why, so that the code that builds the status can consume it. Open question: possibly all we get back from the machine provider is the names, and we don't know why the provider thought each Machine was outdated.
Possibly we already attempt to log the diff at log level 4, but the default log level is 2, so the diff would not show up in default logs.
Add that information to the ControlPlaneMachineSet status to make it more accessible to admins (and also so that it gets reported to Insights, for clusters that enable Insights).
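The steps above could be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: the outdatedMachine type, the needsUpdateMessage helper, the fallback reason text, and the example Machine names are all hypothetical. Only the "Observed N replica(s) in need of update" prefix comes from the existing status message.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// outdatedMachine is a hypothetical record of one Machine the controller
// considers outdated, plus any reasons the machine provider reported.
type outdatedMachine struct {
	Name    string
	Reasons []string // may be empty if the provider only returns names
}

// needsUpdateMessage builds a status message that keeps the existing
// "Observed N replica(s) in need of update" prefix and appends the
// Machine names and reasons. It returns "" when nothing is outdated.
func needsUpdateMessage(machines []outdatedMachine) string {
	if len(machines) == 0 {
		return ""
	}

	// Sort a copy by name so the message is stable across reconciles
	// and the caller's slice is left untouched.
	sorted := make([]outdatedMachine, len(machines))
	copy(sorted, machines)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	parts := make([]string, 0, len(sorted))
	for _, m := range sorted {
		// Assumed fallback text for providers that give no reason.
		reason := "reason not reported by machine provider"
		if len(m.Reasons) > 0 {
			reason = strings.Join(m.Reasons, "; ")
		}
		parts = append(parts, fmt.Sprintf("%s (%s)", m.Name, reason))
	}

	return fmt.Sprintf("Observed %d replica(s) in need of update: %s",
		len(sorted), strings.Join(parts, ", "))
}

func main() {
	fmt.Println(needsUpdateMessage([]outdatedMachine{
		{Name: "master-2", Reasons: []string{"tag order differs from desired spec"}},
		{Name: "master-0"},
	}))
}
```

Sorting by name keeps the message deterministic, which avoids needless status updates when the set of outdated Machines has not changed.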
Stakeholders
SREs responding to OHSS-34609 were missing this, and it took some guesswork to figure out what the CPMS controller was concerned about.
Definition of Done
- <Add items that need to be completed for this card>
- Docs
- <Add docs requirements for this card>
- Testing
- <Explain testing that will be added>