  1. Machine Config Operator
  2. MCO-473

Decouple "Updating" from unavailable nodes; only factor in cordon/drain if it's "our" cordon/drain


    • Type: Story
    • Resolution: Done
    • Priority: Normal
    • OCPSTRAT-845 - [Tech Preview] Proper MCO State Reporting

      Our "Updating" condition is currently distilled from a set of other Node conditions, which unfortunately include things potentially outside the MCO's control (DiskPressure, Unschedulable, etc.). We need to factor those conditions out unless we are responsible for them.

      With 4.12 we have controller-driven cordons/drains, which leave annotations on the Node object, e.g.:

      machineconfiguration.openshift.io/desiredDrain: uncordon-rendered-worker-561b9f700f58ed5ff139246f8d9a5b3c
      machineconfiguration.openshift.io/lastAppliedDrain: uncordon-rendered-worker-561b9f700f58ed5ff139246f8d9a5b3c
      

      We can use these to figure out whether we're actually in the process of doing anything, rather than carrying our old assumption that "unavailable nodes means we're updating". A node can be cordoned by all sorts of things that aren't the MCO.
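
      A minimal sketch of how that check might look, in Go. The two annotation keys are the ones quoted above; everything else is hypothetical and for illustration only: the CordonOwner type, the classifyCordon helper, and in particular the assumption that a drain request uses a "drain-" prefix in the same way the uncordon request above uses "uncordon-". The idea is that a cordoned node counts as "ours" only if a drain request is still in flight (desired differs from last applied) or the last applied action was a drain.

```go
package main

import "strings"

// Annotation keys quoted in this issue.
const (
	desiredDrainAnnotation     = "machineconfiguration.openshift.io/desiredDrain"
	lastAppliedDrainAnnotation = "machineconfiguration.openshift.io/lastAppliedDrain"
)

// CordonOwner is a hypothetical classification used only in this sketch.
type CordonOwner string

const (
	NotCordoned  CordonOwner = "NotCordoned"
	OwnedByMCO   CordonOwner = "OwnedByMCO"
	OwnedByOther CordonOwner = "OwnedByOther"
)

// classifyCordon guesses who is responsible for a node being unschedulable,
// based only on the drain annotations and the node's unschedulable flag.
func classifyCordon(annotations map[string]string, unschedulable bool) CordonOwner {
	if !unschedulable {
		return NotCordoned
	}

	desired := annotations[desiredDrainAnnotation]
	applied := annotations[lastAppliedDrainAnnotation]

	// A request is in flight when the desired action has not been applied yet.
	inFlight := desired != applied

	// Assumption: drain requests are prefixed "drain-", mirroring the
	// "uncordon-" prefix shown in the issue. If the last applied action was
	// a drain, the MCO has not released the node yet.
	holdsDrain := strings.HasPrefix(applied, "drain-")

	if inFlight || holdsDrain {
		return OwnedByMCO
	}
	// Cordoned, but the MCO's last action was an uncordon (or it never
	// touched the node): something else is responsible.
	return OwnedByOther
}
```

      With the annotations from the example above (both set to the same uncordon-... value), a node that is still unschedulable would be classified as OwnedByOther, so it would not count toward "Updating".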

      I could see this resulting in another condition to capture "I have an obstacle to updating these nodes, but they have not experienced a failure and are not degraded".


              cdoern@redhat.com Charles Doern
              jkyros@redhat.com John Kyros