OCPBUGS-12851

Sometimes Drain and Cordon events are not triggered when MCO executes drain and cordon operations



      Description of problem:

      
      Sometimes, when the MCO drains the nodes in a pool to apply a MachineConfig (MC), no Drain event is emitted, even though the drain operation is actually executed.
      
      The same happens with Cordon.
      
      
      

      Version-Release number of selected component (if applicable):

      OCP 4.14, but we have seen it in previous versions too.
      

      How reproducible:

      Intermittent, Rare
      
      We cannot reproduce it at will, but it occurs from time to time and is reported by our automated test cases that check the Drain and Cordon events.
      
      

      Steps to Reproduce:

      
      The must-gather file provided with this issue corresponds to the following steps:
      
      1. Patch the image.config resource to add a new search registry
      
      $ oc patch image.config cluster --type merge -p '{"spec": {"registrySources": {"containerRuntimeSearchRegistries":["quay.io"]}}}'
      
      2. Wait for the first node to be configured, verify that the drain event was triggered and that the right dropin file was created
      
      3. Patch the image.config resource again to restore the initial config
      
      $ oc patch image.config cluster --type json -p '[{ "op": "add", "path": "/spec", "value": {}}]'
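
The event check in step 2 can be sketched as follows. This is a minimal sketch, not the check used by the automated tests: it assumes the events carry the reasons "Drain" and "Cordon" and reference the Node object, and the helper names `event_selector` and `node_events` are hypothetical.

```shell
# Build a field selector matching events with a given reason for one node.
# Assumption: the events reference the Node object and use the reason
# names "Drain" and "Cordon".
event_selector() {
  node="$1"
  reason="$2"
  printf 'involvedObject.kind=Node,involvedObject.name=%s,reason=%s\n' "$node" "$reason"
}

# List the Drain and Cordon events recorded for one node in the
# "default" namespace. Requires a logged-in `oc` session.
node_events() {
  node="$1"
  for reason in Drain Cordon; do
    oc get events -n default \
      --field-selector "$(event_selector "$node" "$reason")"
  done
}
```

For example, `node_events <node-name>` run right after the node is updated should list one Drain and one Cordon event when the events were emitted as expected.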
      
      
      
      

      Actual results:

      When we patch the image.config resource and the new config is applied, both the master and worker pools are drained, but only the master pool reports the Drain events.
      
      When we patch the image.config resource again to restore the original configuration, both pools report the events properly.
      
      
      

      Expected results:

      
      Every time the MCO drains a node, a Drain event should be emitted for that node in the "default" namespace. The same applies to Cordon.
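
One way to check this expectation for a whole pool is sketched below. It is only an illustration under stated assumptions: the reason names "Drain" and "Cordon", the `node-role.kubernetes.io/<pool>` node label, and the helper names `nodes_missing_event` and `check_pool` are not taken from this report. Events are retained only for a limited time, so the check is meaningful only shortly after the pool has been updated.

```shell
# nodes_missing_event REASON NODES_FILE EVENTS_FILE
# NODES_FILE:  one node name per line.
# EVENTS_FILE: one "<reason> <node>" pair per line.
# Prints every node that has no event with the given reason.
nodes_missing_event() {
  reason="$1"
  nodes_file="$2"
  events_file="$3"
  while IFS= read -r node; do
    [ -n "$node" ] || continue
    # -F: node names contain dots, so match them literally; -x: whole line.
    if ! grep -qxF -- "${reason} ${node}" "$events_file"; then
      printf '%s\n' "$node"
    fi
  done < "$nodes_file"
}

# check_pool POOL (for example: check_pool worker)
# Collects the pool's nodes and the events in the "default" namespace,
# then reports nodes without a Drain or Cordon event.
# Requires a logged-in `oc` session.
check_pool() {
  pool="$1"
  tmp="$(mktemp -d)"
  oc get nodes -l "node-role.kubernetes.io/${pool}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' > "$tmp/nodes"
  oc get events -n default \
    -o jsonpath='{range .items[*]}{.reason}{" "}{.involvedObject.name}{"\n"}{end}' > "$tmp/events"
  for reason in Drain Cordon; do
    nodes_missing_event "$reason" "$tmp/nodes" "$tmp/events" |
      sed "s/^/missing ${reason} event: /"
  done
  rm -rf "$tmp"
}
```

In the scenario described under "Actual results", `check_pool worker` would be expected to list the worker nodes as missing a Drain event, while `check_pool master` would print nothing.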
      
      

      Additional info:

      The must-gather file is linked in the first comment.
      

            team-mco Team MCO
            sregidor@redhat.com Sergio Regidor de la Rosa
            Sergio Regidor de la Rosa Sergio Regidor de la Rosa