OpenShift Bugs / OCPBUGS-1880

OpenShift version upgrade causes multiple worker nodes to go into draining state


    • ?
    • Important
    • None
    • Rejected
    • False
    • None
    • The Machine Config Operator (MCO) no longer accepts setting the force file `/run/machine-config-daemon-force` through a `MachineConfig` object. Before the {product-title} {product-version} release, this configuration would cause the Machine Config Daemon (MCD) to enter a reboot loop, which degraded the MCO's performance.

      (link:https://issues.redhat.com/browse/OCPBUGS-1880[*OCPBUGS-1880*])
    • Bug Fix
    • Done

      Description of problem:

      The Machine Config Daemon pods cordon a node but never start the uncordon process.

      The issue is not observed on master or infra nodes.
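
      To see which workers are stuck, it can help to list each node's scheduling status next to its MCO annotations. The following is a minimal sketch, assuming a logged-in `oc` session, the `node-role.kubernetes.io/worker` label, and the `currentConfig`, `desiredConfig`, and `state` annotations under `machineconfiguration.openshift.io/`. The function name is hypothetical and not part of the original report.

```shell
# Hypothetical helper: list worker nodes with their cordon status and MCO annotations.
# Assumes `oc` is logged in to the affected cluster.
show_worker_mco_state() {
  oc get nodes -l node-role.kubernetes.io/worker \
    -o custom-columns='NAME:.metadata.name,UNSCHEDULABLE:.spec.unschedulable,CURRENT:.metadata.annotations.machineconfiguration\.openshift\.io/currentConfig,DESIRED:.metadata.annotations.machineconfiguration\.openshift\.io/desiredConfig,STATE:.metadata.annotations.machineconfiguration\.openshift\.io/state'
}

# Usage: show_worker_mco_state
```

      A node that stays unschedulable while its current and desired configs differ would match the behavior described here.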

       

      We tried keeping "maxUnavailable: 1" on the worker MCP, but the results were the same.
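
      For reference, the setting can be applied with a merge patch in the same style as the workaround command. This is a sketch only, assuming the pool is named `worker`. The helper name is hypothetical.

```shell
# Hypothetical helper: set spec.maxUnavailable on a MachineConfigPool.
# $1 = pool name, $2 = maximum number of nodes updated at once.
set_mcp_max_unavailable() {
  oc patch mcp "$1" --type=merge -p "{\"spec\":{\"maxUnavailable\":$2}}"
}

# Usage: set_mcp_max_unavailable worker 1
```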

       

      We suggested the following workaround:

      • Pause the worker MCP:

      ~~~

       oc patch mcp worker --type=merge -p '{"spec":{"paused":true}}'

      ~~~

      • Perform the version upgrade to 4.7 from the stable release (wait until all operators are upgraded).
      • Check the most recently rendered worker config and manually change the node annotation "machineconfiguration.openshift.io/desiredConfig".

      • Unpause the worker MCP.

      • Re-pause the worker MCP (repeat the process until the final worker node is updated).

        >> In this case, when the worker MCP is unpaused, three workers are marked for drain.
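
      The pause, annotate, and unpause cycle above could be scripted roughly as follows. This is a sketch under assumptions, not the procedure actually run on the affected cluster: it assumes a logged-in `oc` session with sufficient rights, and that the pool's `spec.configuration.name` names the most recently rendered worker config. All function names are hypothetical.

```shell
# Hypothetical helpers for the workaround cycle; none of these names come from the report.

# Pause ($2 = true) or unpause ($2 = false) the MachineConfigPool named $1.
set_mcp_paused() {
  oc patch mcp "$1" --type=merge -p "{\"spec\":{\"paused\":$2}}"
}

# Print the rendered config that pool $1 is targeting.
# Assumption: spec.configuration.name is the most recently rendered config.
latest_rendered_config() {
  oc get mcp "$1" -o jsonpath='{.spec.configuration.name}'
}

# Overwrite the desiredConfig annotation on node $1 with rendered config $2.
set_node_desired_config() {
  oc annotate node "$1" --overwrite \
    "machineconfiguration.openshift.io/desiredConfig=$2"
}

# Usage for one worker (replace <node-name> with a real node):
#   set_mcp_paused worker true
#   rendered=$(latest_rendered_config worker)
#   set_node_desired_config <node-name> "$rendered"
#   set_mcp_paused worker false
#   ...wait for the node to finish updating, then:
#   set_mcp_paused worker true
```

      The sketch deliberately does not wait for a node to finish updating, so the re-pause step still has to be timed by hand.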

       

       

      Version-Release number of selected component (if applicable):

      The issue appeared during the version upgrade.

      How reproducible:

      Every time

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

       

      Expected results:

       

      Additional info:

       

       

       

       

            walters@redhat.com Colin Walters
            rh-ee-dmule Dhananjay Mule
            Rio Liu Rio Liu