OCPBUGS-33134

Nodes are drained twice when an OCB image is applied


    • Severity: Moderate
    • Sprint: MCO Sprint 256, MCO Sprint 257
    • Release Note Text:
      * Previously, nodes could be drained twice because the node was queued multiple times in the drain controller. This behavior might have been caused by increased activity on the node object from the on-cluster layering functionality. With this fix, a node is queued for drain only once. (link:https://issues.redhat.com/browse/OCPBUGS-33134[*OCPBUGS-33134*])
    • Release Note Type: Bug Fix
    • Release Note Status: Done

      Description of problem:

      When on-cluster builds (OCB) are enabled and a new MachineConfig (MC) is created, nodes are drained twice when the resulting osImage build is applied.
      
          

      Version-Release number of selected component (if applicable):

      4.16
          

      How reproducible:

      Always
          

      Steps to Reproduce:

          1. Enable OCB in the worker pool
      
      oc create -f - << EOF
      apiVersion: machineconfiguration.openshift.io/v1alpha1
      kind: MachineOSConfig
      metadata:
        name: worker
      spec:
        machineConfigPool:
          name: worker
        buildInputs:
          imageBuilder:
            imageBuilderType: PodImageBuilder
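          # The inline $(...) below copies the cluster pull secret into the openshift-machine-config-operator namespace as "pull-copy" and references it by that name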
          baseImagePullSecret:
            name: $(oc get secret -n openshift-config pull-secret -o json | jq "del(.metadata.namespace, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.uid, .metadata.name)" | jq '.metadata.name="pull-copy"' | oc -n openshift-machine-config-operator create -f - &> /dev/null; echo -n "pull-copy")
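          # Reuses the builder service account's first secret (its dockercfg secret) as the push credential for the internal registry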
          renderedImagePushSecret:
            name: $(oc get -n openshift-machine-config-operator sa builder -ojsonpath='{.secrets[0].name}')
          renderedImagePushspec: "image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image:latest"
      EOF
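      
      A possible sanity check that the MachineOSConfig and the copied pull secret were created (commands assume the names used above):
      
      oc get machineosconfig worker
      oc -n openshift-machine-config-operator get secret pull-copy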
      
      
      
          2. Wait for the image to be built
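      
      One way to follow the build (a sketch; the MachineOSBuild resource belongs to the same v1alpha1 tech-preview API used above, and the builder pod runs in the MCO namespace):
      
      oc get machineosbuild -w
      oc -n openshift-machine-config-operator get pods -w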
      
          3. Once the opt-in image has finished building and been applied, create a new MC
      
      oc create -f - << EOF
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        labels:
          machineconfiguration.openshift.io/role: worker
        name: test-machine-config-1
      spec:
        config:
          ignition:
            version: 3.1.0
          storage:
            files:
            - contents:
                source: data:text/plain;charset=utf-8;base64,dGVzdA==
              filesystem: root
              mode: 420
              path: /etc/test-file-1.test
      EOF
      
          4. Wait for the image to be built
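      
      Once this second build completes, the pool rolls the new image onto the worker nodes; the drains happen during this rollout. A simple way to follow it (sketch):
      
      oc get mcp worker -w
      oc get nodes -w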
          

      Actual results:

      Once the image is built, it is applied to the worker nodes.
      
      If we look at the drain operations, we can see that every worker node was drained twice instead of once:
      
      oc -n openshift-machine-config-operator logs $(oc -n openshift-machine-config-operator get pods -l k8s-app=machine-config-controller -o jsonpath='{.items[0].metadata.name}') -c machine-config-controller | grep "initiating drain"
      I0430 13:28:48.740300       1 drain_controller.go:182] node ip-10-0-70-208.us-east-2.compute.internal: initiating drain
      I0430 13:30:08.330051       1 drain_controller.go:182] node ip-10-0-70-208.us-east-2.compute.internal: initiating drain
      I0430 13:32:32.431789       1 drain_controller.go:182] node ip-10-0-69-154.us-east-2.compute.internal: initiating drain
      I0430 13:33:50.643544       1 drain_controller.go:182] node ip-10-0-69-154.us-east-2.compute.internal: initiating drain
      I0430 13:48:08.183488       1 drain_controller.go:182] node ip-10-0-70-208.us-east-2.compute.internal: initiating drain
      I0430 13:49:01.379416       1 drain_controller.go:182] node ip-10-0-70-208.us-east-2.compute.internal: initiating drain
      I0430 13:50:52.933337       1 drain_controller.go:182] node ip-10-0-69-154.us-east-2.compute.internal: initiating drain
      I0430 13:52:12.191203       1 drain_controller.go:182] node ip-10-0-69-154.us-east-2.compute.internal: initiating drain
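      
      A quick way to tally the drains per node from the same controller log (a sketch; field 6 of these klog lines is the node name):
      
      oc -n openshift-machine-config-operator logs $(oc -n openshift-machine-config-operator get pods -l k8s-app=machine-config-controller -o jsonpath='{.items[0].metadata.name}') -c machine-config-controller | grep "initiating drain" | awk '{print $6}' | sort | uniq -c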
      
      
          

      Expected results:

      Nodes should be drained only once when applying a new MC.
          

      Additional info:

          

            Assignee: David Joshy
            Reporter: Sergio Regidor de la Rosa