CBO gets confused by Terminating pods when a master fails

Details

    • Important
    • 2
    • Metal Platform 224, Metal Platform 225, Metal Platform 226, Metal Platform 227, Metal Platform 228, Metal Platform 229, Metal Platform 230
    • 7
    • Rejected
    • False
    • Before the {product-title} {product-version} release, the Cluster Baremetal Operator (CBO) would incorrectly detect a terminated `Metal3` pod as an active pod. If you migrated this `Metal3` pod to another control plane node on {product-title}, you could not start the pod.

      For the {product-title} {product-version} release, the CBO no longer detects a terminated `Metal3` pod as an active pod.

      (link:https://issues.redhat.com/browse/OCPBUGS-651[*OCPBUGS-651*])
    • Done

    Description

      Description of problem:

      If a master fails and is drained, the old copy of the metal3 pod gets stuck in the Terminating state for some (possibly long) time. While the new pod works correctly, CBO expects only one pod to exist and thus cannot determine the applicable Ironic IP address.

      Version-Release number of selected component (if applicable):

       

      How reproducible:

      always

      Steps to Reproduce:

      1. On dev-scripts: virsh destroy <VM with metal3 pod>
      2. Wait for drain to happen or trigger it manually
      3. Check CBO logs
      

      Actual results:

      "unable to determine Ironic's IP to pass to the machine-image-customization-controller: there should be only one pod listed for the given label"

      Expected results:

      CBO reconfigures its pods with the new Ironic IP

      Additional info:

      I don't know how to filter out pods in Terminating state...
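
      One possible approach, sketched here in Python rather than in CBO's own code: Kubernetes sets metadata.deletionTimestamp on a pod once its deletion has been requested, and that is what kubectl reports as Terminating. Dropping such pods before applying the "only one pod" check would leave just the new metal3 pod. The function names, the sample pod names and addresses, and the use of status.hostIP as the Ironic address are assumptions made for illustration, not taken from the CBO source.

```python
# Illustrative sketch only: pods are plain dicts shaped like the JSON that
# the Kubernetes API returns for a pod list (for example the "items" array).


def active_pods(pods):
    """Return only the pods that are not being deleted.

    A pod whose metadata carries a deletionTimestamp has been asked to
    terminate, which kubectl displays as "Terminating".
    """
    return [
        pod for pod in pods
        if not pod.get("metadata", {}).get("deletionTimestamp")
    ]


def ironic_ip(pods, ip_field="hostIP"):
    """Pick the Ironic IP from the single non-terminating pod.

    ip_field is an assumption: which status field is the right one
    depends on how the metal3 pod is networked.
    """
    candidates = active_pods(pods)
    if len(candidates) != 1:
        raise RuntimeError(
            "expected exactly one active pod, found %d" % len(candidates)
        )
    return candidates[0]["status"][ip_field]


# Hypothetical data: the old pod is stuck terminating on the failed master,
# the new pod is already running on another node.
SAMPLE_PODS = [
    {
        "metadata": {
            "name": "metal3-old",
            "deletionTimestamp": "2022-08-29T10:00:00Z",
        },
        "status": {"hostIP": "192.0.2.10"},
    },
    {
        "metadata": {"name": "metal3-new"},
        "status": {"hostIP": "192.0.2.11"},
    },
]

if __name__ == "__main__":
    print(ironic_ip(SAMPLE_PODS))
```

      With the sample data above, the unfiltered list contains two pods, which corresponds to the error in the actual results, while the filtered list contains only the new pod.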

          People

            Dmitry Tantsur (rhn-engineering-dtantsur)
            Jad Haj Yahya
            Darragh Fitzmaurice
