OpenShift Bugs / OCPBUGS-55364

Race condition when handling hostname in ironic-agent


    • Quality / Stability / Reliability
    • Important
    • Rejected
    • Metal Platform 270
    • Done
    • Bug Fix
    • Fixes a race condition during provisioning that, when the DHCP response is slow, could cause different hostnames to be used for the Machine and Node objects, preventing the CSRs of worker Nodes from being automatically approved.

      This is a clone of issue OCPBUGS-55315. The following is the description of the original issue:

      It seems that the hostname handling code in our downstream ironic-agent plugin is racy. The sequence of events as recovered from a CI job run:

      1. IPA starts and polls all hardware managers.
      2. Our hardware manager detects the hostname, fixes it if needed, and records it in a variable.
      3. IPA proceeds with its regular business, including inspection.
      4. During inspection, a different hostname is returned from socket.gethostname, and it gets reported back to Ironic and thus BMO.
      5. During OS installation, our plugin configures Ignition with the earlier hostname, not the one reported during inspection.
      6. As a result, the Machine resource ends up with the later (correct) hostname, while the Node boots with the earlier one. The two do not match, so the Node does not get approved.
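
      The sequence above can be reproduced in miniature. The following is a minimal sketch, not the plugin's actual code: the class names, method names, and hostnames are all hypothetical, and a fake host object stands in for the real system so that the late DHCP update can be simulated deterministically. It only illustrates how caching the hostname at startup while re-reading it during inspection can yield two different values.

```python
from typing import Callable, Tuple


class FakeHost:
    """Hypothetical stand-in for the system hostname (normally socket.gethostname)."""

    def __init__(self, name: str) -> None:
        self.name = name

    def gethostname(self) -> str:
        return self.name


class CachingHostnamePlugin:
    """Hypothetical plugin that records the hostname once, at startup."""

    def __init__(self, get_hostname: Callable[[], str]) -> None:
        self._get_hostname = get_hostname
        # Step 2: the hostname is detected and stored in a variable.
        self.cached_hostname = get_hostname()

    def inspection_hostname(self) -> str:
        # Step 4: inspection asks the system again, so it sees the current value.
        return self._get_hostname()

    def ignition_hostname(self) -> str:
        # Step 5: installation reuses the value recorded at startup.
        return self.cached_hostname


def simulate_slow_dhcp() -> Tuple[str, str]:
    """Return (hostname seen by inspection, hostname written to Ignition)."""
    host = FakeHost("localhost.localdomain")      # before the DHCP response
    plugin = CachingHostnamePlugin(host.gethostname)
    host.name = "worker-0.example.com"            # DHCP response arrives late
    return plugin.inspection_hostname(), plugin.ignition_hostname()


if __name__ == "__main__":
    machine_name, node_name = simulate_slow_dhcp()
    print("Machine (from inspection):", machine_name)
    print("Node (from Ignition):     ", node_name)
    print("Match:", machine_name == node_name)
```

      In this sketch the two values differ only because the hostname changes between the startup read and the inspection read; if the DHCP response arrives before the plugin starts, both reads agree and the mismatch does not appear.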

              rhn-engineering-dtantsur Dmitry Tantsur
              openshift-crt-jira-prow OpenShift Prow Bot
              Jad Haj Yahya Jad Haj Yahya