OpenShift Bugs / OCPBUGS-27060

[OCP 4.15] iPXE connection timed out


    • No
    • 1
    • Metal Platform 248
    • 1
    • False
      * Previously, the `inspector.ipxe` configuration used the `IRONIC_IP` variable, which does not enclose IPv6 addresses in the square brackets that URLs and host headers require. Consequently, when the user supplied an incorrect `boot_mac_address`, iPXE fell back to the `inspector.ipxe` configuration, which supplied a malformed IPv6 host header because the address was not enclosed in brackets.
      +
      With {product-title} {product-version}, the `inspector.ipxe` configuration has been updated to use the `IRONIC_URL_HOST` variable, which accounts for IPv6 addresses and resolves the issue. (link:https://issues.redhat.com/browse/OCPBUGS-27060[*OCPBUGS-27060*])
    • Bug Fix
    • In Progress

      This is a clone of issue OCPBUGS-22699. The following is the description of the original issue:

      Description of problem:

      A new BM IPI deployment that uses a provisioning network with IPv6 shows the following error:
      
      http://XXXX:XXXX:XXXX:XXXX::X:6180/images/ironic-python-agent.kernel....
      connection timed out (http://ipxe.org/4c0a6092)

      Version-Release number of selected component (if applicable):

      OpenShift 4.12.32
      Also seen in OpenShift 4.14.0-rc.5 when adding new nodes

      How reproducible:

      Very frequent

      Steps to Reproduce:

      1. Deploy a cluster on BM with the provided config

      Actual results:

      Consistent failures, depending on the version of OCP used to deploy

      Expected results:

      No error, successful deployment

      Additional info:

      Things checked while the bootstrap host is active and the installation information is still valid (and failing):
      - tried downloading the "ironic-python-agent.kernel" file from different places (bootstrap, bastion hosts, another provisioned host) and in all cases it worked:
      [core@control-1-ru2 ~]$ curl -6 -v -o ironic-python-agent.kernel http://[XXXX:XXXX:XXXX:XXXX::X]:80/images/ironic-python-agent.kernel
      *   Trying XXXX:XXXX:XXXX:XXXX::X...
      * TCP_NODELAY set
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
      0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* Connected to XXXX:XXXX:XXXX:XXXX::X (xxxx:xxxx:xxxx:xxxx::x) port 80 (#0)
      > GET /images/ironic-python-agent.kernel HTTP/1.1
      > Host: [xxxx:xxxx:xxxx:xxxx::x]
      > User-Agent: curl/7.61.1
      > Accept: */*
      >
      < HTTP/1.1 200 OK
      < Date: Fri, 27 Oct 2023 08:28:09 GMT
      < Server: Apache
      < Last-Modified: Thu, 26 Oct 2023 08:42:16 GMT
      < ETag: "a29d70-6089a8c91c494"
      < Accept-Ranges: bytes
      < Content-Length: 10657136
      <
      { [14084 bytes data]
      100 10.1M  100 10.1M    0     0   597M      0 --:--:-- --:--:-- --:--:--  597M
      * Connection #0 to host xxxx:xxxx:xxxx:xxxx::x left intact
      
      This verifies some of the components like the network setup and the httpd service running on ironic pods.
      
      - Also gathered a listing of the contents of the ironic pod running in podman, especially the shared directory. The contents of /shared/html/inspector.ipxe seem correct compared to a working installation, and all files appear to be in place.
      
      - Logs from the ironic container show the errors coming from the node being deployed. The curl request is also shown here for comparison:
      
      xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx - - [27/Oct/2023:08:19:55 +0000] "GET /images/ironic-python-agent.kernel HTTP/1.1" 400 226 "-" "iPXE/1.0.0+ (4bd064de)"
      xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx - - [27/Oct/2023:08:19:55 +0000] "GET /images/ironic-python-agent.kernel HTTP/1.1" 400 226 "-" "iPXE/1.0.0+ (4bd064de)"
      xxxx:xxxx:xxxx:xxxx::x - - [27/Oct/2023:08:20:23 +0000] "GET /images/ironic-python-agent.kernel HTTP/1.1" 200 10657136 "-" "curl/7.61.1"
      xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx - - [27/Oct/2023:08:20:23 +0000] "GET /images/ironic-python-agent.kernel HTTP/1.1" 400 226 "-" "iPXE/1.0.0+ (4bd064de)"
      xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx - - [27/Oct/2023:08:20:23 +0000] "GET /images/ironic-python-agent.kernel HTTP/1.1" 400 226 "-" "iPXE/1.0.0+ (4bd064de)"
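
      The comparison in the access log above can be made mechanical. The following is a minimal sketch, assuming the log uses the Apache combined format seen in the excerpt; the function name `tally_status_by_agent` and the `SAMPLE` list are illustrative and not part of any OpenShift or Ironic tooling.

```python
import re
from collections import Counter

# Apache "combined" log format, as seen in the excerpt above:
# client ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def tally_status_by_agent(lines):
    """Count (client name, HTTP status) pairs; unparsable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue
        # Keep only the product name, e.g. "iPXE" from "iPXE/1.0.0+ (4bd064de)".
        agent = match.group("agent").split("/", 1)[0]
        counts[(agent, int(match.group("status")))] += 1
    return counts


# Three redacted lines copied from the report (hypothetical sample input).
SAMPLE = [
    'xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx - - [27/Oct/2023:08:19:55 +0000] '
    '"GET /images/ironic-python-agent.kernel HTTP/1.1" 400 226 "-" "iPXE/1.0.0+ (4bd064de)"',
    'xxxx:xxxx:xxxx:xxxx::x - - [27/Oct/2023:08:20:23 +0000] '
    '"GET /images/ironic-python-agent.kernel HTTP/1.1" 200 10657136 "-" "curl/7.61.1"',
    'xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx - - [27/Oct/2023:08:20:23 +0000] '
    '"GET /images/ironic-python-agent.kernel HTTP/1.1" 400 226 "-" "iPXE/1.0.0+ (4bd064de)"',
]

if __name__ == "__main__":
    for (agent, status), count in sorted(tally_status_by_agent(SAMPLE).items()):
        print(f"{agent} {status}: {count}")
```

      On the sample lines, every iPXE request is rejected with 400 while the curl request for the same path returns 200, which points at the request that iPXE sends rather than at the file or the httpd service.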
      
      This seems like an issue with iPXE and IPv6.
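
      The release note attributes the failure to an IPv6 address used without square brackets. The following is a minimal sketch of that distinction, assuming only the Python standard library; `url_host` is a hypothetical helper and not the code that sets `IRONIC_URL_HOST` in the Ironic image, and `2001:db8::1` is a documentation address standing in for the redacted one. Port 6180 is taken from the error message in the report.

```python
import ipaddress


def url_host(address: str) -> str:
    """Return the address in the form a URL or Host header expects.

    IPv6 literals are wrapped in square brackets; IPv4 addresses and
    hostnames are returned unchanged.
    """
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Not an IP literal, so treat it as a hostname.
        return address
    return f"[{ip}]" if ip.version == 6 else str(ip)


def image_url(address: str, port: int = 6180) -> str:
    """Build the kernel image URL; the port default is an assumption from the report."""
    return f"http://{url_host(address)}:{port}/images/ironic-python-agent.kernel"


if __name__ == "__main__":
    # Without brackets, the port cannot be told apart from the address groups.
    print("http://2001:db8::1:6180/images/ironic-python-agent.kernel")
    # With brackets, the host and the port are unambiguous.
    print(image_url("2001:db8::1"))
```

      The first printed URL has the same shape as the one in the iPXE error message, where the address and the port run together.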

       

       

            hroy@redhat.com Himanshu Roy
            openshift-crt-jira-prow OpenShift Prow Bot
            Steeve Goveas Steeve Goveas