Description of problem:
A new BM IPI deployment using an IPv6 provisioning network fails with the error: "http://XXXX:XXXX:XXXX:XXXX::X:6180/images/ironic-python-agent.kernel.... connection timed out (http://ipxe.org/4c0a6092)"
Version-Release number of selected component (if applicable):
OpenShift 4.12.32. Also seen in OpenShift 4.14.0-rc.5 when adding new nodes.
How reproducible:
Very frequent
Steps to Reproduce:
1. Deploy a cluster using BM with the provided config
Actual results:
Consistent failures, depending on the version of OCP used to deploy
Expected results:
No error, successful deployment
Additional info:
Things checked while the bootstrap host is active and the installation information is still valid (and failing):

- Tried downloading the "ironic-python-agent.kernel" file from different places (bootstrap, bastion hosts, another provisioned host); it worked in all cases:

[core@control-1-ru2 ~]$ curl -6 -v -o ironic-python-agent.kernel http://[XXXX:XXXX:XXXX:XXXX::X]:80/images/ironic-python-agent.kernel
* Trying XXXX:XXXX:XXXX:XXXX::X...
* TCP_NODELAY set
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connected to XXXX:XXXX:XXXX:XXXX::X (xxxx:xxxx:xxxx:xxxx::x) port 80 #0)
> GET /images/ironic-python-agent.kernel HTTP/1.1
> Host: [xxxx:xxxx:xxxx:xxxx::x]
> User-Agent: curl/7.61.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Fri, 27 Oct 2023 08:28:09 GMT
< Server: Apache
< Last-Modified: Thu, 26 Oct 2023 08:42:16 GMT
< ETag: "a29d70-6089a8c91c494"
< Accept-Ranges: bytes
< Content-Length: 10657136
<
{ [14084 bytes data]
100 10.1M  100 10.1M    0     0   597M      0 --:--:-- --:--:-- --:--:--  597M
* Connection #0 to host xxxx:xxxx:xxxx:xxxx::x left intact

This verifies some of the components, such as the network setup and the httpd service running on the ironic pods.

- Also gathered a listing of the contents of the ironic pod running in podman, especially the shared directory. The contents of /shared/html/inspector.ipxe seem correct compared to a working installation, and all files appear to be in place.
- Logs from the ironic container show the errors coming from the node being deployed; the curl request is included for comparison:

xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx - - [27/Oct/2023:08:19:55 +0000] "GET /images/ironic-python-agent.kernel HTTP/1.1" 400 226 "-" "iPXE/1.0.0+ (4bd064de)"
xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx - - [27/Oct/2023:08:19:55 +0000] "GET /images/ironic-python-agent.kernel HTTP/1.1" 400 226 "-" "iPXE/1.0.0+ (4bd064de)"
xxxx:xxxx:xxxx:xxxx::x - - [27/Oct/2023:08:20:23 +0000] "GET /images/ironic-python-agent.kernel HTTP/1.1" 200 10657136 "-" "curl/7.61.1"
cxxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx - - [27/Oct/2023:08:20:23 +0000] "GET /images/ironic-python-agent.kernel HTTP/1.1" 400 226 "-" "iPXE/1.0.0+ (4bd064de)"
xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx - - [27/Oct/2023:08:20:23 +0000] "GET /images/ironic-python-agent.kernel HTTP/1.1" 400 226 "-" "iPXE/1.0.0+ (4bd064de)"

This seems to be an issue with iPXE and IPv6.
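For anyone repeating the access-log comparison on a larger capture, a small script can tally the response status per client type. This is a minimal sketch in Python, assuming the entries follow the Apache "combined" layout seen in the lines above. The names `LOG_LINE`, `tally_status_by_agent` and `SAMPLE` are hypothetical helpers made up for illustration and are not part of ironic or OpenShift.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" access-log lines, as in the report above.
LOG_LINE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def tally_status_by_agent(lines):
    """Count (user agent, HTTP status) pairs; lines that do not parse are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match["agent"], int(match["status"]))] += 1
    return counts


# Two of the (anonymized) entries quoted in this report.
SAMPLE = [
    'xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx - - [27/Oct/2023:08:19:55 +0000] '
    '"GET /images/ironic-python-agent.kernel HTTP/1.1" 400 226 "-" "iPXE/1.0.0+ (4bd064de)"',
    'xxxx:xxxx:xxxx:xxxx::x - - [27/Oct/2023:08:20:23 +0000] '
    '"GET /images/ironic-python-agent.kernel HTTP/1.1" 200 10657136 "-" "curl/7.61.1"',
]

if __name__ == "__main__":
    for (agent, status), n in sorted(tally_status_by_agent(SAMPLE).items()):
        print(f"{agent}: HTTP {status} x{n}")
```

Grouping by user agent makes it easy to see whether every iPXE request gets a 400 while other clients get a 200 for the same path, which is the pattern in the excerpt above.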
- blocks: OCPBUGS-27060 [OCP 4.15] IPXE connection timed out (Closed)
- is cloned by: OCPBUGS-27060 [OCP 4.15] IPXE connection timed out (Closed)
- links to: RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update