-
Bug
-
Resolution: Done-Errata
-
Major
-
4.14
-
No
-
3
-
Metal Platform 238, Metal Platform 239, Metal Platform 240
-
3
-
Approved
-
False
-
Description of problem:
TRT has identified a likely regression in Metal IPv6 installations: 4.14 installs are statistically worse than 4.13. We are working on a new tool called Component Readiness that does cross-release comparisons to ensure nothing gets worse, and I think it has found something in metal. At GA, 4.13 metal installs for IPv6 upgrade micro jobs passed 100% of the time. They are now around 89% in 4.14. All the failures seem to share the same failure mode: no workers come up, and the serial console shows PXE errors.

!image-2023-06-06-10-13-13-310.png|thumbnail!

You can view the report here: https://sippy.dptools.openshift.org/sippy-ng/component_readiness/test_details?arch=amd64&baseEndTime=2023-05-16%2023%3A59%3A59&baseRelease=4.13&baseStartTime=2023-04-18%2000%3A00%3A00&capability=Other&component=Installer%20%2F%20openshift-installer&confidence=95&environment=ovn%20upgrade-micro%20amd64%20metal-ipi%20standard&excludeArches=arm64&excludeClouds=alibaba%2Cibmcloud%2Clibvirt%2Covirt&groupBy=cloud%2Carch%2Cnetwork&ignoreDisruption=true&ignoreMissing=false&minFail=3&network=ovn&pity=5&platform=metal-ipi&sampleEndTime=2023-06-06%2023%3A59%3A59&sampleRelease=4.14&sampleStartTime=2023-05-09%2000%3A00%3A00&testId=cluster%20install%3A0cb1bb27e418491b1ffdacab58c5c8c0&testName=install%20should%20succeed%3A%20overall&upgrade=upgrade-micro&variant=standard

The serial console on the workers shows PXE errors:

>>Start PXE over IPv4.
PXE-E18: Server response timeout.
BdsDxe: failed to load Boot0001 "UEFI PXEv4 (MAC:00962801D023)" from PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/MAC(00962801D023,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0): Not Found
>>Start PXE over IPv6..
Station IP address is FD00:1101:0:0:2EE1:8456:96FB:68B1
Server IP address is FD00:1101:0:0:0:0:0:3
NBP filename is snponly.efi
NBP filesize is 0 Bytes
PXE-E18: Server response timeout.
BdsDxe: failed to load Boot0002 "UEFI PXEv6 (MAC:00962801D023)" from PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/MAC(00962801D023,0x1)/IPv6(0000:0000:0000:0000:0000:0000:0000:0000,0x0,Static,0000:0000:0000:0000:0000:0000:0000:0000,0x40,0000:0000:0000:0000:0000:0000:0000:0000): Not Found
>>Start HTTP Boot over IPv4.
Error: Could not retrieve NBP file size from HTTP server.
Error: Server response timeout.
BdsDxe: failed to load Boot0003 "UEFI HTTPv4 (MAC:00962801D023)" from PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/MAC(00962801D023,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)/Uri(): Not Found
>>Start HTTP Boot over IPv6..
Error: Could not retrieve NBP file size from HTTP server.
Error: Remote boot cancelled.
BdsDxe: failed to load Boot0004 "UEFI HTTPv6 (MAC:00962801D023)" from PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/MAC(00962801D023,0x1)/IPv6(0000:0000:0000:0000:0000:0000:0000:0000,0x0,Static,0000:0000:0000:0000:0000:0000:0000:0000,0x40,0000:0000:0000:0000:0000:0000:0000:0000)/Uri(): Not Found
BdsDxe: No bootable option or device was found.
BdsDxe: Press any key to enter the Boot Manager Menu.
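For triaging further runs, the console signature above could be detected mechanically. The following is only a minimal sketch: the function name `summarize_console` and the returned keys are hypothetical, and the patterns are taken solely from the console text quoted in this report, so other firmware may word these messages differently.

```python
import re

# Patterns copied from the console output quoted in this report.
BOOT_FAILURE = re.compile(r'BdsDxe: failed to load (Boot\d{4}) "([^"]+)"')
PXE_TIMEOUT = "PXE-E18: Server response timeout."
NO_BOOT_DEVICE = "BdsDxe: No bootable option or device was found."


def summarize_console(text: str) -> dict:
    """Summarize network-boot failures in a serial console capture.

    Hypothetical helper: it only counts the messages seen in this bug.
    """
    failed = [(m.group(1), m.group(2)) for m in BOOT_FAILURE.finditer(text)]
    return {
        "failed_boot_entries": failed,          # (entry, description) pairs
        "pxe_timeouts": text.count(PXE_TIMEOUT),
        "no_boot_device": NO_BOOT_DEVICE in text,
    }


if __name__ == "__main__":
    # Abbreviated excerpt of the console output from this report.
    excerpt = (
        ">>Start PXE over IPv4. PXE-E18: Server response timeout. "
        'BdsDxe: failed to load Boot0001 "UEFI PXEv4 (MAC:00962801D023)" '
        "BdsDxe: No bootable option or device was found."
    )
    print(summarize_console(excerpt))
```

A run matching this failure mode would be expected to report all four boot entries as failed and `no_boot_device` as true.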
Version-Release number of selected component (if applicable):
4.14
How reproducible:
10%
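The roughly 10% failure rate matches the pass-rate drop described above (100% at 4.13 GA versus about 89% in 4.14). As a rough illustration of what "statistically worse" can mean, the sketch below runs a one-sided Fisher's exact test on pass/fail counts. Whether Component Readiness uses exactly this test is an assumption here, and the run counts (100 per release) are hypothetical, chosen only to mirror the quoted percentages. The 0.05 threshold corresponds to the `confidence=95` parameter visible in the report URL.

```python
from math import comb


def fisher_one_sided(base_pass: int, base_fail: int,
                     sample_pass: int, sample_fail: int) -> float:
    """One-sided Fisher's exact test on a 2x2 pass/fail table.

    Returns the probability of seeing at least `sample_fail` failures in
    the sample release if both releases shared the same failure rate.
    """
    base_n = base_pass + base_fail
    sample_n = sample_pass + sample_fail
    total_fail = base_fail + sample_fail
    denom = comb(base_n + sample_n, total_fail)
    p = 0.0
    # Sum hypergeometric probabilities for k >= observed sample failures.
    for k in range(sample_fail, min(sample_n, total_fail) + 1):
        p += comb(sample_n, k) * comb(base_n, total_fail - k) / denom
    return p


if __name__ == "__main__":
    # Hypothetical counts: 100 runs per release, not taken from the report.
    p = fisher_one_sided(base_pass=100, base_fail=0,
                         sample_pass=89, sample_fail=11)
    print(f"p-value: {p:.6f}, regression at 95% confidence: {p < 0.05}")
```

With smaller sample sizes the same percentages can fail to reach significance, which is why the run counts matter as much as the pass rates.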
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
Additional info:
Example failures:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.14-e2e-metal-ipi-upgrade-ovn-ipv6/1665428719952465920
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.14-e2e-metal-ipi-upgrade-ovn-ipv6/1664711616538611712
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.14-e2e-metal-ipi-upgrade-ovn-ipv6/1664645418744549376
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.14-e2e-metal-ipi-upgrade-ovn-ipv6/1663915360878858240
- is cloned by
-
OCPBUGS-17040 4.14 Metal IPv6 Installs are worse than 4.13
- Closed
- links to
-
RHSA-2023:5006 OpenShift Container Platform 4.14.z security update