OpenShift Bugs / OCPBUGS-14614

4.14 Metal IPv6 Installs are worse than 4.13

Details

    • No
    • 3
    • Metal Platform 238, Metal Platform 239, Metal Platform 240
    • 3
    • Approved
    • False

    Description

      Description of problem:

      
      TRT has identified a likely regression in Metal IPv6 installations: 4.14 installs are statistically worse than 4.13. We are working on a new tool called Component Readiness that does cross-release comparisons to ensure nothing gets worse, and I think it has found something in metal.
      
      At GA, 4.13 metal installs for the IPv6 upgrade-micro jobs passed 100% of the time. In 4.14 they now pass around 89% of the time. All the failures seem to share the same mode: no workers come up, and the serial console shows PXE errors.
      
       !image-2023-06-06-10-13-13-310.png|thumbnail! 
      
      You can view the report here:
      
      https://sippy.dptools.openshift.org/sippy-ng/component_readiness/test_details?arch=amd64&baseEndTime=2023-05-16%2023%3A59%3A59&baseRelease=4.13&baseStartTime=2023-04-18%2000%3A00%3A00&capability=Other&component=Installer%20%2F%20openshift-installer&confidence=95&environment=ovn%20upgrade-micro%20amd64%20metal-ipi%20standard&excludeArches=arm64&excludeClouds=alibaba%2Cibmcloud%2Clibvirt%2Covirt&groupBy=cloud%2Carch%2Cnetwork&ignoreDisruption=true&ignoreMissing=false&minFail=3&network=ovn&pity=5&platform=metal-ipi&sampleEndTime=2023-06-06%2023%3A59%3A59&sampleRelease=4.14&sampleStartTime=2023-05-09%2000%3A00%3A00&testId=cluster%20install%3A0cb1bb27e418491b1ffdacab58c5c8c0&testName=install%20should%20succeed%3A%20overall&upgrade=upgrade-micro&variant=standard
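
To make "statistically worse" concrete, here is a minimal sketch of one way a cross-release pass-rate comparison could be checked, using a one-sided Fisher's exact test. This is an illustration only: the ticket does not say which test Component Readiness applies, and the run counts below are hypothetical (the ticket gives only the rates, 100% and around 89%). The 0.05 threshold is an assumption chosen to mirror the confidence=95 parameter in the report URL.

```python
from math import comb


def fisher_exact_less(sample_pass, sample_fail, base_pass, base_fail):
    """One-sided Fisher's exact test.

    Returns the probability of seeing this few passes (or fewer) in the
    sample release if both releases shared the same true pass rate.
    """
    n_sample = sample_pass + sample_fail
    total_pass = sample_pass + base_pass
    total_fail = sample_fail + base_fail
    total = total_pass + total_fail
    denom = comb(total, n_sample)

    # Smallest number of passes the sample could contain given the margins.
    lowest = max(0, n_sample - total_fail)
    p_value = 0.0
    for k in range(lowest, sample_pass + 1):
        # Hypergeometric probability of exactly k passes in the sample.
        p_value += comb(total_pass, k) * comb(total_fail, n_sample - k) / denom
    return p_value


# HYPOTHETICAL run counts chosen only to match the quoted rates.
BASE_413 = (100, 0)    # (passes, failures): 100% pass rate
SAMPLE_414 = (89, 11)  # (passes, failures): 89% pass rate

p = fisher_exact_less(*SAMPLE_414, *BASE_413)
print(f"p-value: {p:.6f}, regression at 95% confidence: {p < 0.05}")
```

With these made-up counts the p-value comes out well below 0.05, so the drop would be flagged. With far fewer runs the same rates might not be, which is why the real counts in the report matter.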
      
      The serial console on the workers shows PXE errors:
      
      >>Start PXE over IPv4.
        PXE-E18: Server response timeout.
      BdsDxe: failed to load Boot0001 "UEFI PXEv4 (MAC:00962801D023)" from PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/MAC(00962801D023,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0): Not Found
      
      >>Start PXE over IPv6..
        Station IP address is FD00:1101:0:0:2EE1:8456:96FB:68B1
        Server IP address is FD00:1101:0:0:0:0:0:3
        NBP filename is snponly.efi
        NBP filesize is 0 Bytes
        PXE-E18: Server response timeout.
      BdsDxe: failed to load Boot0002 "UEFI PXEv6 (MAC:00962801D023)" from PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/MAC(00962801D023,0x1)/IPv6(0000:0000:0000:0000:0000:0000:0000:0000,0x0,Static,0000:0000:0000:0000:0000:0000:0000:0000,0x40,0000:0000:0000:0000:0000:0000:0000:0000): Not Found
      
      >>Start HTTP Boot over IPv4.
        Error: Could not retrieve NBP file size from HTTP server.
      
        Error: Server response timeout.
      BdsDxe: failed to load Boot0003 "UEFI HTTPv4 (MAC:00962801D023)" from PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/MAC(00962801D023,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)/Uri(): Not Found
      
      >>Start HTTP Boot over IPv6..
        Error: Could not retrieve NBP file size from HTTP server.
      
        Error: Remote boot cancelled.
      BdsDxe: failed to load Boot0004 "UEFI HTTPv6 (MAC:00962801D023)" from PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/MAC(00962801D023,0x1)/IPv6(0000:0000:0000:0000:0000:0000:0000:0000,0x0,Static,0000:0000:0000:0000:0000:0000:0000:0000,0x40,0000:0000:0000:0000:0000:0000:0000:0000)/Uri(): Not Found
      BdsDxe: No bootable option or device was found.
      BdsDxe: Press any key to enter the Boot Manager Menu.
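
When triaging many failed runs, it can help to summarize each serial console log mechanically. The following is a rough sketch, assuming the console text has already been saved as a string; the patterns are taken from the excerpt above, and the function name `summarize_console` and the returned dictionary layout are hypothetical, not part of any existing CI tooling.

```python
import re

# Patterns based on the firmware messages in the console excerpt.
BOOT_START = re.compile(r">>Start (PXE|HTTP Boot) over (IPv4|IPv6)")
BOOT_FAIL = re.compile(r'BdsDxe: failed to load (Boot\d{4}) "([^"]+)"')
NO_BOOT = "BdsDxe: No bootable option or device was found."


def summarize_console(text):
    """Group firmware errors by boot attempt (method + address family)."""
    attempts = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        start = BOOT_START.search(line)
        if start:
            current = {"method": start.group(1), "family": start.group(2),
                       "errors": [], "entry": None}
            attempts.append(current)
            continue
        if current is None:
            continue
        if "PXE-E18" in line or line.startswith("Error:"):
            current["errors"].append(line)
        failed = BOOT_FAIL.search(line)
        if failed:
            current["entry"] = failed.group(1)
    return {"attempts": attempts, "no_boot_device": NO_BOOT in text}


# Shortened sample: the device paths after "from" are trimmed for brevity.
SAMPLE = """\
>>Start PXE over IPv4.
  PXE-E18: Server response timeout.
BdsDxe: failed to load Boot0001 "UEFI PXEv4 (MAC:00962801D023)" from PciRoot(0x0): Not Found

>>Start PXE over IPv6..
  NBP filename is snponly.efi
  NBP filesize is 0 Bytes
  PXE-E18: Server response timeout.
BdsDxe: failed to load Boot0002 "UEFI PXEv6 (MAC:00962801D023)" from PciRoot(0x0): Not Found
BdsDxe: No bootable option or device was found.
"""

summary = summarize_console(SAMPLE)
for attempt in summary["attempts"]:
    print(attempt["entry"], attempt["method"], attempt["family"], attempt["errors"])
print("no bootable device:", summary["no_boot_device"])
```

Note that in the IPv6 PXE attempt the firmware did obtain a station address, a server address and an NBP filename before timing out, so a summary like this makes it easier to check whether every failed run stalls at the same step.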
      
      
      
      

      Version-Release number of selected component (if applicable):

      
      4.14
      
      

      How reproducible:

      10%
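
Because the failure is intermittent, a single passing run says little. As a rough guide, and assuming each install fails independently at the stated 10% rate (an assumption, since CI runs may share infrastructure), the number of runs needed to see at least one failure with a given probability can be sketched as:

```python
import math


def runs_for_confidence(failure_rate, confidence=0.95):
    """Smallest n such that 1 - (1 - failure_rate)**n >= confidence."""
    if not 0 < failure_rate < 1:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


runs = runs_for_confidence(0.10)
print(f"runs needed for a 95% chance of one or more failures: {runs}")
```

The same function can be used in reverse when validating a fix: a run of that many consecutive passes is about as unlikely under the old failure rate as the chosen confidence implies.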
      

      Steps to Reproduce:

      1. 
      2.
      3.
      

      Actual results:

      
      

      Expected results:

      
      

      Additional info:

      
      Example failures:
      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.14-e2e-metal-ipi-upgrade-ovn-ipv6/1665428719952465920
      
      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.14-e2e-metal-ipi-upgrade-ovn-ipv6/1664711616538611712
      
      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.14-e2e-metal-ipi-upgrade-ovn-ipv6/1664645418744549376
      
      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.14-e2e-metal-ipi-upgrade-ovn-ipv6/1663915360878858240
      
      
      
      
      

            People

              dhiggins@redhat.com Derek Higgins
              stbenjam Stephen Benjamin
              Pedro Jose Amoedo Martinez