Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-78053

DNS dual-stack service A/AAAA query test permafailing on metal dualstack since Mar 6 payload

    • None
    • False
    • Hide

      None

      Show
      None
    • None
    • None
    • None
    • None
    • Approved
    • NI&D Sprint 285
    • 1
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      Description of problem

      The DNS dual-stack service test is permafailing on all metal/baremetal dualstack jobs (both default and techpreview FeatureSets) since the 4.22.0-0.nightly-2026-03-06-163638 payload. The test verifies that DNS answers both A (IPv4) and AAAA (IPv6) queries for a dual-stack service. It times out waiting for the expected DNS responses.

      Test details report

      The test was 100% passing before Mar 6 and has failed on every single dualstack metal job run since (22+ consecutive failures). No other platform or network stack is affected — aws, gcp, azure, and single-stack metal jobs all pass at 100%.

      This regression was detected by Component Readiness and is a release blocker.

      Test Name(s)

      [sig-network-edge] DNS should answer A and AAAA queries for a dual-stack service [apigroup:config.openshift.io] [Suite:openshift/conformance/parallel]
      

      Test ID: openshift-tests:c8177f2c070c9d16fff1ffb988a79d98
      Regression IDs: 36226, 36215
      Regression opened: 2026-03-07

      Version-Release number

      4.22

      How reproducible

      Always (100% failure rate on dualstack metal jobs since Mar 6)

      Steps to Reproduce

      1. Observe any dualstack metal nightly job:
        • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-dualstack
        • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-dualstack-techpreview
      2. Check test results for the DNS dual-stack service test
      3. The test fails every run with a timeout

      Actual results

      Test fails with:

      fail [github.com/openshift/origin/test/extended/dns/dns.go:251]: Failed: timed out waiting for the condition
      

      All observed failures (22+) show the exact same error (100% consistency). The test is an isolated failure in each job run — not part of broader co-failures.

      Expected results

      DNS should respond with both A and AAAA records for dual-stack services on metal clusters. The test should pass as it did prior to Mar 6.

      Affected Variants

      • Platform: metal
      • Architecture: amd64
      • Network: ovn
      • NetworkStack: dual
      • Topology: ha
      • FeatureSet: default (regression 36215) and techpreview (regression 36226)
      • Upgrade: none
      • Not affected: aws, gcp, azure, single-stack (ipv4/ipv6) metal jobs

      Failure Pattern Analysis

      • Pattern: Permafail — 22+ consecutive failures since Mar 6, 0 passes
      • Previous pass rate: 100% (30+ consecutive passes before Mar 6)
      • Isolation: Isolated test failure — no consistent co-failing tests
      • Consistency: 100% of failures show identical error message
      • Global pass rate: 95.4% (failures entirely from dualstack metal variants)

      Suspect PRs in Payload

      First failure payload: 4.22.0-0.nightly-2026-03-06-163638

      Additional info

      • Last passing run: 2029726217795538944 (Mar 6 01:10)
      • First failing run: 2029960378095505408 (Mar 6 16:40)
      • The dualstack-only nature of this failure strongly suggests the issue is in how A+AAAA records are served for dual-stack services on baremetal, not single-stack DNS.

              btofelrh Brett Tofel
              rhn-engineering-dgoodwin Devan Goodwin
              Hongan Li Hongan Li
              None
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Created:
                Updated: