OCPBUGS-78085

DNS operator converts dual-stack service to single-stack when updating service spec

    • Important
    • Proposed
    • NI&D Sprint 285

      Description of problem:

      The cluster-dns-operator inadvertently converts the dual-stack `openshift-dns` service to single-stack IPv4 when performing service updates, permanently breaking dual-stack DNS functionality.

      Version-Release number of selected component (if applicable):

      All supported versions of OCP (introduced around 4.7 or 4.8)

      How reproducible:

          100%

      Steps to Reproduce:

      1. Start with a dual-stack cluster whose `openshift-dns` service is configured with both IPv4 and IPv6 ClusterIPs.
      2. Let the operator perform any service update triggered by manifest changes (e.g., PR #457 adding `trafficDistribution: PreferSameNode`).
      3. Observe that the service is updated and loses its IPv6 configuration.
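
      One way to confirm step 3 is to inspect the service's `clusterIPs` before and after the update. The following is a minimal sketch, assuming the addresses have already been read from the service; the helper name `isDualStack` is hypothetical and not part of the operator.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isDualStack is a hypothetical helper: it reports whether the given
// ClusterIPs contain at least one IPv4 and at least one IPv6 address.
// An entry that does not parse as an IP address is returned as an error.
func isDualStack(clusterIPs []string) (bool, error) {
	var hasV4, hasV6 bool
	for _, s := range clusterIPs {
		addr, err := netip.ParseAddr(s)
		if err != nil {
			return false, fmt.Errorf("invalid ClusterIP %q: %w", s, err)
		}
		if addr.Is4() {
			hasV4 = true
		} else if addr.Is6() {
			hasV6 = true
		}
	}
	return hasV4 && hasV6, nil
}

func main() {
	// Values taken from the before/after example in this report.
	before := []string{"172.30.0.10", "fd02::c4a8"}
	after := []string{"172.30.0.10"}
	for _, ips := range [][]string{before, after} {
		ok, err := isDualStack(ips)
		fmt.Println(ips, "dual-stack:", ok, "error:", err)
	}
}
```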
           

      Actual results:

      When the operator updates the `openshift-dns` service, it overwrites the entire spec and preserves only `ClusterIP`. The `ClusterIPs` array is wiped and re-initialized from the single `ClusterIP` value, losing the IPv6 address:
      
        # Before update
        clusterIPs: [172.30.0.10, fd02::c4a8]
        ipFamilies: [IPv4, IPv6]
        ipFamilyPolicy: PreferDualStack
        
        # After update
        clusterIPs: [172.30.0.10]          # IPv6 lost
        ipFamilies: [IPv4]
        ipFamilyPolicy: SingleStack 
      
      The service is converted from dual-stack to single-stack, breaking IPv6 DNS resolution until the service is deleted and recreated.
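
      The failure mode can be sketched as follows. This is an illustration under stated assumptions, not the operator's code: `ServiceSpec` is a trimmed-down, hypothetical stand-in for the Kubernetes Service spec, `buggyUpdate` models the overwrite described above, and `defaultFromClusterIP` is a simplified model of the re-initialization this report describes (it assumes the preserved `ClusterIP` is IPv4, as in the example).

```go
package main

import "fmt"

// ServiceSpec is a hypothetical, trimmed-down stand-in for the Kubernetes
// Service spec; only the fields named in this report are modeled.
type ServiceSpec struct {
	ClusterIP           string
	ClusterIPs          []string
	IPFamilies          []string
	IPFamilyPolicy      string
	TrafficDistribution string
}

// buggyUpdate models the reported behavior: the desired spec replaces the
// current one wholesale and only ClusterIP is carried over.
func buggyUpdate(current, desired ServiceSpec) ServiceSpec {
	updated := desired
	updated.ClusterIP = current.ClusterIP
	return updated
}

// defaultFromClusterIP is a simplified model of the re-initialization
// described in this report, not the real API defaulting logic. It assumes
// the preserved ClusterIP is an IPv4 address.
func defaultFromClusterIP(spec ServiceSpec) ServiceSpec {
	if len(spec.ClusterIPs) == 0 {
		spec.ClusterIPs = []string{spec.ClusterIP}
		spec.IPFamilies = []string{"IPv4"}
		spec.IPFamilyPolicy = "SingleStack"
	}
	return spec
}

func main() {
	current := ServiceSpec{
		ClusterIP:      "172.30.0.10",
		ClusterIPs:     []string{"172.30.0.10", "fd02::c4a8"},
		IPFamilies:     []string{"IPv4", "IPv6"},
		IPFamilyPolicy: "PreferDualStack",
	}
	// The desired spec comes from the static manifest, which does not
	// carry the API-managed dual-stack fields.
	desired := ServiceSpec{TrafficDistribution: "PreferSameNode"}

	got := defaultFromClusterIP(buggyUpdate(current, desired))
	fmt.Printf("%+v\n", got)
}
```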

      Expected results:

      The operator should preserve all API-managed dual-stack fields (`ClusterIP`, `ClusterIPs`, `ipFamilies`, and `ipFamilyPolicy`) during updates.
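
      A minimal sketch of the expected behavior, assuming the same kind of trimmed-down spec type: the update copies the API-managed fields from the current service before applying the desired spec. `ServiceSpec` and `preserveAPIManagedFields` are hypothetical names for illustration; this is not the actual fix in cluster-dns-operator.

```go
package main

import "fmt"

// ServiceSpec is a hypothetical, trimmed-down stand-in for the Kubernetes
// Service spec; only the fields named in this report are modeled.
type ServiceSpec struct {
	ClusterIP           string
	ClusterIPs          []string
	IPFamilies          []string
	IPFamilyPolicy      string
	TrafficDistribution string
}

// preserveAPIManagedFields starts from the desired spec and carries over
// every API-managed dual-stack field from the current spec, so an update
// to other fields cannot downgrade the service to single-stack.
func preserveAPIManagedFields(current, desired ServiceSpec) ServiceSpec {
	updated := desired
	updated.ClusterIP = current.ClusterIP
	// Copy the slices so the result does not alias the current spec.
	updated.ClusterIPs = append([]string(nil), current.ClusterIPs...)
	updated.IPFamilies = append([]string(nil), current.IPFamilies...)
	updated.IPFamilyPolicy = current.IPFamilyPolicy
	return updated
}

func main() {
	current := ServiceSpec{
		ClusterIP:      "172.30.0.10",
		ClusterIPs:     []string{"172.30.0.10", "fd02::c4a8"},
		IPFamilies:     []string{"IPv4", "IPv6"},
		IPFamilyPolicy: "PreferDualStack",
	}
	desired := ServiceSpec{TrafficDistribution: "PreferSameNode"}

	updated := preserveAPIManagedFields(current, desired)
	fmt.Printf("%+v\n", updated)
}
```

      With this approach the new `trafficDistribution` value is applied while both ClusterIPs and the dual-stack policy stay intact.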

      Additional info:

      This bug was exposed during OpenShift 4.22 development when PR #457 (NE-2414) added `trafficDistribution: PreferSameNode` to the static service manifest, triggering service updates on existing dual-stack clusters.   
      
      However, the underlying bug in `serviceChanged()` has existed in all supported OpenShift versions. Any change that triggers a service update (e.g., attempted changes to ports, annotations, or other managed spec fields) would cause the same dual-stack to single-stack conversion. PR #457 happened to be the first change in recent releases to expose this existing issue.
      
      As a result, the CI test was failing:
      [sig-network-edge] DNS should answer A and AAAA queries for a dual-stack service [apigroup:config.openshift.io] [Suite:openshift/conformance/parallel]

              btofelrh Brett Tofel
              gspence@redhat.com Grant Spence
              Melvin Joseph Melvin Joseph