Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-15605

CoreDNS UDP bufsize unnecessarily restricted to 512

XMLWordPrintable

    • +
    • No
    • 2
    • Sprint 238, Sprint 239, Sprint 240, Sprint 241
    • 4
    • Rejected
    • False
    • Hide

      None

      Show
      None
    • Hide
      * Previously, the CoreDNS bufsize setting was configured as 512 bytes. Now, the maximum size of the buffer for {product-title} CoreDNS is 1232 bytes. This modification enhances DNS performance by reducing the occurrence of DNS truncations and retries. (link:https://issues.redhat.com/browse/OCPBUGS-15605[*OCPBUGS-15605*])
      Show
      * Previously, the CoreDNS bufsize setting was configured as 512 bytes. Now, the maximum size of the buffer for {product-title} CoreDNS is 1232 bytes. This modification enhances DNS performance by reducing the occurrence of DNS truncations and retries. (link: https://issues.redhat.com/browse/OCPBUGS-15605 [* OCPBUGS-15605 *])
    • Bug Fix
    • Done

      Description of problem:

      As endorsed at DNS Flag Day, the DNS Community recommends a bufsize setting of 1232 as a safe default that supports larger payloads, while generally avoiding IP fragmentation on most networks. This is particularly relevant for payloads like those generated by DNSSEC, which tend to be larger.

      Previously, CoreDNS always used the EDNS0 extension, which enables UDP-based DNS queries to exceed 512 bytes, when CoreDNS forwarded DNS queries to an upstream name server, and so OpenShift specified a bufsize setting of 512 to maintain compatibility with applications and name servers that did not support the EDNS0 extension.

      For clients and name servers that do support EDNS0, a bufsize setting of 512 can result in more DNS truncation and unnecessary TCP retransmissions, resulting in worse DNS performance for most users. This is due to the fact that if a response is larger than the bufsize setting, it gets truncated, prompting clients to initiate a TCP retry. In this situation, two DNS requests are made for a single DNS answer, leading to higher bandwidth usage and longer response times.

      Currently, CoreDNS no longer uses EDNS0 when forwarding requests if the original client request did not use EDNS0 (ref: coredns/coredns@a5b9749), and so the reasoning for using a bufsize setting of 512 no longer applies. By increasing the bufsize setting to the recommended value of 1232 bytes, we can enhance DNS performance by decreasing the probability of DNS truncations.

      Using a larger bufsize setting of 1232 bytes also would potentially help alleviate bugs like https://issues.redhat.com/browse/OCPBUGS-6829 in which a non-compliant upstream DNS is not respecting a bufsize of 512 bytes and sending larger-than-512-bytes responses. A bufsize setting of 1232 bytes doesn't fix the root cause of this issue; rather, it decreases the likelihood of its occurrence by increasing the acceptable size range for UDP responses.

      Note that clients that don’t support EDNS0 or TCP, such as applications built using older versions of Alpine Linux, are still subject to the aforementioned truncation issue. To avoid these issues, ensure that your application is built using a DNS resolver library that supports EDNS0 or TCP-based DNS queries.

      Brief history of OpenShift's Bufsize changes:

      1. During the development of OpenShift 4.8.0, we updated to 1232 bytes due to Bug - 1949361 and backported to 4.7 and 4.6. However, later on, 4.8.0 (in development), 4.7, and 4.6 were reverted back to 512 bytes due to Bug - 1966116.
      2. Also in OpenShift 4.8.0, we bumped CoreDNS to v1.8.1, and picked up a commit that forced DNS queries that did not have the DO Bit (DNSSEC) set to set bufsize as 2048 bytes despite 512 bytes being set in the configuration.
      3. In OpenShift 4.12.0, we fixed OCPBUGS-240 to limit all DNS queries, specifically queries that had DO Bit off, to what is configured in the configuration file (512 bytes) and we backported the fix to 4.11, 4.10, and 4.9.
      4. Now, this PR is changing bufsize to 1232 bytes.

      Version-Release number of selected component (if applicable):

      4.14, 4.13, 4.12. 4.11

      How reproducible:

      100%

      Steps to Reproduce:

      1. oc -n openshift-dns get configmaps/dns-default -o yaml | grep -i bufsize

      Actual results:

      Bufsize = 512

      Expected results:

      Bufsize = 1232

      Additional info:

       

              gspence@redhat.com Grant Spence
              gspence@redhat.com Grant Spence
              Melvin Joseph Melvin Joseph
              Votes:
              3 Vote for this issue
              Watchers:
              14 Start watching this issue

                Created:
                Updated:
                Resolved: