-
Bug
-
Resolution: Done-Errata
-
Major
-
4.13, 4.12, 4.11, 4.14
-
+
-
No
-
2
-
Sprint 238, Sprint 239, Sprint 240, Sprint 241
-
4
-
Rejected
-
False
-
-
-
Bug Fix
-
Done
Description of problem:
As endorsed at DNS Flag Day, the DNS Community recommends a bufsize setting of 1232 as a safe default that supports larger payloads, while generally avoiding IP fragmentation on most networks. This is particularly relevant for payloads like those generated by DNSSEC, which tend to be larger.
Previously, CoreDNS always used the EDNS0 extension, which enables UDP-based DNS queries to exceed 512 bytes, when CoreDNS forwarded DNS queries to an upstream name server, and so OpenShift specified a bufsize setting of 512 to maintain compatibility with applications and name servers that did not support the EDNS0 extension.
For clients and name servers that do support EDNS0, a bufsize setting of 512 can result in more DNS truncation and unnecessary TCP retransmissions, resulting in worse DNS performance for most users. This is due to the fact that if a response is larger than the bufsize setting, it gets truncated, prompting clients to initiate a TCP retry. In this situation, two DNS requests are made for a single DNS answer, leading to higher bandwidth usage and longer response times.
Currently, CoreDNS no longer uses EDNS0 when forwarding requests if the original client request did not use EDNS0 (ref: coredns/coredns@a5b9749), and so the reasoning for using a bufsize setting of 512 no longer applies. By increasing the bufsize setting to the recommended value of 1232 bytes, we can enhance DNS performance by decreasing the probability of DNS truncations.
Using a larger bufsize setting of 1232 bytes also would potentially help alleviate bugs like https://issues.redhat.com/browse/OCPBUGS-6829 in which a non-compliant upstream DNS is not respecting a bufsize of 512 bytes and sending larger-than-512-bytes responses. A bufsize setting of 1232 bytes doesn't fix the root cause of this issue; rather, it decreases the likelihood of its occurrence by increasing the acceptable size range for UDP responses.
Note that clients that don’t support EDNS0 or TCP, such as applications built using older versions of Alpine Linux, are still subject to the aforementioned truncation issue. To avoid these issues, ensure that your application is built using a DNS resolver library that supports EDNS0 or TCP-based DNS queries.
Brief history of OpenShift's Bufsize changes:
- During the development of OpenShift 4.8.0, we updated to 1232 bytes due to Bug - 1949361 and backported to 4.7 and 4.6. However, later on, 4.8.0 (in development), 4.7, and 4.6 were reverted back to 512 bytes due to Bug - 1966116.
- Also in OpenShift 4.8.0, we bumped CoreDNS to v1.8.1, and picked up a commit that forced DNS queries that did not have the DO Bit (DNSSEC) set to set bufsize as 2048 bytes despite 512 bytes being set in the configuration.
- In OpenShift 4.12.0, we fixed OCPBUGS-240 to limit all DNS queries, specifically queries that had DO Bit off, to what is configured in the configuration file (512 bytes) and we backported the fix to 4.11, 4.10, and 4.9.
- Now, this PR is changing bufsize to 1232 bytes.
Version-Release number of selected component (if applicable):
4.14, 4.13, 4.12. 4.11
How reproducible:
100%
Steps to Reproduce:
1. oc -n openshift-dns get configmaps/dns-default -o yaml | grep -i bufsize
Actual results:
Bufsize = 512
Expected results:
Bufsize = 1232
Additional info:
- relates to
-
OCPBUGS-6829 while/after upgrading to OKD 4.11 2023-01-14 CoreDNS has a problem with UDP overflows
- Closed
- links to
-
RHSA-2023:5006 OpenShift Container Platform 4.14.z security update