[OCPBUGS-31444] Wrong dnsPolicy is used for konnectivity-agent in data plane - Red Hat Issue Tracker

Type: Bug
Resolution: Done-Errata
Priority: Normal
Fix Version/s: 4.16.0
Affects Version/s: 4.14, 4.15
Component/s: HyperShift
Labels:
- triaged

Severity:
Moderate
Regression:
No
Blocked:
False
Blocked Reason:

None

Release Note Text:

* Previously, the `konnectivity-agent` daemonset used the `ClusterFirst` DNS policy. As a result, when CoreDNS was down, the `konnectivity-agent` pods on the data plane could not resolve the proxy-server-address, and they could fail to connect to `konnectivity-server` in the control plane. With this update, the `konnectivity-agent` daemonset was modified to use `dnsPolicy: Default`. The `konnectivity-agent` now uses the host system DNS service to look up the proxy server address, and it no longer depends on CoreDNS. (link:https://issues.redhat.com/browse/OCPBUGS-31444[*OCPBUGS-31444*])

Release Note Type:
Bug Fix
Release Note Status:
Done
Target Version:

4.16.0
Target Backport Versions:

4.14, 4.15

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Description of problem:

The konnectivity-agent on the data plane needs to resolve its proxy-server-url to connect to the control plane's konnectivity server. However, these agents use the default dnsPolicy, which is ClusterFirst.

This creates a dependency on CoreDNS. If CoreDNS is misconfigured or down, the agents won't be able to connect to the server, and all konnectivity-related traffic goes down (which blocks updates, webhooks, logs, etc.).

The correction would be to use dnsPolicy: Default in the konnectivity-agent daemonset on the data plane, so that it uses the name resolution configuration from the node.

This makes sure that the konnectivity-agent's proxy-server-url can be resolved even if CoreDNS is down or misconfigured.

The konnectivity-agent deployment in the control plane must not change: in that case a ClusterIP Service is configured as the proxy-server-url, so it still needs CoreDNS for name resolution.
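
To illustrate the proposed change at the manifest level, here is a minimal Python sketch. It is not the actual fix, which lives in the HyperShift code (see the linked pull request). The helper name `use_node_dns` and the trimmed manifest are hypothetical. The sketch relies only on the fact that a pod without an explicit dnsPolicy gets ClusterFirst.

```python
import copy


def use_node_dns(daemonset: dict) -> dict:
    """Return a copy of a DaemonSet manifest whose pods use dnsPolicy: Default.

    With dnsPolicy: Default the pod inherits the node's name resolution
    configuration instead of going through the cluster DNS service.
    """
    patched = copy.deepcopy(daemonset)
    pod_spec = (
        patched.setdefault("spec", {})
        .setdefault("template", {})
        .setdefault("spec", {})
    )
    pod_spec["dnsPolicy"] = "Default"
    return patched


# Hypothetical, heavily trimmed manifest; the real daemonset has many more fields.
agent = {
    "apiVersion": "apps/v1",
    "kind": "DaemonSet",
    "metadata": {"name": "konnectivity-agent", "namespace": "kube-system"},
    "spec": {
        "template": {
            # No dnsPolicy set here, so Kubernetes would apply ClusterFirst.
            "spec": {"containers": [{"name": "konnectivity-agent"}]}
        }
    },
}

patched = use_node_dns(agent)
print(patched["spec"]["template"]["spec"]["dnsPolicy"])
```

Only the data plane daemonset would be passed through such a helper; the control plane deployment keeps its existing policy.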

Version-Release number of selected component (if applicable):

4.14, 4.15

How reproducible:

Break the CoreDNS configuration.

Steps to Reproduce:

1. Add an invalid forwarder to dns.operator/default so that upstream DNS resolution fails
2. Perform a rollout restart of the konnectivity-agent daemonset in kube-system
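
The two steps above could be sketched as follows. This is an assumption-laden illustration, not the reporter's exact procedure: it interprets "invalid forwarder" as an unreachable entry under `spec.upstreamResolvers` of the DNS operator resource, uses 192.0.2.1 (a documentation-only address) as the dead upstream, and only prints the `oc` commands rather than running them. The helper names are hypothetical; check the field names against the DNS operator CRD on your cluster before use.

```python
import json
import shlex


def invalid_upstream_patch(address: str = "192.0.2.1", port: int = 53) -> dict:
    """Build a merge patch that points CoreDNS at an unreachable upstream.

    Assumption: the forwarder is configured through spec.upstreamResolvers.
    192.0.2.1 belongs to TEST-NET-1, so no DNS server should answer there.
    """
    return {
        "spec": {
            "upstreamResolvers": {
                "upstreams": [
                    {"type": "Network", "address": address, "port": port}
                ]
            }
        }
    }


def reproduce_commands() -> list:
    """Return the shell commands for both reproduction steps, in order."""
    patch = json.dumps(invalid_upstream_patch(), separators=(",", ":"))
    return [
        # Step 1: break upstream resolution in the data plane cluster.
        f"oc patch dns.operator/default --type=merge -p {shlex.quote(patch)}",
        # Step 2: restart the agents so they have to resolve the server again.
        "oc rollout restart daemonset/konnectivity-agent -n kube-system",
    ]


for cmd in reproduce_commands():
    print(cmd)
```

Both commands are meant to be run against the hosted (data plane) cluster, since that is where the daemonset and dns.operator/default live.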

Actual results:

kubectl logs fails

Expected results:

kubectl logs works

Additional info:

blocks

OCPBUGS-31826 Wrong dnsPolicy is used for konnectivity-agent in data plane

Closed

is cloned by

OCPBUGS-31826 Wrong dnsPolicy is used for konnectivity-agent in data plane

Closed

links to

openshift/hypershift#3810: OCPBUGS-31444: use dnsPolicy: Default for konnectivity-agent in data plane

RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update

Assignee: Adam Mihelcsik (Inactive)

Reporter: Adam Mihelcsik (Inactive)

QA Contact: Jie Zhao

Doc Contact: Laura Hinson

Votes: 0

Watchers: 5

Created: 2024/03/27 1:02 PM

Updated: 2024/06/27 11:44 AM

Resolved: 2024/06/27 11:44 AM
