Bug
Resolution: Duplicate
4.15
Description of problem:
When an NNCP is created in a disconnected (air-gapped) cluster where root-servers.net is not resolvable, it takes about two extra minutes before the NNCP is successfully applied, because the DNS probe checks whether root-servers.net resolves. This has happened to me twice in the last month while running OCP PoCs in air-gapped environments. I'd like this either to be documented explicitly (perhaps in the requirements for air-gapped OCP installations) or to have a way to disable the DNS probe. Note that making root-servers.net resolvable in an air-gapped installation can be difficult, as it may require changes to the core network infrastructure of highly restricted environments.
Version-Release number of selected component (if applicable):
How reproducible:
Always when local DNS servers don't resolve root-servers.net
Steps to Reproduce:
1) NNCP is being created, logs from nmstate-handler on one of the nodes:

{"level":"info","ts":"2024-07-17T10:32:14.383Z","logger":"enactmentstatus","msg":"status: {DesiredState:interfaces:\n- bridge:\n port:\n - name: enp8s0\n name: ovn-backup\n state: up\n type: ovs-bridge\novn:\n bridge-mappings:\n - bridge: ovn-backup\n localnet: backup\n state: present\n DesiredStateMetaInfo:{Version: TimeStamp:0001-01-01 00:00:00 +0000 UTC} CapturedStates:map[] PolicyGeneration:3 Conditions:[]}","enactment":"rhocp-compute-0.ovn-backup"}
{"level":"info","ts":"2024-07-17T10:32:14.414Z","logger":"enactmentconditions","msg":"NotifyProgressing","enactment":"rhocp-compute-0.ovn-backup"}
{"level":"info","ts":"2024-07-17T10:32:14.422Z","logger":"enactmentstatus","msg":"status: {DesiredState:interfaces:\n- bridge:\n port:\n - name: enp8s0\n name: ovn-backup\n state: up\n type: ovs-bridge\novn:\n bridge-mappings:\n - bridge: ovn-backup\n localnet: backup\n state: present\n DesiredStateMetaInfo:{Version: TimeStamp:0001-01-01 00:00:00 +0000 UTC} CapturedStates:map[] PolicyGeneration:3 Conditions:[{Type:Progressing Status:True Reason:ConfigurationProgressing Message:Applying desired state LastHeartbeatTime:2024-07-17 10:32:14.422332523 +0000 UTC m=+429667.913220316 LastTransitionTime:2024-07-17 10:32:14.422332523 +0000 UTC m=+429667.913220316} {Type:Failing Status:Unknown Reason:ConfigurationProgressing Message: LastHeartbeatTime:2024-07-17 10:32:14.422333071 +0000 UTC m=+429667.913220822 LastTransitionTime:2024-07-17 10:32:14.422333071 +0000 UTC m=+429667.913220822} {Type:Available Status:Unknown Reason:ConfigurationProgressing Message: LastHeartbeatTime:2024-07-17 10:32:14.422334054 +0000 UTC m=+429667.913221805 LastTransitionTime:2024-07-17 10:32:14.422334054 +0000 UTC m=+429667.913221805} {Type:Pending Status:False Reason:ConfigurationProgressing Message: LastHeartbeatTime:2024-07-17 10:32:14.422334503 +0000 UTC m=+429667.913222254 LastTransitionTime:2024-07-17 10:32:14.422334503 +0000 UTC m=+429667.913222254} {Type:Aborted Status:False Reason:ConfigurationProgressing Message: LastHeartbeatTime:2024-07-17 10:32:14.42233463 +0000 UTC m=+429667.913222384 LastTransitionTime:2024-07-17 10:32:14.42233463 +0000 UTC m=+429667.913222384}]}","enactment":"rhocp-compute-0.ovn-backup"}
{"level":"error","ts":"2024-07-17T10:32:14.972Z","logger":"probe","msg":"failed checking DNS connectivity","error":"[lookup root-servers.net on 10.195.14.253:53: server misbehaving]","stacktrace":"github.com/nmstate/kubernetes-nmstate/pkg/probe.runDNS\n\t/go/src/github.com/openshift/kubernetes-nmstate/pkg/probe/probes.go:255\ngithub.com/nmstate/kubernetes-nmstate/pkg/probe.dnsCondition.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/pkg/probe/probes.go:219\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:49\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:50\nk8s.io/apimachinery/pkg/util/wait.PollUntilContextTimeout\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:48\ngithub.com/nmstate/kubernetes-nmstate/pkg/probe.Select\n\t/go/src/github.com/openshift/kubernetes-nmstate/pkg/probe/probes.go:277\ngithub.com/nmstate/kubernetes-nmstate/pkg/client.ApplyDesiredState\n\t/go/src/github.com/openshift/kubernetes-nmstate/pkg/client/client.go:159\ngithub.com/nmstate/kubernetes-nmstate/controllers/handler.(*NodeNetworkConfigurationPolicyReconciler).Reconcile\n\t/go/src/github.com/openshift/kubernetes-nmstate/controllers/handler/nodenetworkconfigurationpolicy_controller.go:226\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235"}

(...)

After two minutes:

{"level":"error","ts":"2024-07-17T10:34:14.534Z","logger":"probe","msg":"failed checking DNS connectivity","error":"[lookup root-servers.net on 10.195.14.253:53: server misbehaving]","stacktrace":"github.com/nmstate/kubernetes-nmstate/pkg/probe.runDNS\n\t/go/src/github.com/openshift/kubernetes-nmstate/pkg/probe/probes.go:255\ngithub.com/nmstate/kubernetes-nmstate/pkg/probe.dnsCondition.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/pkg/probe/probes.go:219\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:73\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:74\nk8s.io/apimachinery/pkg/util/wait.PollUntilContextTimeout\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:48\ngithub.com/nmstate/kubernetes-nmstate/pkg/probe.Select\n\t/go/src/github.com/openshift/kubernetes-nmstate/pkg/probe/probes.go:277\ngithub.com/nmstate/kubernetes-nmstate/pkg/client.ApplyDesiredState\n\t/go/src/github.com/openshift/kubernetes-nmstate/pkg/client/client.go:159\ngithub.com/nmstate/kubernetes-nmstate/controllers/handler.(*NodeNetworkConfigurationPolicyReconciler).Reconcile\n\t/go/src/github.com/openshift/kubernetes-nmstate/controllers/handler/nodenetworkconfigurationpolicy_controller.go:226\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235"}
{"level":"info","ts":"2024-07-17T10:34:14.705Z","logger":"probe","msg":"WARNING not selecting dns probe"}
{"level":"info","ts":"2024-07-17T10:34:16.538Z","logger":"probe","msg":"Running 'ping' probe"}
{"level":"info","ts":"2024-07-17T10:34:16.952Z","logger":"probe","msg":"Running 'api-server' probe"}
{"level":"info","ts":"2024-07-17T10:34:16.997Z","logger":"probe","msg":"Running 'node-readiness' probe"}
{"level":"info","ts":"2024-07-17T10:34:17.022Z","logger":"controllers.NodeNetworkConfigurationPolicy","msg":"nmstate","nodenetworkconfigurationpolicy":{"name":"ovn-backup"},"output":"setOutput: route-rules: {}\nroutes: {}\ninterfaces:\n- name: ovn-backup\n type: ovs-bridge\n state: up\n bridge:\n port:\n - name: enp8s0\novn:\n bridge-mappings:\n - localnet: backup\n state: present\n bridge: ovn-backup\n\n \n"}
{"level":"info","ts":"2024-07-17T10:34:17.023Z","logger":"enactmentconditions","msg":"NotifySuccess","enactment":"rhocp-compute-0.ovn-backup"}
{"level":"info","ts":"2024-07-17T10:34:17.031Z","logger":"enactmentstatus","msg":"status: {DesiredState:interfaces:\n- bridge:\n port:\n - name: enp8s0\n name: ovn-backup\n state: up\n type: ovs-bridge\novn:\n bridge-mappings:\n - bridge: ovn-backup\n localnet: backup\n state: present\n DesiredStateMetaInfo:{Version: TimeStamp:0001-01-01 00:00:00 +0000 UTC} CapturedStates:map[] PolicyGeneration:3 Conditions:[{Type:Progressing Status:False Reason:SuccessfullyConfigured Message: LastHeartbeatTime:2024-07-17 10:34:17.031838985 +0000 UTC m=+429790.522726737 LastTransitionTime:2024-07-17 10:34:17.031838985 +0000 UTC m=+429790.522726737} {Type:Failing Status:False Reason:SuccessfullyConfigured Message: LastHeartbeatTime:2024-07-17 10:34:17.031838839 +0000 UTC m=+429790.522726590 LastTransitionTime:2024-07-17 10:34:17.031838839 +0000 UTC m=+429790.522726590} {Type:Available Status:True Reason:SuccessfullyConfigured Message:successfully reconciled LastHeartbeatTime:2024-07-17 10:34:17.031838506 +0000 UTC m=+429790.522726281 LastTransitionTime:2024-07-17 10:34:17.031838506 +0000 UTC m=+429790.522726281} {Type:Pending Status:False Reason:SuccessfullyConfigured Message: LastHeartbeatTime:2024-07-17 10:34:17.031839155 +0000 UTC m=+429790.522726908 LastTransitionTime:2024-07-17 10:32:14 +0000 UTC} {Type:Aborted Status:False Reason:SuccessfullyConfigured Message: LastHeartbeatTime:2024-07-17 10:34:17.031839373 +0000 UTC m=+429790.522727124 LastTransitionTime:2024-07-17 10:32:14 +0000 UTC}]}","enactment":"rhocp-compute-0.ovn-backup"}
{"level":"info","ts":"2024-07-17T10:34:17.048Z","logger":"controllers.NodeNetworkConfigurationPolicy.forceNNSRefresh","msg":"forcing NodeNetworkState refresh after NNCP applied","node":"rhocp-compute-0"}

DIG ran from the node:

sh-4.4# dig root-servers.net. @10.195.14.253
; <<>> DiG 9.11.36-RedHat-9.11.36-3.el8_6.7 <<>> root-servers.net. @10.195.14.253
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 61858
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;root-servers.net. IN A
;; Query time: 0 msec
;; SERVER: 10.195.14.253#53(10.195.14.253)
;; WHEN: Wed Jul 17 11:23:01 UTC 2024
;; MSG SIZE rcvd: 45

dnsmasq configuration (in the lab where I confirmed this behaviour):

port=53
domain-needed
bogus-priv
no-resolv
no-poll
log-queries
address=/apps.ocp4.openshift.one/10.195.14.101
address=/api.ocp4.openshift.one/10.195.14.100
address=/api-int.ocp4.openshift.one/10.195.14.100
address=/quay.ocp4.openshift.one/10.195.14.253
address=/bastion.ocp4.openshift.one/10.195.14.253
interface=eth1
no-dhcp-interface=eth1
bind-interfaces
no-hosts
Actual results:
Expected results:
Additional info:
- is documented by: OCPBUGS-37415 [Doc] - NMState requires root-servers.net to be resolvable even in disconnected (air gapped) environments (Closed)
- is related to: OCPBUGS-37415 [Doc] - NMState requires root-servers.net to be resolvable even in disconnected (air gapped) environments (Closed)