Type: Feature Request
Resolution: Done
Priority: Major
Fix Version/s: openshift-4.15
Affects Version/s: None
Component/s: SDN
Labels: None

Blocked: False
Ready: False
Market:
PX Impact Score:
PX Priority Data:
PX Review Complete:

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Proposed title of this feature request
- Enable network-check-source pod scheduling to nodes of master role or infra role
What is the nature and description of the request?
- Since OCP 4.7, network connection health checks are performed by controllers in the openshift-network-diagnostics namespace.
- However, these checks do not work correctly in OCP clusters whose worker nodes are in separate networks.
- The network-check-source pod can be scheduled on any node. If it lands on a network-isolated worker node, the podnetworkconnectivitycheck resources are not created correctly. The pod should run on a node that can communicate with every other node, but users cannot schedule the network-check-source pod to a specific node.
- If the pod runs on a master node, this issue is resolved.
- Users cannot currently control the scheduling of the network-check-source pod in openshift-network-diagnostics because the Cluster Network Operator manages the resource. [1]
  [1] network-check-source: https://github.com/openshift/cluster-network-operator/blob/master/bindata/network-diagnostics/network-check-source.yaml
Why does the customer need this? (List the business requirements here)
- The customer's OCP cluster, which has worker nodes in separate networks, cannot use openshift-network-diagnostics.
- In the customer's production OCP cluster, users cannot check the network status of all cluster nodes.
List any affected packages or components.
- Cluster Network Operator
- openshift-network-diagnostics namespace
- network-check-source
How reproducible:

#1. Configure two groups of worker nodes in separate networks.
workerA.testocp.lab.com
workerB.testocp.lab.com
The two nodes cannot connect to each other.

#2. The network-check-source pod is running on the workerA node.
$ oc get pods -o wide -n openshift-network-diagnostics | grep worker
network-check-source-644477f5f5-hwfbl 1/1 Running 0 47h 172.31.96.9 workerA.testocp.lab.com <none> <none>
network-check-target-8wjvv 1/1 Running 0 17h 172.31.12.5 workerA.testocp.lab.com <none> <none>
network-check-target-szd9g 1/1 Running 0 17h 172.31.93.3 workerB.testocp.lab.com <none> <none>

$oc get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
network-check-source ClusterIP None <none> 17698/TCP 57d
network-check-target ClusterIP 172.30.140.241 <none> 80/TCP 57d
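To confirm which node hosts the source pod without reading the wide output by eye, a small filter along the following lines may help. This is only a sketch: the function name `source_nodes` is hypothetical, and it assumes its input is one "NAME NODE" pair per line, such as the output of `oc get pods` with `-o custom-columns` and `--no-headers`.

```shell
#!/bin/sh
# source_nodes (hypothetical helper): print the node of every pod whose name
# starts with "network-check-source-".
# Input (stdin): one "POD_NAME NODE_NAME" pair per line.
source_nodes() {
    awk '$1 ~ /^network-check-source-/ { print $2 }'
}

# Assumed usage against a live cluster (not executed here):
#   oc get pods -n openshift-network-diagnostics \
#     -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName --no-headers \
#     | source_nodes
```

If the printed node is one of the network-isolated workers, the checks described in this request can be expected to fail for the nodes it cannot reach.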

#3. Run a curl test from the network-check-source pod to the network-check-target pods.
$ oc rsh network-check-source-644477f5f5-hwfbl (this pod is running on the workerA node)

sh-4.4$ curl 172.31.12.5:8080 // request to the workerA network-check-target succeeds
Hello

sh-4.4$ curl 172.31.93.3:8080 // request to the workerB network-check-target failed to establish a TCP connection to 172.31.93.3:8080: dial tcp 172.31.93.3:8080: connect: no route to host

#4. List the podnetworkconnectivitycheck resources.
$ oc get podnetworkconnectivitycheck -n openshift-network-diagnostics
NAME
network-check-source-workerA-to-*
==> Only the network-check-source-workerA-to-* podnetworkconnectivitycheck resources were created;
network-check-source-workerB-to-* was not created.

#5. The events below keep being created every 6 minutes.
$oc get event
1m18s Normal ConnectivityRestored node/workerA.testocp.lab.com Connectivity restored after 7m0.706557921s: network-check-target-service-cluster: tcp connection to network-check-target:80 succeeded
1m18s Warning ConnectivityOutageDetected node/workerA.testocp.lab.com Connectivity outage detected: network-check-target-service-cluster: failed to establish a TCP connection to network-check-target:80: dial tcp 172.30.140.241:80: connect: no route to host
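To see how often outages are reported per node, the event list can be summarized with a filter like the one below. It is a rough sketch under stated assumptions: the function name `outages_per_node` is hypothetical, and the input is assumed to follow the column order shown above (last seen, type, reason, object, message).

```shell
#!/bin/sh
# outages_per_node (hypothetical helper): count ConnectivityOutageDetected
# events per object.
# Input (stdin): event lines whose 3rd field is the reason and whose 4th field
# is the object (for example node/<name>).
# Output: "<object> <count>" per object; the order of lines is unspecified.
outages_per_node() {
    awk '$3 == "ConnectivityOutageDetected" { count[$4]++ }
         END { for (obj in count) print obj, count[obj] }'
}

# Assumed usage against a live cluster (not executed here):
#   oc get event -n openshift-network-diagnostics | outages_per_node
```

A count that keeps growing for the node hosting network-check-source would be consistent with the repeated outage/restore cycle reported here.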

Assignee: Marc Curry

Reporter: Sophia Hyosun Kim

Votes: 0

Watchers: 3

Created: 2021/11/25 1:31 AM

Updated: 2023/11/07 5:16 PM

Resolved: 2022/02/03 7:37 AM
