-
Bug
-
Resolution: Can't Do
-
Normal
-
None
-
4.19.z
-
Quality / Stability / Reliability
-
False
-
-
None
-
Moderate
Description of problem:
As per the upstream Kubernetes documentation, I've created a selectorless Service with an EndpointSlice. In the EndpointSlice, the endpoints field (endpoints[].addresses) is set to the cluster IP of a different Service. This is one of the use cases for a selectorless Service from the upstream docs:
For example: You want to point your Service to a Service in a different Namespace or on another cluster.
The problem is that connections from a host-networked pod to such a Service always time out.
Here is an example setup to demonstrate:
oc project project1
oc get svc http-service
NAME           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
http-service   ClusterIP   172.30.189.101   <none>        80/TCP    4h9m

oc project project2
oc get svc chained-service
NAME              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
chained-service   ClusterIP   172.30.188.118   <none>        80/TCP    4h15m

oc get endpointslice
NAME              ADDRESSTYPE   PORTS   ENDPOINTS        AGE
chained-service   IPv4          80      172.30.189.101   4h15m
Connections to a normal Service succeed from both a normal pod and a host-networked pod.
# Curl from a normal pod to the http-service Service
oc -c prometheus -n openshift-monitoring exec statefulset/prometheus-k8s -- curl --no-progress-meter 172.30.189.101 --max-time 10
Success!

# Curl from a host-networked pod to the http-service Service
oc -c machine-config-daemon -n openshift-machine-config-operator exec daemonset/machine-config-daemon -- curl --no-progress-meter 172.30.189.101 --max-time 10
Success!
But connections to the selectorless Service always time out from host-networked pods.
# Curl from a normal pod to the chained-service Service
oc -c prometheus -n openshift-monitoring exec statefulset/prometheus-k8s -- curl --no-progress-meter 172.30.188.118 --max-time 10
Success!

# Curl from a host-networked pod to the chained-service Service
oc -c machine-config-daemon -n openshift-machine-config-operator exec daemonset/machine-config-daemon -- curl --no-progress-meter 172.30.188.118 --max-time 10
curl: (28) Connection timed out after 10001 milliseconds
command terminated with exit code 28
Version-Release number of selected component (if applicable):
OpenShift Container Platform 4.19.6
How reproducible:
Easily
Steps to Reproduce:
1. Create a deployment and a Service that points to that deployment.
2. Create a selectorless Service with the port set to the port of the other Service.
3. Create an EndpointSlice where the endpoint is set to the IP of the other Service.
4. Access the selectorless Service IP from a normal pod and a host-networked pod.
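For reference, steps 2 and 3 could look roughly like the manifests below. This is a minimal sketch rather than the exact objects used in the report: the names, namespace, and endpoint address are taken from the example setup above, while the port name "http", the targetPort, and the protocol are assumptions added for illustration. The kubernetes.io/service-name label is what associates the EndpointSlice with the Service.

```yaml
# Sketch only: reconstructs steps 2 and 3 from the example setup.
# The port name "http", targetPort, and protocol are assumed values.
apiVersion: v1
kind: Service
metadata:
  name: chained-service
  namespace: project2
spec:
  # No selector: endpoints are supplied by the EndpointSlice below.
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
---
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: chained-service
  namespace: project2
  labels:
    # Links this EndpointSlice to the chained-service Service.
    kubernetes.io/service-name: chained-service
addressType: IPv4
ports:
  # Must match the port name on the Service (assumed "http" here).
  - name: http
    protocol: TCP
    port: 80
endpoints:
  - addresses:
      # Cluster IP of http-service in project1, as shown in the example.
      - "172.30.189.101"
```

Saved to a file, these could be applied with oc apply -f, after which step 4 can be run against the cluster IP assigned to chained-service.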
Actual results:
From a normal pod, the request will succeed. From a host-networked pod, the request will time out.
Expected results:
Requests from both a normal pod and a host-networked pod will succeed.
Additional info: