OpenShift Bugs / OCPBUGS-60397

Host-network pods can't reach services that point at other services

    • Quality / Stability / Reliability
    • Moderate

      Description of problem:

      Following the upstream Kubernetes documentation, I created a selectorless Service with an EndpointSlice. In the EndpointSlice, the endpoint address (endpoints[].addresses) is set to the ClusterIP of a different Service. This is one of the use cases for a selectorless Service listed in the upstream docs:

      For example:
      You want to point your Service to a Service in a different Namespace or on another cluster.
      

      The problem is that connections from a host-networked pod to such a Service always time out.

      Here is an example setup to demonstrate:

      oc project project1
      oc get svc http-service
      NAME           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
      http-service   ClusterIP   172.30.189.101   <none>        80/TCP    4h9m
      
      oc project project2
      oc get svc chained-service
      NAME              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
      chained-service   ClusterIP   172.30.188.118   <none>        80/TCP    4h15m
      
      oc get endpointslice
      NAME              ADDRESSTYPE   PORTS   ENDPOINTS        AGE
      chained-service   IPv4          80      172.30.189.101   4h15m
      

      Connections to a regular Service (http-service) succeed from both a normal pod and a host-networked pod.

      # Curl from a normal pod to the http-service Service
      oc -c prometheus -n openshift-monitoring exec statefulset/prometheus-k8s -- curl --no-progress-meter 172.30.189.101 --max-time 10
      Success!
      
      # Curl from a host-networked pod to the http-service Service
      oc -c machine-config-daemon -n openshift-machine-config-operator exec daemonset/machine-config-daemon -- curl --no-progress-meter 172.30.189.101 --max-time 10
      Success!
      

      Connections to the selectorless Service (chained-service), however, always time out from host-networked pods.

      # Curl from a normal pod to the chained-service Service
      oc -c prometheus -n openshift-monitoring exec statefulset/prometheus-k8s -- curl --no-progress-meter 172.30.188.118 --max-time 10
      Success!
      
      # Curl from a host-networked pod to the chained-service Service
      oc -c machine-config-daemon -n openshift-machine-config-operator exec daemonset/machine-config-daemon -- curl --no-progress-meter 172.30.188.118 --max-time 10
      curl: (28) Connection timed out after 10001 milliseconds
      command terminated with exit code 28
      

      Version-Release number of selected component (if applicable):
      OpenShift Container Platform 4.19.6

      How reproducible:
      Easily

      Steps to Reproduce:

      1. Create a deployment and a Service that points to that deployment.

      2. Create a selectorless Service with the port set to the port of the other Service. 

      3. Create an EndpointSlice for the selectorless Service whose endpoint address is the ClusterIP of the other Service.

      4. Access the selectorless Service IP from a normal pod and a host-networked pod. 
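
      Steps 2 and 3 can be sketched roughly as follows. This is not the exact manifest used in this report: the helper name render_chained_service and the port name "http" are assumptions made for illustration, while the Service name, namespace, target IP, and port are taken from the example output above. The kubernetes.io/service-name label is what associates the EndpointSlice with the Service.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): prints a selectorless
# Service and a matching EndpointSlice whose only endpoint is the ClusterIP
# of another Service.
# Arguments: <name> <namespace> <target-cluster-ip> <port>
render_chained_service() {
  name=$1
  namespace=$2
  target_ip=$3
  port=$4
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: ${name}
  namespace: ${namespace}
spec:
  # No selector: endpoints come from the manually created EndpointSlice below.
  ports:
    - name: http        # assumed port name; must match the EndpointSlice port name
      protocol: TCP
      port: ${port}
      targetPort: ${port}
---
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: ${name}
  namespace: ${namespace}
  labels:
    # Associates this slice with the Service of the same name.
    kubernetes.io/service-name: ${name}
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: ${port}
endpoints:
  - addresses:
      - "${target_ip}"
EOF
}

# Values taken from the example output in this report.
render_chained_service chained-service project2 172.30.189.101 80
```

      Piping the output to `oc apply -f -` should create both objects, after which the curl commands from the description can be used for step 4.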

      Actual results:

      From a normal pod, the request will succeed. From a host-networked pod, the request will time out. 

      Expected results:

      Requests from both a normal pod and a host-networked pod will succeed. 

      Additional info:

              bbennett@redhat.com Ben Bennett
              rhn-support-cuthayak Clark Uthayakumar
              Anurag Saxena Anurag Saxena