OpenShift Bugs / OCPBUGS-10437

Windows pods are unable to resolve DNS records for services


    • Icon: Bug Bug
    • Resolution: Done-Errata
    • Icon: Critical Critical
    • 4.14.0
    • 4.12.z
    • Windows Containers
    • None
    • Critical
    • No
    • 3
    • WINC - Sprint 234
    • 1
    • False
    • N/A
    • Release Note Not Required

      Description of problem:

      Services should be reachable by DNS; see [Kubernetes docs|https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#namespaces-of-services]. This is not the case for Windows pods on OpenShift, which can only connect to services through the service's IP.
      
      On Linux pods, service DNS resolution works as documented.
      

      Version-Release number of selected component (if applicable):

      OpenShift 4.13.0-0.nightly-2023-03-14-053612
      WMCO master
      
      

      How reproducible:

      Always
      

      Steps to Reproduce:

      1. Deploy a Windows web server and its associated service:
      ```
      apiVersion: v1
      kind: Service
      metadata:
        name: win-webserver
        labels:
          app: win-webserver
      spec:
        ports:
          # the port that this service should serve on
        - port: 80
          targetPort: 80
        selector:
          app: win-webserver
        type: ClusterIP
      ---
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        labels:
          app: win-webserver
        name: win-webserver
      spec:
        selector:
          matchLabels:
            app: win-webserver
        replicas: 1
        template:
          metadata:
            labels:
              app: win-webserver
            name: win-webserver
          spec:
            tolerations:
            - key: "os"
              value: "Windows"
              effect: "NoSchedule"
            containers:
            - name: windowswebserver
              image: mcr.microsoft.com/windows/servercore:ltsc2022
              imagePullPolicy: IfNotPresent
              command:
              - powershell.exe
              - -command
              - $listener = New-Object System.Net.HttpListener; $listener.Prefixes.Add('http://*:80/'); $listener.Start();Write-Host('Listening at http://*:80/'); while ($listener.IsListening) { $context = $listener.GetContext(); $response = $context.Response; $content='<html><body><H1>Red Hat OpenShift + Windows Container Workloads</H1></body></html>'; $buffer = [System.Text.Encoding]::UTF8.GetBytes($content); $response.ContentLength64 = $buffer.Length; $response.OutputStream.Write($buffer, 0, $buffer.Length); $response.Close(); };
              securityContext:
                runAsNonRoot: false
                windowsOptions:
                  runAsUserName: "ContainerAdministrator"
            nodeSelector:
              kubernetes.io/os: windows
      ```
      
      2. Exec into the web server pod and attempt to GET the service through its cluster DNS name:
      ```
      $ oc exec -it <WEBSERVER_POD> -- powershell
      PS C:\> Invoke-WebRequest win-webserver.<NAMESPACE>.svc.cluster.local -UseBasicParsing
      ```
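The name requested in step 2 follows the `<service>.<namespace>.svc.<cluster-domain>` pattern described in the Kubernetes DNS documentation. As a minimal sketch for repeating the check against other services, the helper below builds that name. The function name `svc_fqdn` is hypothetical, and `cluster.local` is assumed as the default cluster domain because it matches the `clusterDomain` value shown under Additional info.

```shell
#!/bin/sh
# svc_fqdn SERVICE NAMESPACE [CLUSTER_DOMAIN]
# Prints the fully qualified DNS name of a Kubernetes service.
# Hypothetical helper; the cluster domain defaults to cluster.local (an assumption).
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Example: the name that fails to resolve in this report.
svc_fqdn win-webserver openshift-windows-machine-config-operator
```

The printed name can then be passed to `Invoke-WebRequest` inside the pod, for example through `oc exec <WEBSERVER_POD> -- powershell -Command "Invoke-WebRequest <FQDN> -UseBasicParsing"`.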
      
      

      Actual results:

      The service is unreachable through DNS but is reachable through its IP
      ```
      $ oc get svc
      NAME                                              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
      win-webserver                                     ClusterIP   172.30.70.147   <none>        80/TCP      21m
      ```
      
      ```
      PS C:\> Invoke-WebRequest 172.30.70.147 -UseBasicParsing
      StatusCode        : 200
      StatusDescription : OK
      Content           : {60, 104, 116, 109...}
      RawContent        : HTTP/1.1 200 OK
                          Content-Length: 82
                          Date: Thu, 16 Mar 2023 21:48:36 GMT
                          Server: Microsoft-HTTPAPI/2.0
      
                          <html><body><H1>Red Hat OpenShift + Windows Container Workloads</H1></body></html>
      Headers           : {[Content-Length, 82], [Date, Thu, 16 Mar 2023 21:48:36 GMT], [Server, Microsoft-HTTPAPI/2.0]}
      RawContentLength  : 82
      ```
      
      ```
      PS C:\> Invoke-WebRequest win-webserver.openshift-windows-machine-config-operator.svc.cluster.local -UseBasicParsing
      Invoke-WebRequest : The remote name could not be resolved: 'win-webserver.openshift-windows-machine-config-operator.svc.cluster.local'
      ```
      

      Expected results:

      The service is reachable through both IP and DNS
      

      Additional info:

      Pod DNS:
      ```
      PS C:\> Get-DnsClient |Select-Object -Property InterfaceAlias,ConnectionSpecificSuffixSearchList
      
      InterfaceAlias                                                                                                 ConnectionSpecificSuffixSearchList
      --------------                                                                                                 ----------------------------------
      vEthernet (d92635d1a382d5d590bfa7c3ce38a6a7d92b7596a1e111fa507a8acc0d743c43_OVNKubernetesHybridOverlayNetwork) {us-west-2.compute.internal}
      Loopback Pseudo-Interface 6                                                                                    {}
      ```
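For comparison, the Kubernetes DNS documentation describes a pod search list of `<namespace>.svc.cluster.local`, `svc.cluster.local`, and `cluster.local`. The sketch below lists which of those suffixes are absent from a reported suffix list. Both function names are hypothetical, and `cluster.local` is assumed as the cluster domain. It only illustrates the comparison; since the fully qualified name also fails to resolve in this report, a missing suffix list may not be the whole explanation.

```shell
#!/bin/sh
# expected_search_domains NAMESPACE [CLUSTER_DOMAIN]
# Prints the search domains the Kubernetes DNS docs describe for a pod in NAMESPACE.
expected_search_domains() {
  ns=$1
  domain=${2:-cluster.local}
  printf '%s\n' "$ns.svc.$domain" "svc.$domain" "$domain"
}

# missing_search_domains NAMESPACE ACTUAL_LIST
# ACTUAL_LIST is a space-separated list of suffixes reported by the pod.
# Prints each expected suffix that does not appear in ACTUAL_LIST.
missing_search_domains() {
  for d in $(expected_search_domains "$1"); do
    case " $2 " in
      *" $d "*) ;;                 # suffix present, nothing to report
      *) printf '%s\n' "$d" ;;     # suffix absent
    esac
  done
}

# Example: the only suffix reported on the pod's vEthernet interface above.
missing_search_domains openshift-windows-machine-config-operator "us-west-2.compute.internal"
```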
      
      The clusterDNS field is set in each Windows node's kubelet config:
      ```
      PS C:\Users\Administrator> cat /k/kubelet.conf
      {"kind":"KubeletConfiguration","apiVersion":"kubelet.config.k8s.io/v1beta1","syncFrequency":"0s","fileCheckFrequency":"0s","httpCheckFrequency":"0s","rotateCertificates":true,"serverTLSBootstrap":true,"authentication":{"x509":{"clientCAFile":"C:\\k\\kubelet-ca.crt"},"webhook":{"cacheTTL":"0s"},"anonymous":{"enabled":false}},"authorization":{"webhook":{"cacheAuthorizedTTL":"0s","cacheUnauthorizedTTL":"0s"}},"clusterDomain":"cluster.local","clusterDNS":["172.30.0.10"],"streamingConnectionIdleTimeout":"0s","nodeStatusUpdateFrequency":"0s","nodeStatusReportFrequency":"0s","imageMinimumGCAge":"0s","volumeStatsAggPeriod":"0s","cgroupsPerQOS":false,"cpuManagerReconcilePeriod":"0s","runtimeRequestTimeout":"10m0s","maxPods":250,"kubeAPIQPS":50,"kubeAPIBurst":100,"serializeImagePulls":false,"evictionPressureTransitionPeriod":"0s","featureGates":{"LegacyNodeRoleBehavior":false,"NodeDisruptionExclusion":true,"RotateKubeletServerCertificate":true,"SCTPSupport":true,"ServiceNodeExclusion":true,"SupportPodPidsLimit":true},"memorySwap":{},"containerLogMaxSize":"50Mi","systemReserved":{"cpu":"500m","ephemeral-storage":"1Gi","memory":"1Gi"},"logging":{"flushFrequency":0,"verbosity":0,"options":{"json":{"infoBufferSize":"0"}}},"shutdownGracePeriod":"0s","shutdownGracePeriodCriticalPods":"0s","enforceNodeAllocatable":[]}
      ```
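Because the kubelet config is a single long JSON line, the two DNS-related fields are easy to miss. A minimal sketch for pulling them out, assuming the file content has been copied to a machine with a POSIX shell and a `grep` that supports `-o` (the function name `kubelet_dns_settings` is hypothetical, and the sample input is trimmed to the two relevant keys):

```shell
#!/bin/sh
# kubelet_dns_settings
# Reads a kubelet config JSON document on stdin and prints only the
# clusterDomain and clusterDNS entries, one per line.
kubelet_dns_settings() {
  grep -o -e '"clusterDomain":"[^"]*"' -e '"clusterDNS":\[[^]]*]'
}

# Example input trimmed from the config shown above.
printf '%s\n' '{"clusterDomain":"cluster.local","clusterDNS":["172.30.0.10"]}' | kubelet_dns_settings
```

A pattern match is used here instead of a JSON parser only to keep the sketch free of extra tools; a JSON-aware tool would be more robust for configs with different formatting.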
      

              paravindh Aravindh Puthiyaparambil (Inactive)
              rh-ee-ssoto Sebastian Soto
              Aharon Rasouli Aharon Rasouli