OCPBUGS-249

Incorrect NAT when using cluster networking in control-plane nodes to install a VRRP Cluster

    • Release Note Not Required

      +++ This bug was initially created as a clone of Bug #2070318 +++

      Description of problem:
      In an OCP VRRP deployment (using OCP cluster networking), each control node has an additional data interface configured alongside the regular management interface. In some deployments, the kubernetes address 172.30.0.1:443 is NAT'ed to the data interface instead of the management interface (10.40.1.4:6443 instead of 10.30.1.4:6443, which is what we configure on the bootstrap node), even though the default route is set to the 10.30.1.0 network. Because of that, all requests to 172.30.0.1:443 fail. After 10-15 minutes, OCP corrects this by itself and NATs correctly to 10.30.1.4:6443.

      Version-Release number of selected component (if applicable):

      How reproducible:

      Steps to Reproduce:

      1. Provision an OCP cluster using cluster networking for DNS & Load Balancer instead of an external DNS & Load Balancer. Provision each host with 1 management interface and an additional interface for the data network. Along with the OCP manifests, add a manifest to create a pod which will trigger communication with kube-apiserver.

      2. Start cluster installation.

      3. Check the custom pod log in the cluster while the first 2 master nodes are installing to see the GET operation to kube-apiserver time out. Check the nft table and follow the IP chains to see that the kubernetes service IP address was NAT'ed to the data IP address instead of the management IP. This does not happen every time; we have seen it in roughly 50% of deployments.

      Actual results:
      After 10-15 minutes OCP will correct that by itself.

      Expected results:
      Wrong natting should not happen.

      Additional info:
      ClusterID: 24bbde0b-79b3-4ae6-afc5-cb694fa48895
      ClusterVersion: Stable at "4.8.29"
      ClusterOperators:
      clusteroperator/authentication is not available (OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.ocp-binhle-wqepch.contrail.juniper.net/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)) because OAuthServerRouteEndpointAccessibleControllerDegraded: Get "https://oauth-openshift.apps.ocp-binhle-wqepch.contrail.juniper.net/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
      clusteroperator/baremetal is degraded because metal3 deployment inaccessible
      clusteroperator/console is not available (RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.ocp-binhle-wqepch.contrail.juniper.net/health): Get "https://console-openshift-console.apps.ocp-binhle-wqepch.contrail.juniper.net/health": context deadline exceeded (Client.Timeout exceeded while awaiting headers)) because RouteHealthDegraded: failed to GET route (https://console-openshift-console.apps.ocp-binhle-wqepch.contrail.juniper.net/health): Get "https://console-openshift-console.apps.ocp-binhle-wqepch.contrail.juniper.net/health": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
      clusteroperator/dns is progressing: DNS "default" reports Progressing=True: "Have 4 available DNS pods, want 5."
      clusteroperator/ingress is degraded because The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)
      clusteroperator/insights is degraded because Unable to report: unable to build request to connect to Insights server: Post "https://cloud.redhat.com/api/ingress/v1/upload": dial tcp: lookup cloud.redhat.com on 172.30.0.10:53: read udp 10.128.0.26:53697->172.30.0.10:53: i/o timeout
      clusteroperator/network is progressing: DaemonSet "openshift-network-diagnostics/network-check-target" is not available (awaiting 1 nodes)

      — Additional comment from
      bnemec@redhat.com
      on 2022-03-30 20:00:25 UTC —

      This is not managed by runtimecfg, but in order to route the bug correctly I need to know which CNI plugin you're using - OpenShiftSDN or OVNKubernetes. Thanks.

      — Additional comment from
      lpbinh@gmail.com
      on 2022-03-31 08:09:11 UTC —

      Hi Ben,

      We were deploying Contrail CNI with OCP. However, this issue happens at very early deployment time, right after the bootstrap node is started
      and there's no SDN/CNI there yet.

      — Additional comment from
      bnemec@redhat.com
      on 2022-03-31 15:26:23 UTC —

      Okay, I'm just going to send this to the SDN team then. They'll be able to provide more useful input than I can.

      — Additional comment from
      trozet@redhat.com
      on 2022-04-04 15:22:21 UTC —

      Can you please provide the iptables rules causing the DNAT as well as the routes on the host? Might be easiest to get a sosreport during initial bring up during that 10-15 min when the problem occurs.

      — Additional comment from
      lpbinh@gmail.com
      on 2022-04-05 16:45:13 UTC —

      All nodes have two interfaces:

      eth0: 10.30.1.0/24
      eth1: 10.40.1.0/24

      machineNetwork is 10.30.1.0/24
      default route points to 10.30.1.1

      The kubeapi service ip is 172.30.0.1:443

      all Kubernetes services are supposed to be reachable via machineNetwork (10.30.1.0/24)

      To make the kubeapi service ip reachable in hostnetwork, something (openshift installer?) creates a set of nat rules which translates the service ip to the real ip of the nodes which have kubeapi active.

      Initially kubeapi is only active on the bootstrap node so there should be a nat rule like

      172.30.0.1:443 -> 10.30.1.10:6443 (assuming that 10.30.1.10 is the bootstrap node's IP address in the machine network)

      However, what we see is
      172.30.0.1:443 -> 10.40.1.10:6443 (which is the bootstrap node's eth1 IP address)

      The rule is configured on the controller nodes and leads to asymmetric routing: the controller sends a packet FROM machineNetwork (10.30.1.x) to 172.30.0.1, which is then translated and forwarded to 10.40.1.10. The bootstrap node then tries to reply on the 10.40.1.0 network, which fails because the request came from the 10.30.1.0 network.

      So, we want to understand why the openshift installer picks the 10.40.1.x IP address rather than the 10.30.1.x IP for the NAT rule. What is the mechanism for choosing the IP when the system has multiple interfaces with IPs configured?

      Note: after a while (10-20 minutes) the bootstrap process resets itself and then it picks the correct ip address from the machineNetwork and things start to work.
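
The mismatch described in this comment can be illustrated with a small check: does the packet's source address sit on the same subnet as the DNAT target? This is only a sketch under stated assumptions. The subnets are the ones listed in the comment, `10.30.1.5` is a hypothetical controller address, and the helper names `network_of` and `crosses_networks` are invented for illustration; they are not part of OpenShift or kube-proxy.

```python
import ipaddress

# Subnets taken from the comment above; interface names are the reporter's.
NETWORKS = {"eth0": "10.30.1.0/24", "eth1": "10.40.1.0/24"}


def network_of(ip, networks=NETWORKS):
    """Return the interface name whose subnet contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for name, cidr in networks.items():
        if addr in ipaddress.ip_network(cidr):
            return name
    return None


def crosses_networks(source_ip, dnat_target_ip):
    """True when the packet source and the DNAT target are on different subnets."""
    return network_of(source_ip) != network_of(dnat_target_ip)


# 10.30.1.5 is a hypothetical controller address on machineNetwork.
print(crosses_networks("10.30.1.5", "10.30.1.10"))  # the expected rule
print(crosses_networks("10.30.1.5", "10.40.1.10"))  # the observed rule
```

With the expected rule, source and target stay on machineNetwork; with the observed rule, the translated packet lands on the data network, which is the situation the reporter describes as failing.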

      — Additional comment from
      smerrow@redhat.com
      on 2022-04-13 13:55:04 UTC —

      Note from Juniper regarding requested SOS report:

      In reference to
      https://bugzilla.redhat.com/show_bug.cgi?id=2070318
      that @Binh Le has been working on. The mustgather was too big to upload for this Bugzilla. Can you access this link?
      https://junipernetworks-my.sharepoint.com/:u:/g/personal/sleigon_juniper_net/ETOrHMqao1tLm10Gmq9rzikB09H5OUwQWZRAuiOvx1nZpQ

      • Making note private to hide partner link

      — Additional comment from
      smerrow@redhat.com
      on 2022-04-21 12:24:33 UTC —

      Can we please get an update on this BZ?

      Do let us know if there is any other information needed.

      — Additional comment from
      trozet@redhat.com
      on 2022-04-21 14:06:00 UTC —

      Can you please provide another link to the sosreport? Looks like the link is dead.

      — Additional comment from
      smerrow@redhat.com
      on 2022-04-21 19:01:39 UTC —

      See mustgather here:
      https://drive.google.com/file/d/16y9IfLAs7rtO-SMphbYBPgSbR4od5hcQ
      — Additional comment from
      trozet@redhat.com
      on 2022-04-21 20:57:24 UTC —

      Looking at the must-gather I think your iptables rules are most likely coming from the fact that kube-proxy is installed:

      [trozet@fedora must-gather.local.288458111102725709]$ omg get pods -n openshift-kube-proxy
      NAME READY STATUS RESTARTS AGE
      openshift-kube-proxy-kmm2p 2/2 Running 0 19h
      openshift-kube-proxy-m2dz7 2/2 Running 0 16h
      openshift-kube-proxy-s9p9g 2/2 Running 1 19h
      openshift-kube-proxy-skrcv 2/2 Running 0 19h
      openshift-kube-proxy-z4kjj 2/2 Running 0 19h

      I'm not sure why this is installed. Is it intentional? I don't see the configuration in CNO to enable kube-proxy. Anyway the node IP detection is done via:
      https://github.com/kubernetes/kubernetes/blob/f173d01c011c3574dea73a6fa3e20b0ab94531bb/cmd/kube-proxy/app/server.go#L844
      Which just looks at the IP of the node. During bare metal install a VIP is chosen and used with keepalived for kubelet to have kapi access. I don't think there is any NAT rule for services until CNO comes up. So I suspect what really is happening is your node IP is changing during install, and kube-proxy is getting deployed (either intentionally or unintentionally) and that is causing the behavior you see. The node IP is chosen via the node ip configuration service:
      https://github.com/openshift/machine-config-operator/blob/da6494c26c643826f44fbc005f26e0dfd10513ae/templates/common/_base/units/nodeip-configuration.service.yaml
      This service will determine the node IP based on which interfaces have a default route and which one has the lowest metric. With your 2 interfaces, do they both have default routes? If so, are they using DHCP, and perhaps it's random which route gets installed with a lower metric?
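
The selection rule described in this comment (prefer the interface whose default route has the lowest metric) can be sketched roughly as follows. This is a simplified illustration of the description only, not the actual code behind nodeip-configuration.service; the `DefaultRoute` type, the `pick_node_ip` function, and the metric values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DefaultRoute:
    interface: str
    metric: int


def pick_node_ip(default_routes: List[DefaultRoute],
                 interface_ips: Dict[str, str]) -> Optional[str]:
    """Return the IP of the interface whose default route has the lowest metric."""
    candidates = [r for r in default_routes if r.interface in interface_ips]
    if not candidates:
        return None
    # min() keeps the first entry on a tie, so with equal metrics the result
    # depends on the order in which the routes are listed.
    best = min(candidates, key=lambda r: r.metric)
    return interface_ips[best.interface]


# Hypothetical metrics; the addresses follow the subnets used in this thread.
ips = {"eth0": "10.30.1.7", "eth1": "10.40.1.7"}
routes = [DefaultRoute("eth1", 200), DefaultRoute("eth0", 100)]
print(pick_node_ip(routes, ips))
```

Under this model, the data-network address would only win if its default route ended up with the lower metric, or if the metrics tied and its route happened to be listed first.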

      — Additional comment from
      trozet@redhat.com
      on 2022-04-21 21:13:15 UTC —

      Correction: looks like standalone kube-proxy is installed by default when the provider is not SDN, OVN, or kuryr so this looks like the correct default behavior for kube-proxy to be deployed.

      — Additional comment from
      lpbinh@gmail.com
      on 2022-04-25 04:05:14 UTC —

      Hi Tim,

      You are right, kube-proxy is deployed by default and we don't change that behavior.

      There is only 1 default route, configured for the management interface (10.30.1.x). We used to have a default route for the data/vrrp interface (10.40.1.x) with a higher metric. As said, we no longer have the default route for the second interface but still encounter the issue pretty often.

      — Additional comment from
      trozet@redhat.com
      on 2022-04-25 14:24:05 UTC —

      Binh, can you please provide a sosreport for one of the nodes that shows this behavior? Then we can try to figure out what is going on with the interfaces and the node ip service. Thanks.

      — Additional comment from
      trozet@redhat.com
      on 2022-04-25 16:12:04 UTC —

      Actually Ben reminded me that the invalid endpoint is actually the bootstrap node itself:
      172.30.0.1:443 -> 10.30.1.10:6443 (assuming that 10.30.1.10 is the bootstrap nodes' ip address in the machine network)

      vs
      172.30.0.1:443 -> 10.40.1.10:6443 (which is the bootstrap nodes' eth1 ip address)

      So maybe a sosreport off that node is necessary? I'm not as familiar with the bare metal install process, moving back to Ben.

      — Additional comment from
      lpbinh@gmail.com
      on 2022-04-26 08:33:45 UTC —

      Created attachment 1875023 [details]sosreport

      — Additional comment from
      lpbinh@gmail.com
      on 2022-04-26 08:34:59 UTC —

      Created attachment 1875024 [details]sosreport-part2

      Hi Tim,

      We observe this issue when deploying clusters using OpenStack instances as our infrastructure is based on OpenStack.

      I followed the steps here to collect the sosreport:
      https://docs.openshift.com/container-platform/4.8/support/gathering-cluster-data.html
      The resulting sosreport is 22MB, which exceeds the permitted size (19MB), so I split it into 2 files (xaa and xab). If you can't join them, we will need to put the collected sosreport on a shared drive like we did with the must-gather data.

      Here are some notes about the cluster:

      First two control nodes are below, ocp-binhle-8dvald-ctrl-3 is the bootstrap node.

      [core@ocp-binhle-8dvald-ctrl-2 ~]$ oc get node
      NAME STATUS ROLES AGE VERSION
      ocp-binhle-8dvald-ctrl-1 Ready master 14m v1.21.8+ed4d8fd
      ocp-binhle-8dvald-ctrl-2 Ready master 22m v1.21.8+ed4d8fd

      We see the behavior that wrong nat'ing was done at the beginning, then corrected later:

      sh-4.4# nft list table ip nat | grep 172.30.0.1
      meta l4proto tcp ip daddr 172.30.0.1 tcp dport 443 counter packets 3 bytes 180 jump KUBE-SVC-NPX46M4PTMTKRN6Y
      sh-4.4# nft list chain ip nat KUBE-SVC-NPX46M4PTMTKRN6Y
      table ip nat {
        chain KUBE-SVC-NPX46M4PTMTKRN6Y {
          counter packets 3 bytes 180 jump KUBE-SEP-VZ2X7DROOLWBXBJ4
        }
      }
      sh-4.4# nft list chain ip nat KUBE-SEP-VZ2X7DROOLWBXBJ4
      table ip nat {
        chain KUBE-SEP-VZ2X7DROOLWBXBJ4 {
          ip saddr 10.40.1.7 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
          meta l4proto tcp counter packets 3 bytes 180 dnat to 10.40.1.7:6443
        }
      }
      sh-4.4#
      sh-4.4#
      <....after a while....>
      sh-4.4# nft list chain ip nat KUBE-SVC-NPX46M4PTMTKRN6Y
      table ip nat {
        chain KUBE-SVC-NPX46M4PTMTKRN6Y {
          counter packets 0 bytes 0 jump KUBE-SEP-X33IBTDFOZRR6ONM
        }
      }
      sh-4.4# nft list table ip nat | grep 172.30.0.1
      meta l4proto tcp ip daddr 172.30.0.1 tcp dport 443 counter packets 0 bytes 0 jump KUBE-SVC-NPX46M4PTMTKRN6Y
      sh-4.4# nft list chain ip nat KUBE-SVC-NPX46M4PTMTKRN6Y
      table ip nat {
        chain KUBE-SVC-NPX46M4PTMTKRN6Y {
          counter packets 0 bytes 0 jump KUBE-SEP-X33IBTDFOZRR6ONM
        }
      }
      sh-4.4# nft list chain ip nat KUBE-SEP-X33IBTDFOZRR6ONM
      table ip nat {
        chain KUBE-SEP-X33IBTDFOZRR6ONM {
          ip saddr 10.30.1.7 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
          meta l4proto tcp counter packets 0 bytes 0 dnat to 10.30.1.7:6443
        }
      }
      sh-4.4#
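
When repeating this check across many deployments, the DNAT target can be pulled out of the `nft list chain` output with a short script instead of reading the chains by hand. The following is a minimal sketch that assumes the rule text contains `dnat to <ip>:<port>` as in the output above; `dnat_targets` is a hypothetical helper name, and the sample string is copied from this comment.

```python
import re

# Matches the "dnat to 10.40.1.7:6443" part of an nft rule (IPv4 only).
DNAT_RE = re.compile(r"dnat to (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def dnat_targets(nft_output):
    """Return every (ip, port) DNAT target found in nft list output."""
    return [(ip, int(port)) for ip, port in DNAT_RE.findall(nft_output)]


SAMPLE = """
table ip nat {
  chain KUBE-SEP-VZ2X7DROOLWBXBJ4 {
    ip saddr 10.40.1.7 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
    meta l4proto tcp counter packets 3 bytes 180 dnat to 10.40.1.7:6443
  }
}
"""

print(dnat_targets(SAMPLE))
```

The same function works on the raw single-line form of the output, since it only looks for the `dnat to` clause.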

      — Additional comment from
      lpbinh@gmail.com
      on 2022-05-12 17:46:51 UTC —

      @trozet@redhat.com May we have an update on the fix, or the plan for the fix? Thank you.

      — Additional comment from
      lpbinh@gmail.com
      on 2022-05-18 21:27:45 UTC —

      Created support Case 03223143.

      — Additional comment from
      vkochuku@redhat.com
      on 2022-05-31 16:09:47 UTC —

      Hello Team,

      Any update on this?

      Thanks,
      Vinu K

      — Additional comment from
      smerrow@redhat.com
      on 2022-05-31 17:28:54 UTC —

      This issue is causing delays in Juniper's CI/CD pipeline and makes for a less than ideal user experience for deployments.

      I'm getting a lot of pressure from the partner on this for an update and progress. I've had them open a case [1] to help progress.

      Please let us know if there is any other data needed by Juniper or if there is anything I can do to help move this forward.

      [1]
      https://access.redhat.com/support/cases/#/case/03223143
      — Additional comment from
      vpickard@redhat.com
      on 2022-06-02 22:14:23 UTC —

      @bnemec@redhat.com Tim mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=2070318#c14 that this issue appears to be at BM install time. Is this something you can help with, or do we need help from the BM install team?

      — Additional comment from
      bnemec@redhat.com
      on 2022-06-03 18:15:17 UTC —

      Sorry, I missed that this came back to me.

      (In reply to Binh Le from comment #16)
      > We observe this issue when deploying clusters using OpenStack instances as
      > our infrastructure is based on OpenStack.

      This does not match the configuration in the must-gathers provided so far, which are baremetal. Are we talking about the same environments?

      I'm currently discussing this with some other internal teams because I'm unfamiliar with this type of bootstrap setup. I need to understand what the intended behavior is before we decide on a path forward.

      — Additional comment from
      rurena@redhat.com
      on 2022-06-06 14:36:54 UTC —

      (In reply to Ben Nemec from comment #22)
      > Sorry, I missed that this came back to me.
      >
      > (In reply to Binh Le from comment #16)
      > > We observe this issue when deploying clusters using OpenStack instances as
      > > our infrastructure is based on OpenStack.
      >
      > This does not match the configuration in the must-gathers provided so far,
      > which are baremetal. Are we talking about the same environments?
      >
      > I'm currently discussing this with some other internal teams because I'm
      > unfamiliar with this type of bootstrap setup. I need to understand what the
      > intended behavior is before we decide on a path forward.

      I spoke to the CU; they tell me that all work should be on baremetal. They were probably just testing on OSP and pointing out that they saw the same behavior.

      — Additional comment from
      bnemec@redhat.com
      on 2022-06-06 16:19:37 UTC —

      Okay, I see now that this is an assisted installer deployment. Can we get the cluster ID assigned by AI so we can take a look at the logs on our side? Thanks.

      — Additional comment from
      lpbinh@gmail.com
      on 2022-06-06 16:38:56 UTC —

      Here is the cluster ID, copied from the bug description:
      ClusterID: 24bbde0b-79b3-4ae6-afc5-cb694fa48895

      In regard to your earlier question about OpenStack & baremetal (2022-06-03 18:15:17 UTC):

      We had an issue with platform validation in OpenStack earlier. Host validation was failing with the error message “Platform network settings: Platform OpenStack Compute is allowed only for Single Node OpenShift or user-managed networking.”

      We found that there is no platform type "OpenStack" available in [https://github.com/openshift/assisted-service/blob/master/models/platform_type.go#L29], so we set "baremetal" as the platform type on our computes. That's the reason why you are seeing baremetal as the platform type.

      Thank you

      — Additional comment from
      ercohen@redhat.com
      on 2022-06-08 08:00:18 UTC —

      Hey, first, you are correct: when you set 10.30.1.0/24 as the machine network, the bootstrap process should use the IP on that subnet in the bootstrap node.

      I'm trying to understand how exactly this cluster was installed.
      You are using on-prem deployment of assisted-installer (podman/ACM)?
      You are trying to form a cluster from OpenStack Vms?
      You set the platform to Baremetal where?
      Did you set user-managed-networking?

      Some more info: when using the OpenStack platform you should install the cluster with user-managed-networking.
      That's what the failing validation is for.

      — Additional comment from
      bnemec@redhat.com
      on 2022-06-08 14:56:53 UTC —

      Moving to the assisted-installer component for further investigation.

      — Additional comment from
      lpbinh@gmail.com
      on 2022-06-09 07:37:54 UTC —

      @Eran Cohen:

      Please see my response inline.

      You are using on-prem deployment of assisted-installer (podman/ACM)?
      --> Yes, we are using on-prem deployment of assisted-installer.

      You are trying to form a cluster from OpenStack Vms?
      --> Yes.

      You set the platform to Baremetal where?
      --> It was set in the Cluster object, Platform field when we model the cluster.

      Did you set user-managed-networking?
      --> Yes, we set it to false for VRRP.

      — Additional comment from
      itsoiref@redhat.com
      on 2022-06-09 08:17:23 UTC —

      @lpbinh@gmail.com can you please share the assisted logs that you can download when the cluster has failed or is installed?
      That will help us to see the full picture.

      — Additional comment from
      ercohen@redhat.com
      on 2022-06-09 08:23:18 UTC —

      OK, as noted before, when using the OpenStack platform you should install the cluster with user-managed-networking (set to true).
      Can you explain how you worked around this failing validation? “Platform network settings: Platform OpenStack Compute is allowed only for Single Node OpenShift or user-managed networking.”
      What does this mean exactly? 'we set "baremetal" as the platform type on our computes'

      To be honest I'm surprised that the installation was completed successfully.

      @oamizur@redhat.com I thought installing on OpenStack VMs with baremetal platform (user-managed-networking=false) will always fail?

      — Additional comment from
      lpbinh@gmail.com
      on 2022-06-10 16:04:56 UTC —

      @itsoiref@redhat.com: I will reproduce and collect the logs. Are they supposed to be included in the provided must-gather?

      @ercohen@redhat.com:

      • user-managed-networking is set to true when we use an external Load Balancer and DNS server. For VRRP we use OpenShift's internal LB and DNS server, hence it's set to false, following the doc.
      • As explained, OpenShift returns the platform type as 'none' for OpenStack (https://github.com/openshift/assisted-service/blob/master/models/platform_type.go#L29), therefore we set the platform type to 'baremetal' in the cluster object for provisioning the cluster using OpenStack VMs.

      — Additional comment from
      itsoiref@redhat.com
      on 2022-06-13 13:08:17 UTC —

      @lpbinh@gmail.com you will have a download_logs link in the UI. Those logs are not part of must-gather.

      — Additional comment from
      lpbinh@gmail.com
      on 2022-06-14 18:52:02 UTC —

      Created attachment 1889993 [details]cluster log per need info request - Cluster ID caa475b0-df04-4c52-8ad9-abfed1509506

      Attached is the cluster log per need info request.
      Cluster ID: caa475b0-df04-4c52-8ad9-abfed1509506
      In this reproduction, the issue was not resolved by OpenShift itself; the wrong NAT remained and the cluster deployment eventually failed.

      sh-4.4# nft list table ip nat | grep 172.30.0.1
      meta l4proto tcp ip daddr 172.30.0.1 tcp dport 443 counter packets 2 bytes 120 jump KUBE-SVC-NPX46M4PTMTKRN6Y
      sh-4.4# nft list chain ip nat KUBE-SVC-NPX46M4PTMTKRN6Y
      table ip nat {
        chain KUBE-SVC-NPX46M4PTMTKRN6Y {
          counter packets 2 bytes 120 jump KUBE-SEP-VZ2X7DROOLWBXBJ4
        }
      }
      sh-4.4# nft list chain ip nat KUBE-SEP-VZ2X7DROOLWBXBJ4; date
      table ip nat {
        chain KUBE-SEP-VZ2X7DROOLWBXBJ4 {
          ip saddr 10.40.1.7 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
          meta l4proto tcp counter packets 2 bytes 120 dnat to 10.40.1.7:6443
        }
      }
      Tue Jun 14 17:40:19 UTC 2022
      sh-4.4# nft list chain ip nat KUBE-SEP-VZ2X7DROOLWBXBJ4; date
      table ip nat {
        chain KUBE-SEP-VZ2X7DROOLWBXBJ4 {
          ip saddr 10.40.1.7 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
          meta l4proto tcp counter packets 2 bytes 120 dnat to 10.40.1.7:6443
        }
      }
      Tue Jun 14 17:59:19 UTC 2022
      sh-4.4# nft list chain ip nat KUBE-SEP-VZ2X7DROOLWBXBJ4; date
      table ip nat {
        chain KUBE-SEP-VZ2X7DROOLWBXBJ4 {
          ip saddr 10.40.1.7 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
          meta l4proto tcp counter packets 9 bytes 540 dnat to 10.40.1.7:6443
        }
      }
      Tue Jun 14 18:17:38 UTC 2022
      sh-4.4#
      sh-4.4#
      sh-4.4# nft list chain ip nat KUBE-SEP-VZ2X7DROOLWBXBJ4; date
      table ip nat {
        chain KUBE-SEP-VZ2X7DROOLWBXBJ4 {
          ip saddr 10.40.1.7 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
          meta l4proto tcp counter packets 7 bytes 420 dnat to 10.40.1.7:6443
        }
      }
      Tue Jun 14 18:49:28 UTC 2022

      — Additional comment from
      lpbinh@gmail.com
      on 2022-06-14 18:56:06 UTC —

      Created attachment 1889994 [details]cluster log per need info request - Cluster ID caa475b0-df04-4c52-8ad9-abfed1509506

      Please find the cluster-log attached per your request. In this deployment the wrong NAT was not automatically resolved by OpenShift hence the deployment failed eventually.

      sh-4.4# nft list table ip nat | grep 172.30.0.1
      meta l4proto tcp ip daddr 172.30.0.1 tcp dport 443 counter packets 2 bytes 120 jump KUBE-SVC-NPX46M4PTMTKRN6Y
      sh-4.4# nft list chain ip nat KUBE-SVC-NPX46M4PTMTKRN6Y
      table ip nat {
        chain KUBE-SVC-NPX46M4PTMTKRN6Y {
          counter packets 2 bytes 120 jump KUBE-SEP-VZ2X7DROOLWBXBJ4
        }
      }
      sh-4.4# nft list chain ip nat KUBE-SEP-VZ2X7DROOLWBXBJ4; date
      table ip nat {
        chain KUBE-SEP-VZ2X7DROOLWBXBJ4 {
          ip saddr 10.40.1.7 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
          meta l4proto tcp counter packets 2 bytes 120 dnat to 10.40.1.7:6443
        }
      }
      Tue Jun 14 17:40:19 UTC 2022
      sh-4.4# nft list chain ip nat KUBE-SEP-VZ2X7DROOLWBXBJ4; date
      table ip nat {
        chain KUBE-SEP-VZ2X7DROOLWBXBJ4 {
          ip saddr 10.40.1.7 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
          meta l4proto tcp counter packets 2 bytes 120 dnat to 10.40.1.7:6443
        }
      }
      Tue Jun 14 17:59:19 UTC 2022
      sh-4.4# nft list chain ip nat KUBE-SEP-VZ2X7DROOLWBXBJ4; date
      table ip nat {
        chain KUBE-SEP-VZ2X7DROOLWBXBJ4 {
          ip saddr 10.40.1.7 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
          meta l4proto tcp counter packets 9 bytes 540 dnat to 10.40.1.7:6443
        }
      }
      Tue Jun 14 18:17:38 UTC 2022
      sh-4.4#
      sh-4.4#
      sh-4.4# nft list chain ip nat KUBE-SEP-VZ2X7DROOLWBXBJ4; date
      table ip nat {
        chain KUBE-SEP-VZ2X7DROOLWBXBJ4 {
          ip saddr 10.40.1.7 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
          meta l4proto tcp counter packets 7 bytes 420 dnat to 10.40.1.7:6443
        }
      }
      Tue Jun 14 18:49:28 UTC 2022
      sh-4.4#

      — Additional comment from
      itsoiref@redhat.com
      on 2022-06-15 15:59:22 UTC —

      @lpbinh@gmail.com just for the record, we don't support baremetal OCP on OpenStack; that's why the validation is failing.

      — Additional comment from
      lpbinh@gmail.com
      on 2022-06-15 17:47:39 UTC —

      @itsoiref@redhat.com as explained, it's just a workaround on our side to make OCP work in our lab, and from my understanding, from the OCP perspective the deployment is on baremetal only and is not related to OpenStack (please correct me if I am wrong).

      We have done thousands of OCP cluster deployments in our automation so far; if that were why validation is failing, it should fail every time. However, the issue only occurs occasionally when nodes have 2 interfaces and use the OCP internal DNS and load balancer, and it sometimes resolves by itself and sometimes does not.

      — Additional comment from
      itsoiref@redhat.com
      on 2022-06-19 17:00:01 UTC —

      For now I can assume that this endpoint is causing the issue:
      {
        "apiVersion": "v1",
        "kind": "Endpoints",
        "metadata": {
          "creationTimestamp": "2022-06-14T17:31:10Z",
          "labels": {
            "endpointslice.kubernetes.io/skip-mirror": "true"
          },
          "name": "kubernetes",
          "namespace": "default",
          "resourceVersion": "265",
          "uid": "d8f558be-bb68-44ac-b7c2-85ca7a0fdab3"
        },
        "subsets": [
          {
            "addresses": [
              { "ip": "10.40.1.7" }
            ],
            "ports": [
              {
                "name": "https",
                "port": 6443,
                "protocol": "TCP"
              }
            ]
          }
        ]
      },
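
A check like the one below could flag this condition automatically: it reads an Endpoints object and lists any advertised address that falls outside the machine network. It is a sketch under stated assumptions. The function name `endpoints_outside_network` is hypothetical, the JSON is a trimmed copy of the object above, and 10.30.1.0/24 is the machineNetwork given earlier in this thread.

```python
import ipaddress
import json


def endpoints_outside_network(endpoints_json, machine_cidr):
    """Return the endpoint IPs that are not inside machine_cidr."""
    network = ipaddress.ip_network(machine_cidr)
    doc = json.loads(endpoints_json)
    outside = []
    for subset in doc.get("subsets", []):
        for address in subset.get("addresses", []):
            if ipaddress.ip_address(address["ip"]) not in network:
                outside.append(address["ip"])
    return outside


# Trimmed copy of the "kubernetes" Endpoints object from this comment.
ENDPOINTS = """
{
  "apiVersion": "v1",
  "kind": "Endpoints",
  "metadata": {"name": "kubernetes", "namespace": "default"},
  "subsets": [
    {
      "addresses": [{"ip": "10.40.1.7"}],
      "ports": [{"name": "https", "port": 6443, "protocol": "TCP"}]
    }
  ]
}
"""

print(endpoints_outside_network(ENDPOINTS, "10.30.1.0/24"))
```

An empty list would mean every advertised API server address is on the machine network; here the data-network address is reported.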

      — Additional comment from
      itsoiref@redhat.com
      on 2022-06-21 17:03:51 UTC —

      The issue is that the kube-api service advertises the wrong IP, but it does so because kubelet chooses one arbitrarily, and we currently have no mechanism to set the kubelet IP, especially in the bootstrap flow.

      — Additional comment from
      lpbinh@gmail.com
      on 2022-06-22 16:07:29 UTC —

      @itsoiref@redhat.com how do you perform OCP deployments in setups that have multiple interfaces if kubelet is left to choose an interface arbitrarily, instead of being configured with a specific IP address to listen on? With what you describe above, the chance of deployment failure in systems with multiple interfaces would be high.

      — Additional comment from
      dhellard@redhat.com
      on 2022-06-24 16:32:26 UTC —

      I set the Customer Escalation flag = Yes, per ACE EN-52253.
      The impact is noted by the RH Account team: "Juniper is pressing and this impacts the Unica Next Project at Telefónica Spain. Unica Next is a critical project for Red Hat. We go live the 1st of July and this issue could impact the go live dates. We need clear information about the status and its possible resolution."

      — Additional comment from
      itsoiref@redhat.com
      on 2022-06-26 07:28:44 UTC —

      I have sent an image with a possible fix to Juniper and am waiting for their feedback; once they confirm it works for them, we will proceed with the PRs.

      — Additional comment from
      pratshar@redhat.com
      on 2022-06-30 13:26:26 UTC —

      === In Red Hat Customer Portal Case 03223143 ===
      — Comment by Prateeksha Sharma on 6/30/2022 6:56 PM —

      //EMT note//

      Update from our consultant Manuel Martinez Briceno -

      ====
      on 28th June, 2022 the last feedback from Juniper Project Manager and our Partner Manager was that they are testing the fix. They didn't give an Estimate Time to finish, but we will be tracking this closely and let us know of any news.
      ====

      Thanks & Regards,
      Prateeksha Sharma
      Escalation Manager | RHCSA
      Global Support Services, Red Hat

            fpercoco@redhat.com Flavio Percoco (Inactive)