OCPBUGS-24651

[OVN] EgressIPs' SNATs incorrectly created on unassigned node or don't get recreated once egressIP node assignment changes


    • Moderate
    • No
    • SDN Sprint 246, SDN Sprint 247, SDN Sprint 248, SDN Sprint 254, SDN Sprint 255, SDN Sprint 256, SDN Sprint 257, SDN Sprint 258
    • 8
    • False
    • Release Note Not Required
    • In Progress
    • 08/27 not an issue in 4.14. Upstream PR is failing some tests. Two bugs tied to this PR for 4.12, 4.13. Reaching out to Peri and Zshi

      Description of problem:
      A customer who was previously hit by a bug involving stale and duplicated SNATs for egressIPs has upgraded to 4.12.40, which includes many fixes for these issues. On some clusters, however, they still see the same problems occurring at an alarmingly high rate.
      Taking one example, we have this egressIP:

      apiVersion: k8s.ovn.org/v1
      kind: EgressIP
      metadata:
        creationTimestamp: "2023-06-20T14:10:09Z"
        generation: 80
        name: intm-pcpos-mbb.egressip
        resourceVersion: "836331030"
        uid: 8d7d3090-a28d-47e1-aa89-673f4d30c944
      spec:
        egressIPs:
        - 172.18.159.22
        namespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: intm-pcpos-mbb
        podSelector: {}
      status:
        items:
        - egressIP: 172.18.159.22
          node: iepumnosw602.epu.corpintra.net

      When we look at the NBDB, not a single SNAT has been created for the logical port on GR_iepumnosw602.epu.corpintra.net. Every SNAT for this egressIP uses the logical port of iepumnosw601 instead:

      $ sudo ovn-nbctl find nat external_ip=172.18.159.22
      _uuid : 16e03a1e-b303-4a34-ab32-63bd7335a92b
      allowed_ext_ips : []
      exempted_ext_ips : []
      external_ids : {name=intm-pcpos-mbb.egressip}
      external_ip : "172.18.159.22"
      external_mac : []
      external_port_range : ""
      gateway_port : []
      logical_ip : "10.241.5.12"
      logical_port : k8s-iepumnosw601.epu.corpintra.net
      options : {stateless="false"}
      type : snat

      _uuid : e94bc46e-bf6f-4c0f-9cad-40912dd21681
      allowed_ext_ips : []
      exempted_ext_ips : []
      external_ids : {name=intm-pcpos-mbb.egressip}
      external_ip : "172.18.159.22"
      external_mac : []
      external_port_range : ""
      gateway_port : []
      logical_ip : "10.243.0.40"
      logical_port : k8s-iepumnosw601.epu.corpintra.net
      options : {stateless="false"}
      type : snat

      _uuid : 045530e3-92da-49b1-8e03-eca31504515f
      allowed_ext_ips : []
      exempted_ext_ips : []
      external_ids : {name=intm-pcpos-mbb.egressip}
      external_ip : "172.18.159.22"
      external_mac : []
      external_port_range : ""
      gateway_port : []
      logical_ip : "10.241.2.134"
      logical_port : k8s-iepumnosw601.epu.corpintra.net
      options : {stateless="false"}
      type : snat

      _uuid : af886107-3fae-4671-9fe3-abc11b9cae78
      allowed_ext_ips : []
      exempted_ext_ips : []
      external_ids : {name=intm-pcpos-mbb.egressip}
      external_ip : "172.18.159.22"
      external_mac : []
      external_port_range : ""
      gateway_port : []
      logical_ip : "10.241.2.137"
      logical_port : k8s-iepumnosw601.epu.corpintra.net
      options : {stateless="false"}
      type : snat

      _uuid : 377d695a-da42-4274-b0ab-2ae72d5e75ba
      allowed_ext_ips : []
      exempted_ext_ips : []
      external_ids : {name=intm-pcpos-mbb.egressip}
      external_ip : "172.18.159.22"
      external_mac : []
      external_port_range : ""
      gateway_port : []
      logical_ip : "10.243.0.41"
      logical_port : k8s-iepumnosw601.epu.corpintra.net
      options : {stateless="false"}
      type : snat

      _uuid : 4d8ae9c0-e442-4747-82da-f88c03c87a4e
      allowed_ext_ips : []
      exempted_ext_ips : []
      external_ids : {name=intm-pcpos-mbb.egressip}
      external_ip : "172.18.159.22"
      external_mac : []
      external_port_range : ""
      gateway_port : []
      logical_ip : "10.240.4.55"
      logical_port : k8s-iepumnosw601.epu.corpintra.net
      options : {stateless="false"}
      type : snat

      _uuid : 5e919dfe-4d07-4cc9-88a3-070829e10d19
      allowed_ext_ips : []
      exempted_ext_ips : []
      external_ids : {name=intm-pcpos-mbb.egressip}
      external_ip : "172.18.159.22"
      external_mac : []
      external_port_range : ""
      gateway_port : []
      logical_ip : "10.240.4.54"
      logical_port : k8s-iepumnosw601.epu.corpintra.net
      options : {stateless="false"}
      type : snat
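
The mismatch in the records above can be checked mechanically. The following is a minimal sketch, not part of the original report: it parses `ovn-nbctl find nat` style text and lists SNATs whose `logical_port` differs from the port expected for the assigned node. It assumes the expected port is named `k8s-<node>`, as the naming in the records suggests. The helper names `parse_records` and `misplaced_snats` are hypothetical, and the sample holds two abridged records from the dump.

```python
# Two of the NAT records quoted above, abridged to the relevant columns.
SAMPLE = """\
_uuid : 16e03a1e-b303-4a34-ab32-63bd7335a92b
external_ids : {name=intm-pcpos-mbb.egressip}
external_ip : "172.18.159.22"
logical_ip : "10.241.5.12"
logical_port : k8s-iepumnosw601.epu.corpintra.net
type : snat

_uuid : 4d8ae9c0-e442-4747-82da-f88c03c87a4e
external_ids : {name=intm-pcpos-mbb.egressip}
external_ip : "172.18.159.22"
logical_ip : "10.240.4.55"
logical_port : k8s-iepumnosw601.epu.corpintra.net
type : snat
"""


def parse_records(text):
    """Split 'key : value' output into one dict per blank-line-separated record."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the record collected so far.
            if current:
                records.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            # Drop surrounding double quotes from string columns.
            current[key.strip()] = value.strip().strip('"')
    if current:
        records.append(current)
    return records


def misplaced_snats(records, assigned_node):
    """Return SNAT records whose logical_port is not the assumed 'k8s-<node>'."""
    expected_port = "k8s-" + assigned_node  # assumption: port naming convention
    return [
        r for r in records
        if r.get("type") == "snat" and r.get("logical_port") != expected_port
    ]


if __name__ == "__main__":
    # Node taken from the EgressIP status shown earlier in this report.
    for nat in misplaced_snats(parse_records(SAMPLE), "iepumnosw602.epu.corpintra.net"):
        print(nat["_uuid"], nat["logical_ip"], nat["logical_port"])
```

Run against the full dump, a check like this would be expected to flag every record shown, since all of them reference `k8s-iepumnosw601.epu.corpintra.net` while the status reports iepumnosw602.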

      We also see that SNATs were created for only a few pods; other pods in the project did not get any egressIP SNAT at all. In total, these logical port bindings exist for the project:

      addresses : ["0a:58:0a:f3:00:29 10.243.0.41"]
      name : intm-pcpos-mbb_mbb-apropos-service-78586dbfc6-k8sw6
      options : {iface-id-ver="759b1468-c1e9-4255-ace1-3db26a487af3", requested-chassis=iepumnosw601.epu.corpintra.net}

      addresses : ["0a:58:0a:f1:02:87 10.241.2.135"]
      name : intm-pcpos-mbb_mbb-antrag-neu-frontend-56d9b54c57-ccglw
      options : {iface-id-ver="b2e65c4b-eb2e-4f97-88aa-10f0b08c3406", requested-chassis=iepumnosw604.epu.corpintra.net}

      addresses : ["0a:58:0a:f0:04:37 10.240.4.55"]
      name : intm-pcpos-mbb_mbb-antrag-verteiler-service-b79cbb8cb-ftkdg
      options : {iface-id-ver="f8183495-c74f-4b4d-9ed5-837ff4173a2b", requested-chassis=iepumnosw602.epu.corpintra.net}

      addresses : ["0a:58:0a:f3:00:2a 10.243.0.42"]
      name : intm-pcpos-mbb_pcneo-maintenance-page-564cf7b54b-xcrxz
      options : {iface-id-ver="28bc2d7a-2106-4118-b16a-215e043c6bd3", requested-chassis=iepumnosw601.epu.corpintra.net}

      addresses : ["0a:58:0a:f1:04:49 10.241.4.73"]
      name : intm-pcpos-mbb_mbb-antrag-gebraucht-frontend-85d9664d85-rcp79
      options : {iface-id-ver="9e9bb50d-30b8-47a1-9508-232479f163cb", requested-chassis=iepumnosw605.epu.corpintra.net}

      addresses : ["0a:58:0a:f1:02:85 10.241.2.133"]
      name : intm-pcpos-mbb_mbb-antrag-neu-b2b-frontend-947b77f78-xfnvc
      options : {iface-id-ver="6e74e70b-8751-4ee3-b11e-c515d6a791f2", requested-chassis=iepumnosw604.epu.corpintra.net}

      addresses : ["0a:58:0a:f1:03:12 10.241.3.18"]
      name : intm-pcpos-mbb_kong-6dff54c998-5zbc5
      options : {iface-id-ver="5c25bcdf-0aab-4bbe-b5c8-afed608e13df", requested-chassis=iepumnosw604.epu.corpintra.net}

      addresses : ["0a:58:0a:f3:00:dd 10.243.0.221"]
      name : intm-pcpos-mbb_mbb-antrag-gebraucht-frontend-85d9664d85-g9t44
      options : {iface-id-ver="1f95ec7e-9e75-4527-8c10-0d782aaa6bfd", requested-chassis=iepumnosw601.epu.corpintra.net}

      addresses : ["0a:58:0a:f0:04:36 10.240.4.54"]
      name : intm-pcpos-mbb_mbb-antrag-gebraucht-service-559b766bdd-g9rxb
      options : {iface-id-ver="55470a8b-1811-4d44-a79e-d27148fe1568", requested-chassis=iepumnosw602.epu.corpintra.net}

      addresses : ["0a:58:0a:f1:02:86 10.241.2.134"]
      name : intm-pcpos-mbb_mbb-antrag-neu-b2b-service-54575fcd89-wd8vr
      options : {iface-id-ver="60c4676a-494b-4b1d-b370-993160d1c198", requested-chassis=iepumnosw604.epu.corpintra.net}

      addresses : ["0a:58:0a:f3:00:c0 10.243.0.192"]
      name : intm-pcpos-mbb_mbb-antrag-gebraucht-service-559b766bdd-pf9lf
      options : {iface-id-ver="66924473-ae8d-47fe-9cda-617fb49d6759", requested-chassis=iepumnosw601.epu.corpintra.net}

      addresses : ["0a:58:0a:f1:05:0a 10.241.5.10"]
      name : intm-pcpos-mbb_mbb-antrag-neu-b2b-service-54575fcd89-vcklw
      options : {iface-id-ver="1677ea45-f967-480d-a406-b443cb807155", requested-chassis=iepumnosw605.epu.corpintra.net}

      addresses : ["0a:58:0a:f1:05:0c 10.241.5.12"]
      name : intm-pcpos-mbb_vin-service-c7bccfdc6-p9clr
      options : {iface-id-ver="33ceadea-678f-4ace-a0fb-2480d25aed3b", requested-chassis=iepumnosw605.epu.corpintra.net}

      addresses : ["0a:58:0a:f1:05:0b 10.241.5.11"]
      name : intm-pcpos-mbb_mbb-apropos-service-78586dbfc6-pxp6s
      options : {iface-id-ver="a8320aa0-5cb4-4728-a72a-0dc3d2395db4", requested-chassis=iepumnosw605.epu.corpintra.net}

      addresses : ["0a:58:0a:f3:00:28 10.243.0.40"]
      name : intm-pcpos-mbb_mbb-antrag-neu-frontend-56d9b54c57-9gkx8
      options : {iface-id-ver="ca3c25a7-14ed-4c51-b5e2-58087c5551c8", requested-chassis=iepumnosw601.epu.corpintra.net}

      addresses : ["0a:58:0a:f3:03:23 10.243.3.35"]
      name : intm-pcpos-mbb_mbb-antrag-neu-service-649f9fb74-zw76x
      options : {iface-id-ver="ee57e7e2-ef8c-4db2-9202-187dce1233f3", requested-chassis=iepumnosw603.epu.corpintra.net}

      addresses : ["0a:58:0a:f3:00:2c 10.243.0.44"]
      name : intm-pcpos-mbb_vin-service-c7bccfdc6-zkdxc
      options : {iface-id-ver="b244f3e1-018c-4d6e-b576-57285d718376", requested-chassis=iepumnosw601.epu.corpintra.net}

      addresses : ["0a:58:0a:f3:00:4e 10.243.0.78"]
      name : intm-pcpos-mbb_pos-monitor-proxy-7c5cf95786-94c79
      options : {iface-id-ver="56e302ba-1b0f-4efd-aa45-85836e326252", requested-chassis=iepumnosw601.epu.corpintra.net}

      addresses : ["0a:58:0a:f1:02:88 10.241.2.136"]
      name : intm-pcpos-mbb_mbb-antrag-neu-service-649f9fb74-tmc5g
      options : {iface-id-ver="e66cd183-b203-43a9-b504-1a5cc752c3c8", requested-chassis=iepumnosw604.epu.corpintra.net}

      addresses : ["0a:58:0a:f0:04:1b 10.240.4.27"]
      name : intm-pcpos-mbb_pos-monitor-proxy-7c5cf95786-tmh46
      options : {iface-id-ver="77cc2e0a-c3ef-4c2a-9b99-3726ad553e68", requested-chassis=iepumnosw602.epu.corpintra.net}

      addresses : ["0a:58:0a:f1:03:0b 10.241.3.11"]
      name : intm-pcpos-mbb_kong-postgres-58c6bc559c-dj9z6
      options : {iface-id-ver="fcc8edda-9bd4-49d9-929e-9b149b912028", requested-chassis=iepumnosw604.epu.corpintra.net}

      addresses : ["0a:58:0a:f3:00:27 10.243.0.39"]
      name : intm-pcpos-mbb_mbb-antrag-neu-b2b-frontend-947b77f78-j6tz9
      options : {iface-id-ver="d3e08cdb-a757-48f8-9b1b-44d9f51a4275", requested-chassis=iepumnosw601.epu.corpintra.net}

      addresses : ["0a:58:0a:f1:02:89 10.241.2.137"]
      name : intm-pcpos-mbb_mbb-antrag-verteiler-service-b79cbb8cb-n99fh
      options : {iface-id-ver="c34af7ad-81f9-4f26-a73b-df04fcd4cb27", requested-chassis=iepumnosw604.epu.corpintra.net}
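
To see which pods lack an egressIP SNAT, the port listing can be compared with the `logical_ip` values of the NAT records. The sketch below is illustrative only and not from the original report: `pod_ips` and `pods_without_snat` are hypothetical helpers, the SNAT IP set is copied from the NAT dump above, and `PORTS` is a four-entry subset of the listing with the `options` column omitted.

```python
import json
import re

# logical_ip values of the SNAT records shown in the NAT dump.
SNAT_LOGICAL_IPS = {
    "10.241.5.12", "10.243.0.40", "10.241.2.134", "10.241.2.137",
    "10.243.0.41", "10.240.4.55", "10.240.4.54",
}

# Subset of the logical port listing above (options column omitted).
PORTS = """\
addresses : ["0a:58:0a:f3:00:29 10.243.0.41"]
name : intm-pcpos-mbb_mbb-apropos-service-78586dbfc6-k8sw6

addresses : ["0a:58:0a:f1:02:87 10.241.2.135"]
name : intm-pcpos-mbb_mbb-antrag-neu-frontend-56d9b54c57-ccglw

addresses : ["0a:58:0a:f0:04:37 10.240.4.55"]
name : intm-pcpos-mbb_mbb-antrag-verteiler-service-b79cbb8cb-ftkdg

addresses : ["0a:58:0a:f3:00:2a 10.243.0.42"]
name : intm-pcpos-mbb_pcneo-maintenance-page-564cf7b54b-xcrxz
"""


def pod_ips(text):
    """Map each port name to the IPs found in its 'addresses' column."""
    pods = {}
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "name" in fields and "addresses" in fields:
            # Each entry looks like "<mac> <ip>"; keep everything after the MAC.
            entries = json.loads(fields["addresses"])
            pods[fields["name"]] = [ip for e in entries for ip in e.split()[1:]]
    return pods


def pods_without_snat(pods, snat_ips):
    """Return the sorted names of ports with no IP present in snat_ips."""
    return sorted(
        name for name, ips in pods.items()
        if not any(ip in snat_ips for ip in ips)
    )


if __name__ == "__main__":
    for name in pods_without_snat(pod_ips(PORTS), SNAT_LOGICAL_IPS):
        print("no egressIP SNAT:", name)
```

In this subset, the ports with IPs 10.241.2.135 and 10.243.0.42 have no matching SNAT, while the other two do.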

      It looks like ovnkube-master simply gave up and did not make any changes or re-check the state.

      I will share the must-gathers soon.

      Version-Release number of selected component (if applicable):
      4.12.40
           
      How reproducible:
      Often, but only in the customer's environment
           
      Steps to Reproduce:
      Unknown

            pepalani@redhat.com Periyasamy Palanisamy
            rhn-support-andcosta Andre Costa
            Huiran Wang Huiran Wang