
OpenShift North-South IPsec Implementation Enhancement and GA

    • OpenShift North-South IPsec Implementation
    • BU Product Work
    • False
    • None
    • False
    • Red
    • In Progress
    • OCPSTRAT-937 - OVN IPSec support between an OCP cluster and an external provider [N-S] - GA
    • 0% To Do, 0% In Progress, 100% Done

      Feb 19, 2024

      QE

      • The workaround for bug OCPBUGS-27839 has been tested and is to be included in the RN. Functional QE testing is done at this point. Waiting for input from the scale team, in case they can comment before this epic moves to RELEASE_PENDING.

      Feb 15, 2024

      QE

      • Based on further testing, QE has identified OCPBUGS-27839 as a RHEL/nmstate bug rather than an OCP bug. We need to document a workaround for restoring network connectivity between hosts.

      Feb 13, 2024

      QE

      • The IPsec EW blocker has been verified. QE is just waiting for dev input on OCPBUGS-27839: [IPSEC] Restarting ipsec service/ovn-ipsec-host pod would cause ipsec traffic broken between worker node and external host

       

      Feb 6, 2024

      QE

      • https://issues.redhat.com/browse/OCPBUGS-28676 is now a release blocker. The libreswan package is not part of the OCP payload but is delivered via rpms, so it needs to be installed on RHEL node hosts by openshift-ansible. This is being addressed via https://github.com/openshift/openshift-ansible/pull/12477 and https://github.com/openshift/openshift-ansible/pull/12478.
      • QE needs to test this hack by executing `sudo dnf install libreswan` either in the ansible pre-hook playbooks during the RHEL node update or manually on the RHEL node before the pre-hook.
      • Fixes should be validated by Feb 9 to make it into the RC.

      Feb 1, 2024

      QE

      • All required critical fixes are in prod builds now. QE has verified the critical fixes and testing looks green. The regression issue with RHEL nodes EW has been verified along with pre-merge. A few non-critical bugs under post-merge task SDN-4200 still need dev triage.

      Jan 24, 2024

      QE

      • Over the last couple of weeks, QE has tested nmstate rpms (custom rpms) and the corresponding IPsec configs on GCP. Results look good, with the exception of a few bugs mentioned in SDN-4200: post-merge testing. QE is still waiting for `nmstate-2.2.23-1.el9_2` to land in RHCOS along with rpms (with a few fixes). QE will do additional testing once all required rpms land in production builds.

      Jan 16, 2024

      QE

      • Testing continued using scratch builds for nmstate and NetworkManager-libreswan.
      • Unable to test transport mode due to RHEL-21532: Cannot configure ipsec mode with 'type' in nmstate config
      • Upgrade from 4.14 is affected by OCPBUGS-26952: no ipsec on cluster post NS mc's deletion during ipsecConfig mode `Full`
      • The above bugs can be verified with scratch builds once available.
      • Plan: Once the build is available in OCP, QE will run regression tests with IPsec enabled.

      Jan 09, 2024

      QE

      • TestBlocker bug RHEL-20690: Cannot setup point-to-point ipsec tunnel (in progress)
      • Pre-merge testing of cluster-network-operator/pull/2144 is complete for IPsec runtime modes Full and External. Looks good on a standalone build.
      • Full and External runtime modes do not work if the cluster is upgraded from 4.14 to the bot build (with 2144).
      • Hypershift is out of scope for this feature.

      Jan 05, 2024

      QE

      • Testing blocked due to RHEL-20690: Cannot setup point-to-point ipsec tunnel

      ---

      API PR is ready to merge
      CNO PR is in code review and likely to merge soon
      must-gather work has not started yet
    • ---
    • 0

      Epic Goal

      • Add an API extension for North-South IPsec.
      • Close gaps from SDN-3604, mainly around upgrade.
      • Add telemetry.
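
For illustration only, the kind of request such an API makes possible can be sketched as a merge-patch body that selects an IPsec mode. The QE notes in this epic mention the ipsecConfig modes `Full` and `External`; the `Disabled` value, the field path under `spec.defaultNetwork.ovnKubernetesConfig`, and the helper name `build_ipsec_patch` are assumptions made for this sketch and are not stated anywhere in this issue.

```python
import json

# "Full" and "External" are the modes named in this epic's QE notes.
# "Disabled" is an assumed third value for turning IPsec off.
ALLOWED_MODES = ("Disabled", "External", "Full")


def build_ipsec_patch(mode: str) -> dict:
    """Return a merge-patch body that sets the cluster IPsec mode.

    The nested field path is an assumption made for illustration.
    """
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported IPsec mode: {mode!r}")
    return {
        "spec": {
            "defaultNetwork": {
                "ovnKubernetesConfig": {"ipsecConfig": {"mode": mode}}
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body so it can be reviewed before being applied.
    print(json.dumps(build_ipsec_patch("External")))
```

Validating the mode before building the body keeps a typo from reaching the cluster configuration; the printed JSON is only a request body and applies nothing by itself.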

      Why is this important?

      • Without an API, customers are forced to use MCO, which brings a set of limitations: mainly a reboot per change, and the fact that config is shared across each pool, so per-node configuration is not possible.
      • A better upgrade solution will give us the ability to support a single host-based implementation.
      • Telemetry will give us more info on how widely IPsec is used.

      Acceptance Criteria

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • Must allow for the possibility of offloading the IPsec encryption to a SmartNIC.

       

      • nmstate
      • k8s-nmstate
      • easier mechanism for cert injection (??)
      • telemetry

      Dependencies (internal and external)

      1.  

      Related:

      • ITUP-44 - OpenShift support for North-South OVN IPSec
      • HATSTRAT-33 - Encrypt All Traffic to/from Cluster (aka IPSec as a Service)

      Previous Work (Optional):

      1. SDN-717 - Support IPSEC on ovn-kubernetes
      2. SDN-3604 - Fully supported non-GA N-S IPSec implementation using machine config.

      Open questions:

      1. …

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>

            [CORENET-3894] OpenShift North-South IPsec Implementation Enhancement and GA

            Aniket Bhat added a comment -

            TE session for IPSec N-S feature.


            Anurag Saxena added a comment - edited

            QE/Dev/PM meeting minutes today on GCP support for nmstate:

            1) nmstate is only supported on BM, vSphere and OSP, but nmstate support needs to be delivered on GCP as well.

            In 4.15, QE can limit testing of the nmstate operator on GCP to the IPsec use case only. However, the documentation must clearly tell customers that only IPsec-related configs or limited use cases are supported with nmstate on GCP.

            Some background: Over the last couple of weeks, QE has tested nmstate rpms (custom rpms) and the corresponding IPsec configs on GCP. Results look good, with the exception of a few bugs mentioned in the post-merge task. QE is still waiting for `nmstate-2.2.23-1.el9_2` to land in RHCOS along with rpms (with a few fixes). QE will do additional testing once all required rpms land in production builds.

            2) Dev will lay out a plan to fully support nmstate on cloud platforms like GCP, either in a z-stream or in 4.16; which one is yet to be decided.

            ykashtan Please add anything I missed. cc huirwang grajaiya@redhat.com mifiedle@redhat.com anbhat

            Mike Fiedler added a comment -

            As there is still active dev work for 4.15, moving this back to In Progress.

            Gowrishankar Rajaiyan added a comment -

            zshi@redhat.com, Hi, the status of this Epic is 'Red'; however, the target version is openshift-4.15. Could you please confirm the plan here?

            cc: rhn-support-mfiedler

            Yuval Kashtan added a comment -

            ddharwar@redhat.com
            Currently our implementation excludes Hypershift,
            i.e. we won't support N-S on a Hypershift cluster because there's no MachineConfig for the control plane.

            Potentially we can open a new future feature to cover that gap,
            but do we want to?

            Yuval Kashtan added a comment -

            IDK why it was closed, it was ready for merge!
            I'll reopen.

            Jason Boxman added a comment -

            So is this GA in 4.15, even if additional enhancements are not complete? Thanks!

            Yuval Kashtan added a comment -

            sizing:
            -------
            M
            and with optional stories: L

              ykashtan Yuval Kashtan
              mcurry@redhat.com Marc Curry
              Huiran Wang Huiran Wang
              Jason Boxman Jason Boxman