-
Spike
-
Resolution: Done
-
Critical
Which 4.y.z to 4.y'.z' updates increase vulnerability?
We identified this problem when upgrading a cluster with IPsec enabled from 4.14 to 4.15, but it may also exist on a fresh cluster when IPsec is enabled on 4.14 or even on earlier versions.
Which types of clusters?
So far this problem has been found only on vSphere clusters in which a bond interface with IPsec hardware offload enabled is used as the primary interface.
What is the impact? Is it serious enough to warrant removing update recommendations?
East-west traffic for the cluster is entirely broken, and the cluster remains unusable until IPsec is disabled in the network operator config.
How involved is remediation?
If the user still wants IPsec enabled for the cluster, they can restore the cluster by disabling esp-tx-csum-hw-offload on the bond and its slave interfaces.
For example, this can be done by rolling out the following machine configs.
for role in master worker; do
cat >> "${SHARED_DIR}/manifest_${role}-esp-csum-disable.yml" <<-EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: $role
  name: 80-$role-esp-csum-disable
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
      - name: disable.esp.csum.service
        enabled: true
        contents: |
          [Unit]
          Description=Disable ESP csum hw offload
          After=ovs-configuration.service
          Before=kubelet-dependencies.target node-valid-hostname.service
          [Service]
          Type=oneshot
          ExecStart=/usr/bin/bash -x -c "ethtool -K bond0 esp-tx-csum-hw-offload off && ethtool -K ens192 esp-tx-csum-hw-offload off && ethtool -K ens224 esp-tx-csum-hw-offload off"
          StandardOutput=journal+console
          StandardError=journal+console
          [Install]
          WantedBy=network-online.target
EOF
done
This procedure requires a cluster reboot, so it takes a few minutes to restore the cluster.
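After the nodes come back, it may be worth confirming that the offload setting actually took effect. The following is a minimal sketch, not part of the original remediation: the helper names esp_csum_state and report_esp_csum are hypothetical, and it assumes ethtool -k prints the feature as a line of the form "esp-tx-csum-hw-offload: off". The interface names bond0, ens192 and ens224 are the ones used in the machine config above and will differ on other clusters.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original issue): reads `ethtool -k <iface>`
# output on stdin and prints the state ("on" or "off") of esp-tx-csum-hw-offload.
esp_csum_state() {
  awk -F': *' '$1 ~ /esp-tx-csum-hw-offload$/ { split($2, a, " "); print a[1] }'
}

# Hypothetical wrapper: prints "<iface>: <state>" for each interface argument.
# The state is empty if the interface or the feature is not present.
report_esp_csum() {
  local dev
  for dev in "$@"; do
    printf '%s: %s\n' "$dev" "$(ethtool -k "$dev" 2>/dev/null | esp_csum_state)"
  done
}

# On a node, using the interface names from the machine config above:
#   report_esp_csum bond0 ens192 ens224
```

Run on each node (for example from a debug shell on the host), every listed interface should report off if the systemd unit ran successfully.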
Is this a regression?
I don't see a statement anywhere that says OCP supports IPsec hardware offload, so this may not be a regression.
- blocks
-
OCPBUGS-25312 [OVN][IPSEC EW]Upgrade from 4.14->4.15 failed for Vsphere
- Closed
- is duplicated by
-
SDN-4482 Impact statement request for OCPBUGS-22185 [OVN IPsec]One master node cannot access the pod on one worker node
- Closed