-
Bug
-
Resolution: Duplicate
-
Major
-
None
-
4.14
-
None
-
No
-
False
Description of problem:
Installing or upgrading to OCP 4.14 on platforms that allow keepalived-managed VIPs for API and Ingress can fail because of the new "<interface-name>:vip" label applied by /etc/kubernetes/static-pod-resources/keepalived/keepalived.conf.tmpl. The label was added by https://issues.redhat.com/browse/OCPBUGS-4370.
The ip-address(8) man page states that labels are limited to 15 characters: https://man7.org/linux/man-pages/man8/ip-address.8.html
No check or safeguard prevents a long interface name such as enp33s0f0np0 from exceeding that limit once the :vip suffix is appended (12 + 4 = 16 characters). When that happens, keepalived fails to apply the VIP and reports this error:
Netlink: error: Numerical result out of range(34), type=RTM_NEWADDR(20), seq=1700238417, pid=0
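The length arithmetic behind this failure can be checked ahead of time. The following is a minimal bash sketch, not part of any OpenShift component: the helper names vip_label, vip_label_fits, and check_all_interfaces are hypothetical, and the only facts it relies on are the ":vip" suffix from the template and the 15-character limit from the man page.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not taken from OpenShift code.

# Maximum label length documented in ip-address(8).
MAX_LABEL_LEN=15

# Build the label the 4.14 keepalived template would produce for an interface.
vip_label() {
    printf '%s:vip\n' "$1"
}

# Succeed (exit 0) only if "<iface>:vip" fits within the documented limit.
vip_label_fits() {
    local label
    label="$(vip_label "$1")"
    [ "${#label}" -le "$MAX_LABEL_LEN" ]
}

# Walk the interfaces listed under /sys/class/net and report any whose
# label would be too long.
check_all_interfaces() {
    local path iface label
    for path in /sys/class/net/*; do
        [ -e "$path" ] || continue
        iface="${path##*/}"
        if ! vip_label_fits "$iface"; then
            label="$(vip_label "$iface")"
            echo "TOO LONG: ${label} (${#label} characters)"
        fi
    done
}
```

Sourcing the file and running check_all_interfaces on a node before an install or upgrade would list any interface that is expected to hit the error. Any interface name longer than 11 characters is affected, because the suffix itself takes 4 of the 15 characters.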
Version-Release number of selected component (if applicable):
4.14
How reproducible:
Install or upgrade to OCP 4.14 on a bare-metal host that has a long interface name, or use a custom interface name such as "bridge-internal". Several of my customers have created bridge interfaces like this so that VMs can share the OpenShift primary interface.
Steps to Reproduce:
[laptop]$ ssh core@ocp-node-1
Red Hat Enterprise Linux CoreOS 414.92.202311061957-0

[core@ocp-node-1 ~]$ ip -brief -4 a s
lo               UNKNOWN        127.0.0.1/8
tun0             UNKNOWN        10.130.0.1/23
enp33s0f0np0     UP             10.15.168.23/24
### API & Ingress VIPs are missing here

[core@ocp-node-1 ~]$ echo -n "enp33s0f0np0:vip" | wc -c
16

[core@ocp-node-1 ~]$ sudo crictl logs $(sudo crictl ps --quiet --label io.kubernetes.container.name=keepalived) 2>&1 | grep error
Fri Nov 17 16:39:52 2023: Netlink: error: Numerical result out of range(34), type=RTM_NEWADDR(20), seq=1700238417, pid=0

[core@ocp-node-1 ~]$ grep label /etc/kubernetes/static-pod-resources/keepalived/keepalived.conf.tmpl
{{ .Cluster.APIVIP }}/{{ .Cluster.VIPNetmask }} label {{ .VRRPInterface }}:vip
{{ .Cluster.IngressVIP }}/{{ .Cluster.VIPNetmask }} label {{ .VRRPInterface }}:vip

[core@ocp-node-1 ~]$ grep label /etc/keepalived/keepalived.conf
10.15.168.68/32 label enp33s0f0np0:vip
10.15.168.69/32 label enp33s0f0np0:vip
Actual results:
keepalived fails to assign the VIP to the interface, and the installation or upgrade is halted.
Expected results:
The openshift-install, baremetal-runtimecfg, and/or machine-config-operator should check whether the label will exceed 15 characters and reduce its length if required.
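One possible shape for such a check is sketched below in bash. This is an illustration of the requested behavior, not the actual fix: the function name shorten_vip_label is hypothetical, and truncating the interface-name part of the label is an assumption. The man page says a label should match the device name or be prefixed with it, so whether a truncated prefix is acceptable to keepalived and to consumers of the label has not been verified here.

```shell
#!/usr/bin/env bash
# Sketch only: shorten_vip_label is a hypothetical helper, and truncation
# is an assumed strategy, not the behavior of any OpenShift component.

# Print a VIP label for the given interface that never exceeds 15
# characters, by trimming the interface-name part and keeping ":vip".
shorten_vip_label() {
    local iface="$1"
    local suffix=":vip"
    local max=15
    # Characters left for the interface name once the suffix is reserved.
    local keep=$(( max - ${#suffix} ))
    printf '%s%s\n' "${iface:0:keep}" "$suffix"
}
```

With this sketch, short names pass through unchanged, while enp33s0f0np0 becomes enp33s0f0np:vip. A drawback of this approach is that two interfaces sharing the same first 11 characters would get the same label.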
Additional info:
https://issues.redhat.com/browse/OCPBUGS-4370
https://github.com/ovn-org/ovn-kubernetes/pull/3552/files
https://github.com/openshift/baremetal-runtimecfg/pull/236
https://github.com/openshift/machine-config-operator/pull/3683
https://github.com/openshift/ovn-kubernetes/pull/1697
- duplicates
-
OCPBUGS-23432 OCP installation its failing because VIP is not being allocated to the bootstrap node
- Closed