Type: Bug
Resolution: Done-Errata
Priority: Major
Affects Version/s: 4.15, 4.16, 4.17, 4.18
Quality / Stability / Reliability
Severity: Moderate
Status: Done
Release Note Type: Bug Fix
Description of problem:
When customers create an install-config.yaml with networkType set to "OVNkubernetes" (lowercase "k"), the installer creates manifests and ignition configs without any complaint, but the typo carries unexpected consequences.
Version-Release number of selected component (if applicable):
Tested in 4.15.z, 4.16.z, 4.17.z, and 4.18.z.
How reproducible:
Create an install-config.yaml with OVNkubernetes instead of OVNKubernetes as the networkType:
..
networking:
  clusterNetworks:
  - cidr: 172.18.0.0/16
    hostPrefix: 23
  networkType: OVNkubernetes
  machineNetwork:
  - cidr: 192.168.0.0/16
  serviceNetwork:
  - 172.50.0.0/16
..
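The installer accepts this value without any validation error; a quick way to confirm the typo is carried into the generated assets (a sketch; ./ocp is an example assets directory):
--
# Generate manifests from the install-config and search for the bad value
openshift-install create manifests --dir=./ocp
grep -rn 'OVNkubernetes' ./ocp/manifests/
--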
Create the manifests and ignition configs and follow a normal UPI/agnostic installation. After some time the bootstrap node comes up, and kube-apiserver and machine-config-server start normally. The master nodes also boot without issue, but then the consequences start to show:
1.- The rendered master Ignition config served by machine-config-server has ovs-configuration.service disabled (a way to verify this is sketched after this list):
--
{
"contents": "[Unit]\n# Kdump will generate it's initramfs based on the running state when kdump.service run\n# If OVS has already run, the kdump fails to gather a working network config,\n# which prevent network log exports, sush as SSH.\n# See https://issues.redhat.com/browse/OCPBUGS-28239\nAfter=kdump.service\nDescription=Configures OVS with proper host networking configuration\n# This service is used to move a physical NIC into OVS and reconfigure OVS to use the host IP\nRequires=openvswitch.service\nWants=NetworkManager-wait-online.service\nAfter=firstboot-osupdate.target\nAfter=NetworkManager-wait-online.service openvswitch.service network.service nodeip-configuration.service nmstate.service\nBefore=kubelet-dependencies.target node-valid-hostname.service\n\n[Service]\n# Need oneshot to delay kubelet\nType=oneshot\nExecStart=/usr/local/bin/configure-ovs.sh OVNkubernetes\nStandardOutput=journal+console\nStandardError=journal+console\n\n[Install]\nRequiredBy=kubelet-dependencies.target\n",
"enabled": false, <<<
"name": "ovs-configuration.service"
},
--
2.- This in turn means the master nodes never get the br-ex bridge configured.
3.- ovnkube-node then crashes on the nodes because br-ex is not found (a node-level check is sketched after this list):
--
F0403 18:57:28.192030 4448 ovnkube.go:137] failed to run ovnkube: [failed to start network controller: failed to start default network controller: unable to create admin network policy controller, err: could not add Event Handler for anpInformer during admin network policy controller initialization, handler {0x1fcc6e0 0x1fcc3c0 0x1fcc360} was not added to shared informer because it has stopped already, failed to start node network controller: failed to start default node network controller: error looking up gw interface: "br-ex", error: Link not found]
--
4.- Because the unit embeds the wrong argument, even if it is manually enabled, configure-ovs.sh matches no known network type and exits 0, skipping all configuration (a minimal reproduction is sketched after this list):
--
ExecStart=/usr/local/bin/configure-ovs.sh OVNkubernetes
Apr 03 19:01:32 master0-upi.testlab.local configure-ovs.sh[5519]: + '[' OVNkubernetes == OVNKubernetes ']'
Apr 03 19:01:32 master0-upi.testlab.local configure-ovs.sh[5519]: + '[' OVNkubernetes == OpenShiftSDN ']'
Apr 03 19:01:32 master0-upi.testlab.local configure-ovs.sh[5519]: ++ handle_exit
Apr 03 19:01:32 master0-upi.testlab.local configure-ovs.sh[5519]: ++ e=0
5.- As the master nodes never become Ready, certificates expire and the installation fails.
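For finding 1, the config served by the machine-config-server can be inspected directly (a hedged sketch; api-int.example.com stands in for the cluster's internal API hostname, and the Ignition spec version may differ per release):
--
# Fetch the rendered master config from the MCS (port 22623) and print
# the enabled flag of ovs-configuration.service
curl -sk -H 'Accept: application/vnd.coreos.ignition+json;version=3.2.0' \
  https://api-int.example.com:22623/config/master \
  | jq '.systemd.units[] | select(.name == "ovs-configuration.service") | .enabled'
--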
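For finding 3, the missing bridge can be confirmed on an affected node (sketch; master0 is a placeholder node name):
--
# br-ex should exist on every OVN-Kubernetes node
oc debug node/master0 -- chroot /host ip link show br-ex
# On a broken node this fails with: Device "br-ex" does not exist.
--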
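For finding 4, the silent fall-through is a plain case-sensitive string comparison. A minimal sketch of the logic visible in the trace above (not the verbatim /usr/local/bin/configure-ovs.sh):
--
#!/bin/bash
network_type="$1"   # the broken unit passes "OVNkubernetes"
if [ "$network_type" == "OVNKubernetes" ]; then
  echo "would configure br-ex here"
elif [ "$network_type" == "OpenShiftSDN" ]; then
  echo "would tear down OVN configuration here"
fi
# No else branch: "OVNkubernetes" matches neither test, so the script
# does nothing and exits 0, which looks like success to systemd
exit 0
--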
Steps to Reproduce:
1. Create manifests and ignition with OVNkubernetes as network type.
2. Deploy a UPI cluster.
Actual results:
Cluster never deploys.
Expected results:
The installer should refuse to render the manifests: the networkType comparison is case-sensitive, so "OVNkubernetes" does not match "OVNKubernetes" and should be rejected as an invalid value.
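Until the installer enforces this, a client-side preflight in the same spirit can catch the typo before manifests are rendered (a sketch assuming a simple, single-document install-config.yaml):
--
#!/bin/bash
# Reject any networkType that is not an exact, case-sensitive match
nt=$(awk '$1 == "networkType:" {print $2}' install-config.yaml)
case "$nt" in
  OVNKubernetes|OpenShiftSDN) echo "networkType OK: $nt" ;;
  *) echo "ERROR: invalid networkType '$nt' (the comparison is case-sensitive)" >&2
     exit 1 ;;
esac
--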
Additional info:
Links to: RHEA-2024:11038 (OpenShift Container Platform 4.19.z bug fix update)