Type: Bug
Priority: Major
Severity: Moderate
Status: Done
Resolution: Done-Errata
Affects Versions: 4.15, 4.16, 4.17, 4.18
Impact: Quality / Stability / Reliability
Release Note Type: Bug Fix
Description of problem:
When a customer creates an install-config.yaml with the NetworkType set to "OVNkubernetes" (lowercase k), the installer still renders manifests and ignition configs without any complaint, but the miscased value carries unexpected and ultimately fatal consequences.
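As a hedged illustration, a shell-level pre-flight guard such as the sketch below would catch the miscased value before anything is rendered. This is not installer code; it assumes the install-config.yaml sits in the current directory and that OVNKubernetes and OpenShiftSDN are the only valid values in this context:
--
#!/bin/bash
# Hypothetical pre-flight guard, not part of openshift-install: extract the
# networkType value and compare it case-sensitively against the known names.
nt=$(awk '/^[[:space:]]*networkType:/ {print $2; exit}' install-config.yaml)
case "$nt" in
  OVNKubernetes|OpenShiftSDN)
    echo "networkType OK: $nt"
    ;;
  *)
    echo "ERROR: networkType '$nt' is not an exact match (values are case-sensitive)" >&2
    exit 1
    ;;
esac
--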
Version-Release number of selected component (if applicable):
Tested in 4.15.z, 4.16.z, 4.17.z, and 4.18.z.
How reproducible:
Create an install-config.yaml that has OVNkubernetes instead of OVNKubernetes as the networkType:
--
..
networking:
  clusterNetwork:
  - cidr: 172.18.0.0/16
    hostPrefix: 23
  networkType: OVNkubernetes
  machineNetwork:
  - cidr: 192.168.0.0/16
  serviceNetwork:
  - 172.50.0.0/16
..
--
Create the manifests and ignition configs and follow a normal UPI/agnostic installation. After some time the bootstrap node comes up, and kube-api and machine-config-server start normally. The master nodes also come up without issues, but then the consequences start to show:

1. The master config exposed by the machine-config-server contains ovs-configuration.service disabled by default:
--
{
  "contents": "[Unit]\n# Kdump will generate it's initramfs based on the running state when kdump.service run\n# If OVS has already run, the kdump fails to gather a working network config,\n# which prevent network log exports, sush as SSH.\n# See https://issues.redhat.com/browse/OCPBUGS-28239\nAfter=kdump.service\nDescription=Configures OVS with proper host networking configuration\n# This service is used to move a physical NIC into OVS and reconfigure OVS to use the host IP\nRequires=openvswitch.service\nWants=NetworkManager-wait-online.service\nAfter=firstboot-osupdate.target\nAfter=NetworkManager-wait-online.service openvswitch.service network.service nodeip-configuration.service nmstate.service\nBefore=kubelet-dependencies.target node-valid-hostname.service\n\n[Service]\n# Need oneshot to delay kubelet\nType=oneshot\nExecStart=/usr/local/bin/configure-ovs.sh OVNkubernetes\nStandardOutput=journal+console\nStandardError=journal+console\n\n[Install]\nRequiredBy=kubelet-dependencies.target\n",
  "enabled": false,   <<<
  "name": "ovs-configuration.service"
},
--
2. As a result, the master nodes never get the br-ex bridge configured.
3. That in turn causes ovnkube to crash on the nodes, since br-ex is not found:
--
F0403 18:57:28.192030 4448 ovnkube.go:137] failed to run ovnkube: [failed to start network controller: failed to start default network controller: unable to create admin network policy controller, err: could not add Event Handler for anpInformer during admin network policy controller initialization, handler {0x1fcc6e0 0x1fcc3c0 0x1fcc360} was not added to shared informer because it has stopped already, failed to start node network controller: failed to start default node network controller: error looking up gw interface: "br-ex", error: Link not found]
--
4. Because the systemd unit carries the miscased argument, even if the unit is enabled manually, configure-ovs.sh matches neither known SDN, so it returns 0 and exits without configuring anything:
--
ExecStart=/usr/local/bin/configure-ovs.sh OVNkubernetes
Apr 03 19:01:32 master0-upi.testlab.local configure-ovs.sh[5519]: + '[' OVNkubernetes == OVNKubernetes ']'
Apr 03 19:01:32 master0-upi.testlab.local configure-ovs.sh[5519]: + '[' OVNkubernetes == OpenShiftSDN ']'
Apr 03 19:01:32 master0-upi.testlab.local configure-ovs.sh[5519]: ++ handle_exit
Apr 03 19:01:32 master0-upi.testlab.local configure-ovs.sh[5519]: ++ e=0
--
5. Because the master nodes never become Ready, the certificates expire and the installation fails.
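To confirm the first symptom during an affected install, the rendered master config can be queried from the machine-config-server directly. A sketch follows; the bootstrap IP placeholder and the Accept header version are assumptions about the environment, not verified specifics of this case:
--
# Sketch: fetch the rendered master config from the machine-config-server
# (port 22623 during install) and check the ovs-configuration.service state.
curl -sk "https://<bootstrap-ip>:22623/config/master" \
  -H 'Accept: application/vnd.coreos.ignition+json;version=3.2.0' \
  | jq '.systemd.units[] | select(.name == "ovs-configuration.service") | .enabled'
# With this bug present, the query returns: false
--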
Steps to Reproduce:
1. Create manifests and ignition configs with OVNkubernetes as the network type.
2. Deploy a UPI cluster.
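Until the validation is fixed, a one-line workaround is to correct the casing before rendering anything (assuming the default install-config.yaml location):
--
# Fix the miscased value in place before running "openshift-install create manifests".
sed -i 's/networkType: OVNkubernetes/networkType: OVNKubernetes/' install-config.yaml
--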
Actual results:
The cluster never deploys; the installation fails once the node certificates expire.
Expected results:
The installer should refuse to render the manifests: the networkType value is compared case-sensitively, so "OVNkubernetes" does not match "OVNKubernetes" and should be rejected at validation time.
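For contrast with the expected behavior, today the miscased value survives manifest rendering verbatim. A sketch of how this can be observed, assuming the standard manifests directory layout produced by "openshift-install create manifests":
--
# Sketch: the miscased value is carried through into the generated network
# manifest rather than failing validation.
grep networkType manifests/cluster-network-02-config.yml
#   networkType: OVNkubernetes
--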
Additional info:
Links to:
RHEA-2024:11038 OpenShift Container Platform 4.19.z bug fix update