Bug
Resolution: Done-Errata
Normal
4.12.0
Quality / Stability / Reliability
Agent Sprint 233, Sprint 235, Sprint 236, Sprint 238
Release Note Not Required
Description of problem:
This may be something we want to either validate or document. It was initially found at a customer site, but I've also confirmed it happens with just a compact config with no workers.
The customer created an agent-config.yaml with 2 worker nodes but did not set the replicas in install-config.yaml, i.e. they did not set:
compute:
- hyperthreading: Enabled
  name: worker
  replicas: {{ num_workers }}
This resulted in an install failure because, when replicas is not defined, 3 worker replicas are created by default:
https://github.com/openshift/installer/blob/master/pkg/types/defaults/machinepools.go#L11
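The defaulting described above can be sketched roughly as follows. This is an illustrative Python sketch, not the installer's Go code: the function name effective_worker_replicas and the constant DEFAULT_WORKER_REPLICAS are hypothetical, and install-config.yaml is assumed to have already been parsed into a plain dict.

```python
# Hypothetical sketch of the defaulting described in this report. The real
# logic lives in the installer's Go code; names here are illustrative only.
DEFAULT_WORKER_REPLICAS = 3  # default cited in this report


def effective_worker_replicas(install_config: dict) -> int:
    """Return the worker replica count that applies after defaulting."""
    for pool in install_config.get("compute") or []:
        if pool.get("name") == "worker" and pool.get("replicas") is not None:
            return int(pool["replicas"])
    # No compute section, no worker pool, or replicas left unset.
    return DEFAULT_WORKER_REPLICAS


# compute section omitted entirely, as in the compact reproduction
print(effective_worker_replicas({"controlPlane": {"name": "master", "replicas": 3}}))
# replicas set explicitly to match two worker hosts
print(effective_worker_replicas({"compute": [{"name": "worker", "replicas": 2}]}))
```

With the compute section omitted the sketch returns 3, which is why a config that defines fewer worker hosts never reaches the expected host count. Setting replicas explicitly (including 0 for a compact cluster) avoids the default.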
See the attached console screenshot showing that the expected number of hosts doesn't match the actual.
I've also reproduced this with a compact config. The install fails because start-cluster-installation.sh waits for 6 hosts (3 control plane hosts plus the 3 default workers) while only 3 hosts exist:
[core@master-0 ~]$ sudo systemctl status start-cluster-installation.service
● start-cluster-installation.service - Service that starts cluster installation
Loaded: loaded (/etc/systemd/system/start-cluster-installation.service; enabled; vendor preset: enabled)
Active: activating (start) since Wed 2023-03-15 14:40:04 UTC; 3min 41s ago
Main PID: 3365 (start-cluster-i)
Tasks: 5 (limit: 101736)
Memory: 1.7M
CGroup: /system.slice/start-cluster-installation.service
├─3365 /bin/bash /usr/local/bin/start-cluster-installation.sh
├─5124 /bin/bash /usr/local/bin/start-cluster-installation.sh
├─5132 /bin/bash /usr/local/bin/start-cluster-installation.sh
└─5138 diff /tmp/tmp.vIq1jH9Vf2 /etc/issue.d/90_start-install.issue

Mar 15 14:42:54 master-0 start-cluster-installation.sh[3365]: Waiting for hosts to become ready for cluster installation...
Mar 15 14:43:04 master-0 start-cluster-installation.sh[4746]: Hosts known and ready for cluster installation (3/6)
Mar 15 14:43:04 master-0 start-cluster-installation.sh[3365]: Waiting for hosts to become ready for cluster installation...
Mar 15 14:43:15 master-0 start-cluster-installation.sh[4980]: Hosts known and ready for cluster installation (3/6)
Mar 15 14:43:15 master-0 start-cluster-installation.sh[3365]: Waiting for hosts to become ready for cluster installation...
Mar 15 14:43:25 master-0 start-cluster-installation.sh[5026]: Hosts known and ready for cluster installation (3/6)
Mar 15 14:43:25 master-0 start-cluster-installation.sh[3365]: Waiting for hosts to become ready for cluster installation...
Mar 15 14:43:35 master-0 start-cluster-installation.sh[5079]: Hosts known and ready for cluster installation (3/6)
Mar 15 14:43:35 master-0 start-cluster-installation.sh[3365]: Waiting for hosts to become ready for cluster installation...
Mar 15 14:43:45 master-0 start-cluster-installation.sh[5124]: Hosts known and ready for cluster installation (3/6)
Since the compute section in install-config.yaml is optional, we can't assume that it will be there:
https://github.com/openshift/installer/blob/master/pkg/types/installconfig.go#L126
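One possible shape for the validation suggested in this report is sketched below. It is only a sketch under stated assumptions: both YAML files are assumed to be already parsed into dicts, every host in agent-config.yaml is assumed to carry an explicit role of master or worker, a default of 3 is assumed for any unset replicas field, and the names check_host_counts and DEFAULT_REPLICAS are hypothetical rather than taken from the installer.

```python
# Hypothetical validation sketch; not the installer's implementation.
DEFAULT_REPLICAS = 3  # assumed default when replicas is not set


def _replicas(pool):
    """Replica count for a machine pool dict, falling back to the default."""
    if pool and pool.get("replicas") is not None:
        return int(pool["replicas"])
    return DEFAULT_REPLICAS


def check_host_counts(install_config: dict, agent_config: dict) -> list:
    """Return human-readable mismatches between the two configs."""
    compute = install_config.get("compute") or []
    worker_pool = next((p for p in compute if p.get("name") == "worker"), None)
    expected = {
        "master": _replicas(install_config.get("controlPlane")),
        "worker": _replicas(worker_pool),
    }
    hosts = agent_config.get("hosts") or []
    errors = []
    for role, want in expected.items():
        # Assumes each host entry has an explicit "role" key.
        have = sum(1 for h in hosts if h.get("role") == role)
        if have != want:
            errors.append(
                f"{role}: install-config expects {want} replicas, "
                f"agent-config defines {have} hosts"
            )
    return errors


# Compact reproduction: three master hosts, no compute section.
compact_hosts = {"hosts": [{"hostname": f"master-{i}", "role": "master"} for i in range(3)]}
for problem in check_host_counts({"controlPlane": {"replicas": 3}}, compact_hosts):
    print(problem)
```

For the compact reproduction this reports a worker mismatch (3 expected, 0 defined), which corresponds to the 3/6 host count in the log above. A real check would also need to handle hosts without an explicit role, which this sketch does not attempt.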
Version-Release number of selected component (if applicable):
4.12
How reproducible:
Steps to Reproduce:
1. Remove the compute section from install-config.yaml
2. Do an install
3. See the failure
Actual results:
Expected results:
Additional info:
blocks: OCPBUGS-14432 Installation fails if < 3 workers defined and number of compute replicas not set (Closed)
is cloned by: OCPBUGS-14432 Installation fails if < 3 workers defined and number of compute replicas not set (Closed)
links to: RHEA-2023:5006 (rpm)