Feature
Resolution: Unresolved
Minor
Product / Portfolio Work
Feature Overview
This feature lets MachineSets reference a predefined failure domain from the Infrastructure Custom Resource (CR) by name. Currently, administrators must manually duplicate infrastructure details (datacenter, datastore, compute cluster, network, and resource pool) in every MachineSet definition. A direct reference (e.g., failureDomain: zone-a) reduces configuration sprawl, minimizes manual entry errors, and keeps multi-zone high-availability (HA) deployments on VMware vSphere consistent.
Goals
- Operational Efficiency: Enable users to manage infrastructure parameters in a single location (the Infrastructure CR) rather than across dozens of MachineSets.
- Persona: The primary user is the Cluster Administrator responsible for scaling and maintaining multi-zone OpenShift clusters.
- Improved Consistency: Eliminate configuration drift, where MachineSets in the same logical zone point to different resources because of manual typos.
- Standardization: Align the MachineSet experience with the Control Plane Machine Set (CPMS), which already utilizes failure domains for high availability.
Requirements
- Functional:
  - The MachineSet API must be extended to include a failureDomain reference field for vSphere.
  - The system must resolve the named failure domain to its constituent infrastructure parameters (datacenter, computeCluster, resourcePool, datastore, networks, etc.) during machine provisioning.
  - The feature must support all installation types, including Installer-Provisioned Infrastructure (IPI) and User-Provisioned Infrastructure (UPI), provided the Infrastructure CR is correctly populated.
- Technical Architecture:
  - Implementation should prioritize Cluster API (CAPI) via the cluster-api-provider-vsphere if available in the target version.
  - If CAPI is not yet the default for the platform version, the functionality must be implemented in the Machine API (MAPI) machine-api-provider-vsphere with a clear migration path to CAPI.
- Non-Functional:
  - Backward Compatibility: Existing MachineSets with explicitly defined infrastructure fields must continue to function without modification.
  - Reliability: The provider must handle cases where a referenced failure domain is missing or renamed, providing clear error status in the MachineSet conditions.
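The resolution and reliability requirements above can be sketched as a small lookup. This is an illustrative model only, not the provider's implementation: the `FailureDomain`, `MachineSetSpec`, and `Condition` types, the `resolve_placement` function, and the condition type and reason strings are all hypothetical names chosen for the example. It assumes the failure domains have already been read from the Infrastructure CR into a list. When both a reference and explicit fields are present, the sketch uses only the reference, since precedence is still an open question.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FailureDomain:
    """Simplified stand-in for one failure domain entry in the Infrastructure CR."""
    name: str
    datacenter: str
    compute_cluster: str
    resource_pool: str
    datastore: str
    networks: Tuple[str, ...]


@dataclass
class MachineSetSpec:
    """Hypothetical placement section of a vSphere MachineSet."""
    failure_domain: Optional[str] = None              # proposed reference field
    explicit: Dict[str, object] = field(default_factory=dict)  # today's inline fields


@dataclass
class Condition:
    """Hypothetical status condition surfaced on the MachineSet."""
    type: str
    status: str
    reason: str
    message: str


def resolve_placement(
    spec: MachineSetSpec, failure_domains: List[FailureDomain]
) -> Tuple[Optional[Dict[str, object]], Condition]:
    """Return the placement to provision with, plus a condition describing the outcome."""
    if spec.failure_domain is None:
        # Backward compatibility: no reference, so explicit fields are used unchanged.
        return dict(spec.explicit), Condition(
            "FailureDomainResolved", "True", "ExplicitPlacement",
            "no failureDomain reference; using explicit fields",
        )

    by_name = {fd.name: fd for fd in failure_domains}
    fd = by_name.get(spec.failure_domain)
    if fd is None:
        # Missing or renamed failure domain: report it instead of provisioning.
        return None, Condition(
            "FailureDomainResolved", "False", "FailureDomainNotFound",
            f'failure domain "{spec.failure_domain}" not found in Infrastructure CR',
        )

    placement = {
        "datacenter": fd.datacenter,
        "computeCluster": fd.compute_cluster,
        "resourcePool": fd.resource_pool,
        "datastore": fd.datastore,
        "networks": list(fd.networks),
    }
    return placement, Condition(
        "FailureDomainResolved", "True", "Resolved",
        f'resolved failure domain "{fd.name}"',
    )
```

Returning a condition alongside the placement keeps the error path explicit: a caller can copy the condition into the MachineSet status whether or not resolution succeeded.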
Use Case
Scenario: Scalable Multi-Zone Management
"As a Cluster Administrator, I want to create a new MachineSet by simply referencing zone-beta so that I don't have to look up and copy-paste the specific vCenter folder paths, datastore names, and network IDs for that specific rack."
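To make the scenario concrete, here is a minimal sketch of the manifest an administrator might generate for zone-beta. The `build_machineset` helper is hypothetical, and the placement of the `failureDomain` key inside the provider spec is an assumption for illustration; the actual field name and location are still to be designed. Other required provider spec fields (template, CPU, memory, and so on) are omitted for brevity.

```python
import json


def build_machineset(name: str, cluster_id: str, replicas: int, failure_domain: str) -> dict:
    """Build a MachineSet manifest that references a failure domain by name.

    No datacenter, datastore, resource pool, or network values appear here;
    in this sketch they would be resolved from the Infrastructure CR.
    """
    labels = {
        "machine.openshift.io/cluster-api-cluster": cluster_id,
        "machine.openshift.io/cluster-api-machineset": name,
    }
    return {
        "apiVersion": "machine.openshift.io/v1beta1",
        "kind": "MachineSet",
        "metadata": {"name": name, "namespace": "openshift-machine-api"},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "providerSpec": {
                        "value": {
                            "apiVersion": "machine.openshift.io/v1beta1",
                            "kind": "VSphereMachineProviderSpec",
                            # Assumption: hypothetical location of the proposed field.
                            "failureDomain": failure_domain,
                        }
                    }
                },
            },
        },
    }


if __name__ == "__main__":
    # "mycluster" and the MachineSet name are placeholder values.
    manifest = build_machineset("mycluster-worker-zone-beta", "mycluster", 3, "zone-beta")
    print(json.dumps(manifest, indent=2))
```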
Questions to Answer (Engineering/Design)
- Precedence: If a user defines both a failureDomain reference AND an explicit datastore in the same MachineSet, which takes priority?
- Validation: Should an admission webhook be implemented to reject MachineSets that reference non-existent failure domains?
- CAPI Alignment: How does this map to the VSphereMachineTemplate in the upstream Cluster API provider?
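The precedence and validation questions can be compared side by side with a short sketch. It does not recommend an answer; it only shows what each candidate policy would do. The `Precedence` enum, `ConflictError`, `merge_placement`, and `validate_reference` are hypothetical names, and the validation function stands in for the decision logic an admission webhook might run, not for a real webhook.

```python
from enum import Enum
from typing import Dict, Iterable, List, Optional


class Precedence(Enum):
    """Candidate answers to the precedence question (none is decided)."""
    FAILURE_DOMAIN_WINS = "failure-domain-wins"
    EXPLICIT_WINS = "explicit-wins"
    REJECT_CONFLICT = "reject-conflict"


class ConflictError(ValueError):
    """Raised when explicit fields disagree with the referenced failure domain."""


def merge_placement(
    resolved: Dict[str, object], explicit: Dict[str, object], policy: Precedence
) -> Dict[str, object]:
    """Combine failure-domain values with explicit MachineSet fields under a policy."""
    conflicts = sorted(
        key for key in explicit if key in resolved and explicit[key] != resolved[key]
    )
    if policy is Precedence.REJECT_CONFLICT and conflicts:
        raise ConflictError("conflicting fields: " + ", ".join(conflicts))
    if policy is Precedence.EXPLICIT_WINS:
        # Later entries override earlier ones, so explicit values take priority.
        return {**resolved, **explicit}
    # FAILURE_DOMAIN_WINS, or REJECT_CONFLICT with no conflicts.
    return {**explicit, **resolved}


def validate_reference(name: Optional[str], known_names: Iterable[str]) -> List[str]:
    """Admission-style check: an empty list means the MachineSet would be admitted."""
    known = sorted(known_names)
    if name is None or name in known:
        return []
    return [f'failureDomain "{name}" not found; known failure domains: {", ".join(known)}']
```

For example, with a resolved datastore of `ds-a` and an explicit datastore of `ds-override`, `EXPLICIT_WINS` yields `ds-override`, `FAILURE_DOMAIN_WINS` yields `ds-a`, and `REJECT_CONFLICT` raises `ConflictError`.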
Out of Scope
Links
- clones: OCPSTRAT-2934 [SPIKE] Support for vCenter replacement (New)
- is cloned by: OCPSTRAT-2936 Installer-Provisioned Infrastructure (IPI) Support for Azure VMware Solution (AVS) (New)