OpenShift Container Platform (OCP) Strategy / OCPSTRAT-2413

[Tech Preview] Support Adding Bare Metal Nodes to OpenShift clusters in platform vSphere



      Feature Overview

      This feature allows users to extend an existing OpenShift Container Platform (OCP) cluster on vSphere by adding bare metal compute nodes. This capability enables a hybrid architecture, combining the convenience of a virtualized control plane with the performance and efficiency of bare metal worker nodes. The feature supports clusters installed using Installer-Provisioned Infrastructure (IPI), User-Provisioned Infrastructure (UPI), and the Assisted Installer, providing a clear path for customers transitioning their workloads from a fully virtualized environment to a hybrid one without requiring a full cluster reinstallation.

      Goals

      The primary goal of this feature is to enable customers to seamlessly integrate bare metal compute nodes into their existing OCP clusters on vSphere. This provides a flexible and scalable solution for workloads that benefit from running on bare metal, such as those with specific hardware requirements or high-performance computing needs.

      As a cluster administrator, I want to add bare metal compute nodes to my existing OpenShift cluster on vSphere so that I can support workloads on bare metal without reinstalling my entire cluster.

      Requirements

      Functional Requirements

      • Node Provisioning: The system must allow for the addition of new bare metal compute nodes to an existing OCP cluster on vSphere. The provisioning process should be initiated by the cluster administrator.
      • Networking: The solution must support multiple subnets for the machine network to accommodate bare metal nodes in a hybrid setup.
      • Workload Migration: The cluster must support the seamless rescheduling of pods from vSphere VM nodes to the newly added bare metal compute nodes.
      • Cluster Upgrade: The hybrid cluster must support standard OpenShift upgrade procedures. This includes both connected and disconnected environments, utilizing an external registry provided by a customer (for disconnected clusters) or Red Hat repositories (for connected clusters).
      • Node Replacement: The solution must provide a procedure for replacing a failed bare metal compute node, following standard OpenShift practices.
      • [Optional] Automation: The provisioning and decommissioning procedures should be fully automated where possible.
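
To illustrate the Networking requirement above, the following is a minimal sketch of a pre-flight check that a bare metal node's IP address falls inside one of the cluster's machine network subnets. The function name, the subnet values, and the node addresses are hypothetical and are not taken from this feature; only Python's standard `ipaddress` module is used.

```python
import ipaddress


def matching_machine_network(node_ip, machine_networks):
    """Return the first machine-network CIDR that contains node_ip, or None."""
    addr = ipaddress.ip_address(node_ip)
    for cidr in machine_networks:
        net = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return cidr
    return None


# Hypothetical values: one subnet for the vSphere VMs, one for bare metal hosts.
MACHINE_NETWORKS = ["192.168.10.0/24", "192.168.20.0/24"]

print(matching_machine_network("192.168.20.15", MACHINE_NETWORKS))
print(matching_machine_network("10.0.0.5", MACHINE_NETWORKS))
```

A node whose address matches none of the configured subnets would be a candidate for a networking review before it is provisioned.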

      Non-Functional Requirements

      • Usability: The process for adding and managing bare metal nodes should be as straightforward and well-documented as possible for a cluster administrator.
      • Maintainability: The solution should be integrated with existing OpenShift management tools and procedures to ensure ease of maintenance and troubleshooting.
      • Reliability: The addition and replacement of nodes must not negatively impact the stability or availability of the existing cluster.
      • Security: The feature must adhere to existing OpenShift security rules and best practices.

      Use Case

      1. A customer has an existing OCP cluster on vSphere, installed using one of the following methods:
        1. Installer-Provisioned Infrastructure (IPI)
        2. User-Provisioned Infrastructure (UPI)
        3. Assisted Installer
        4. Agent-based Installer
      2. The customer uses an independent third-party Container Storage Interface (CSI) driver. The vSphere CSI driver is not in use and has been explicitly disabled by the customer by setting `managementState: Removed` on the vSphere CSI driver's ClusterCSIDriver object.
      3. The customer needs to add bare metal compute nodes to the cluster for new or existing workloads.
      4. The cluster administrator adds new bare metal compute nodes to the cluster.
      5. Pods are successfully rescheduled from the original vSphere nodes to the new bare metal nodes.
      6. The customer is able to perform an in-place upgrade of the hybrid cluster.
      7. The customer is able to replace a failed bare metal compute node following a documented procedure.
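
The storage precondition in step 2 can be sketched as follows. This is an illustrative example, not part of the feature: it builds a merge patch that sets `managementState` to `Removed` and checks a ClusterCSIDriver object (represented as a plain dictionary) for that state. The object name `csi.vsphere.vmware.com` and the helper function names are assumptions made for this sketch.

```python
import json

# Assumption: name of the vSphere ClusterCSIDriver object in the cluster.
VSPHERE_CSI_DRIVER = "csi.vsphere.vmware.com"


def csi_removed_patch():
    """Build a merge patch that sets managementState to Removed."""
    return {"spec": {"managementState": "Removed"}}


def is_vsphere_csi_disabled(cluster_csi_driver):
    """Check whether a ClusterCSIDriver object (as a dict) is set to Removed."""
    spec = cluster_csi_driver.get("spec", {})
    return spec.get("managementState") == "Removed"


# Print the patch as JSON so it can be reviewed before being applied.
print(json.dumps(csi_removed_patch()))
```

A patch like this would typically be applied with a merge-type `oc patch` against the ClusterCSIDriver object. Verify the exact object name on the target cluster before applying it.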

      Scenarios

      Main Success Scenario: Add, Upgrade, and Replace

      1. A cluster administrator adds a new bare metal compute node to their existing OCP cluster on vSphere using a documented provisioning procedure.
      2. The new bare metal node successfully joins the cluster and is marked as Ready.
      3. The cluster administrator migrates workloads by rescheduling pods from an existing vSphere VM node to the new bare metal node.
      4. The cluster administrator removes the now-empty vSphere VM node, completing the decommissioning process.
      5. The administrator performs a standard OpenShift cluster upgrade, and the hybrid cluster with both vSphere and bare metal nodes upgrades without issue.
      6. In a separate event, a bare metal compute node fails and needs to be replaced. The cluster administrator follows the standard OpenShift procedure to replace the failed bare metal node with a new one. The new node successfully joins the cluster.
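
Steps 2 through 4 of the scenario above can be sketched as below, assuming the administrator works with node objects exported as JSON and with the `oc` CLI. The helper names and node names are hypothetical. The readiness check follows the standard Kubernetes node condition layout, and the commands are only assembled, not executed.

```python
def is_node_ready(node):
    """Return True if the node object (as a dict) reports condition Ready=True."""
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status") == "True"
    return False


def decommission_commands(vm_node):
    """Assemble the commands to empty and remove a vSphere VM node."""
    return [
        # Stop new pods from being scheduled onto the VM node.
        ["oc", "adm", "cordon", vm_node],
        # Evict running pods so they are rescheduled onto other nodes.
        ["oc", "adm", "drain", vm_node,
         "--ignore-daemonsets", "--delete-emptydir-data"],
        # Remove the Node object. The VM itself must be cleaned up
        # separately (for example through its Machine on IPI clusters).
        ["oc", "delete", "node", vm_node],
    ]


# Hypothetical node object and node name, for illustration only.
bare_metal_node = {"status": {"conditions": [{"type": "Ready", "status": "True"}]}}
if is_node_ready(bare_metal_node):
    for cmd in decommission_commands("vsphere-worker-0"):
        print(" ".join(cmd))
```

Gating the drain on the new node's Ready condition is a design choice in this sketch: it avoids evicting pods before there is capacity to receive them.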

      Out of Scope

      • Support and validation of the native vSphere Container Storage Interface (CSI) driver in a hybrid cluster with bare metal nodes.
      • Support and validation of any third-party CSI driver. While a customer might use a third-party driver in their environment, this feature will not provide specific validation or support for it.

      Links

              Michal Zasepa (mzasepa)
              Ramon Acedo (racedoro@redhat.com)
              Richard Vanderpool
              Avani Bhatt
              Derrick Ornelas