OpenShift Container Platform (OCP) Strategy / OCPSTRAT-929

[etcd] Vertical scaling on baremetal/UPI clusters

    • Type: Feature
    • Resolution: Unresolved
    • Priority: Normal
    • etcd
    • 50% To Do, 0% In Progress, 50% Done
    • Backlog Refinement

      Feature Overview (aka. Goal Summary)

      Currently, the vertical scaling feature relies on machine deletion hooks provided by the Machine API to scale up and scale down control-plane machines.

      The ControlPlaneMachineSet Operator is also required to manage the deletion and creation of machines.

      In environments with User Provisioned Infrastructure (UPI) and a non-functional Machine API, scale-up and scale-down are manual: users must add and remove both the machines and the etcd members themselves.

      For background, see: https://github.com/openshift/enhancements/blob/master/enhancements/etcd/protecting-etcd-quorum-during-control-plane-scaling.md#non-functional-machine-api-scenarios
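
      To illustrate why the order of manual add and remove operations matters for quorum, here is a minimal sketch of the standard etcd majority arithmetic. The function names are hypothetical and chosen only for this example; it models replacing one healthy member of a three-member cluster and is not taken from the enhancement itself.

```python
def quorum(members: int) -> int:
    """Votes needed for an etcd cluster of the given size to commit writes."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while the cluster still keeps quorum."""
    return members - quorum(members)


def tolerance_during_replacement(start: int, add_first: bool) -> list[int]:
    """Fault tolerance before, during, and after replacing one healthy member.

    add_first=True  models scale-up followed by scale-down.
    add_first=False models scale-down followed by scale-up.
    """
    middle = start + 1 if add_first else start - 1
    return [fault_tolerance(n) for n in (start, middle, start)]


if __name__ == "__main__":
    # Adding the new member first keeps one failure of headroom throughout.
    print(tolerance_during_replacement(3, add_first=True))   # [1, 1, 1]
    # Removing first leaves a two-member cluster that tolerates no failure.
    print(tolerance_during_replacement(3, add_first=False))  # [1, 0, 1]
```

      Under these assumptions, removing a member before its replacement has joined leaves a window in which any further failure loses quorum, which is the risk a manual procedure has to account for.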

      For UPI or baremetal scenarios, we have documented the following steps:
      https://docs.openshift.com/container-platform/4.13/backup_and_restore/control_plane_backup_and_restore/replacing-unhealthy-etcd-member.html#restore-replace-stopped-baremetal-etcd-member_replacing-unhealthy-etcd-member

       

      However, step 4 and onward rely on the Machine API to provision a new machine. The documentation itself notes:

      "If you are running installer-provisioned infrastructure, or you used the Machine API to create your machines, follow these steps. Otherwise, you must create the new control plane node using the same method that was used to originally create it."

      Goals (aka. expected user outcomes)

      The goal of this Feature is to validate the steps for vertical scaling of control plane nodes in environments where the Machine API is not available, and to document the full procedure for provisioning and removing a node to replace an unhealthy member.
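
      As a starting point for that outline, the following sketch assembles one possible ordering of the manual steps as a dry-run plan; it only builds command strings and runs nothing. The `Step` class, the `replacement_plan` function, and the node names are hypothetical. The `oc` and `etcdctl` invocations follow the linked replacement documentation as best understood, but the list is not exhaustive (quorum guard handling, for example, is omitted) and has not been validated on a UPI cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    description: str
    command: str  # empty when the action happens outside the cluster API


def replacement_plan(healthy_node: str, failed_node: str, member_id: str) -> list[Step]:
    """Build an ordered, dry-run plan for replacing an unhealthy etcd member."""
    ns = "openshift-etcd"
    etcd_pod = f"etcd-{healthy_node}"
    # Per-node etcd secrets that must be deleted so they can be regenerated.
    secrets = [f"{prefix}-{failed_node}"
               for prefix in ("etcd-peer", "etcd-serving", "etcd-serving-metrics")]
    return [
        Step("List etcd members from a healthy etcd pod",
             f"oc rsh -n {ns} {etcd_pod} etcdctl member list -w table"),
        Step("Remove the unhealthy member",
             f"oc rsh -n {ns} {etcd_pod} etcdctl member remove {member_id}"),
        Step("Delete the old member's secrets",
             f"oc delete secret -n {ns} " + " ".join(secrets)),
        Step("Delete the failed node object (assumption for the UPI case)",
             f"oc delete node {failed_node}"),
        Step("Provision the new control plane node using the same method "
             "that originally created it", ""),
        Step("List pending certificate signing requests", "oc get csr"),
        Step("Approve each pending CSR for the new node",
             "oc adm certificate approve <csr_name>"),
        Step("Confirm that the etcd pods are running",
             f"oc -n {ns} get pods -l k8s-app=etcd"),
    ]


if __name__ == "__main__":
    plan = replacement_plan("master-0", "master-1", "<member_id>")
    for number, step in enumerate(plan, start=1):
        print(f"{number}. {step.description}")
        if step.command:
            print(f"   $ {step.command}")
```

      Keeping the out-of-band provisioning step as an entry with no command makes the gap explicit: it is the part the existing documentation delegates to the Machine API.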

      Requirements (aka. Acceptance Criteria):

      A list of specific needs or objectives that a feature must deliver in order to be considered complete. Be sure to include nonfunctional requirements such as security, reliability, performance, maintainability, scalability, usability, etc. Initial completion during Refinement status.

      Out of Scope

      High-level list of items that are out of scope. Initial completion during Refinement status.

      Background

      Provide any additional context needed to frame the feature. Initial completion during Refinement status.

      Customer Considerations

      Provide any additional customer-specific considerations that must be made when designing and delivering the Feature. Initial completion during Refinement status.

      Documentation Considerations

      Provide information that needs to be considered and planned so that documentation will meet customer needs. If the feature extends existing functionality, provide a link to its current documentation. Initial completion during Refinement status.

      Interoperability Considerations

      Which other projects, including ROSA/OSD/ARO, and versions in our portfolio does this feature impact? What interoperability test scenarios should be factored by the layered products? Initial completion during Refinement status.

            William Caban (wcabanba@redhat.com)
            Matthew Werner