OpenShift UX Product Design / PD-1645

Milestone-3: UX OCM UI Support for PidsLimit on ROSA Classic (Day-2)

    • Milestone-3: OCM UI Support for PidsLimit on ROSA Classic
    • XCMSTRAT-382 - Milestone-3: OCM UI Support for PidsLimit on ROSA Classic
    • Admin UXD Sprint 243, Admin UXD sprint 244, Admin UXD Sprint 245

      Figma design file

      Feature Overview (aka. Goal Summary)  

      This feature will introduce Process IDs (PIDs) as a node-level resource for application pods that customers can manage and control.

      Process IDs and the number of processes are a fundamental resource on Linux hosts. Even when other resources such as CPU, storage, and memory are available, a Pod can run out of process IDs and fail.

      This feature will allow customers to set or increase the PID limit per Pod, within what the node allocatable allows. The feature will be delivered across multiple milestones to cover all use cases (cluster level, per-machine-pool level) across different topologies (HCP, Classic).

      1. M1 / XCMSTRAT-110 - API and ROSA CLI support on ROSA Classic
      2. M2 / XCMSTRAT-355 - Support for ROSA HCP 
      3. M3 / XCMSTRAT-382 - Support for all clients (UI, Terraform), per-machine-pool configuration, all allowed PodPidsLimit values
      4. Backlog / XCMSTRAT-383 - Support for day-1 (cluster installation)

      This Jira is scoped to the first milestone: providing cluster-wide configuration on ROSA and OSD on AWS clusters.

      Goals (aka. expected user outcomes)

      • Configure PodPidsLimit for all worker nodes (i.e., all nodes of machine pools; all cluster nodes that are not control plane nodes)
      • PodPidsLimit values from 4,096 (default) to 16,384 (soft limit) are available to all clusters
      • No impacts to the control plane nodes.
      • When not set, the default value provided by the OCP version will be applied.
      • Support on OCP 4.11 and above
      • Customers can use the ROSA CLI (MVP), OCM UI (follow-up), and Terraform (follow-up) to set this value
      • Ability to modify this value on an existing cluster; all nodes will be rebooted one at a time, potentially causing workload disruption
      • ROSA CLI and OCM UI warn that changing this value requires machine pool nodes to reboot, which disrupts applications
      • Ability to set this configuration at the time of cluster creation (follow-up)
      • Support for ROSA clusters and OSD CCS on AWS clusters
      • ROSA and OSD docs updated to explain how to use the feature.
      • OCM includes the field in telemetry to track clusters that override the default value.
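
      The range and reboot-warning goals above can be sketched as a small client-side check. This is a minimal illustration, not the OCM or ROSA CLI implementation: the function name, the dataclass, and the warning text are hypothetical, and treating 4,096 as the lower bound is an assumption drawn from the range stated in the goals.

```python
from dataclasses import dataclass
from typing import Optional

# Bounds taken from the goals in this ticket; treating the default as the
# minimum allowed value is an assumption made for this sketch.
DEFAULT_POD_PIDS_LIMIT = 4096
SOFT_MAX_POD_PIDS_LIMIT = 16384

# Hypothetical wording; the real CLI/UI text is not specified in this ticket.
REBOOT_WARNING = (
    "Changing PodPidsLimit reboots machine pool nodes one at a time "
    "and may disrupt running workloads."
)


@dataclass(frozen=True)
class PidsLimitPlan:
    value: Optional[int]    # None means "leave unset, OCP default applies"
    warning: Optional[str]  # set only when the value actually changes


def plan_pids_limit_change(
    requested: Optional[int], current: Optional[int] = None
) -> PidsLimitPlan:
    """Validate a requested PodPidsLimit and decide whether to warn."""
    if requested is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(requested, bool) or not isinstance(requested, int):
            raise ValueError("PodPidsLimit must be an integer")
        if not DEFAULT_POD_PIDS_LIMIT <= requested <= SOFT_MAX_POD_PIDS_LIMIT:
            raise ValueError(
                f"PodPidsLimit must be between {DEFAULT_POD_PIDS_LIMIT} "
                f"and {SOFT_MAX_POD_PIDS_LIMIT}"
            )
    # Warn only when applying the plan would change the configured value.
    warning = REBOOT_WARNING if requested != current else None
    return PidsLimitPlan(value=requested, warning=warning)
```

      A caller would show `plan.warning` and ask for confirmation before submitting the change, and would skip the prompt when the warning is `None`.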

      Documentation

      • The feature needs to be covered in both the creating and editing machine pools sections, as requested in OSDOCS-6267, i.e., covered in the day-2 workflows.
      • Provide a use case or reasoning for setting this value to something other than the default.
      • Provide a section on considerations including:
        • what happens if the value is not set
        • what happens when the value is updated (rolled out to machines with a reboot, disruptive to workloads)
        • what happens when the configured limit is exhausted (pods restarted/rescheduled?), etc.
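
      For the exhaustion consideration, the documentation could also show readers how to see how close a workload is to its limit. The sketch below assumes a cgroup v2 host, where the pids controller exposes `pids.current` and `pids.max` files. The helper names are hypothetical, and which cgroup directory carries the pod-level limit depends on the node and runtime setup, so the directory is left as a caller-supplied argument.

```python
from pathlib import Path
from typing import Optional


def parse_pids_max(text: str) -> Optional[int]:
    """Parse the contents of pids.max; the literal 'max' means no limit."""
    text = text.strip()
    return None if text == "max" else int(text)


def pids_headroom(current_text: str, max_text: str) -> Optional[int]:
    """Return how many more PIDs can be created, or None if unlimited."""
    limit = parse_pids_max(max_text)
    if limit is None:
        return None
    return limit - int(current_text.strip())


def read_pids_headroom(cgroup_dir: str) -> Optional[int]:
    """Read pids.current and pids.max from a cgroup v2 directory.

    cgroup_dir is an assumption supplied by the caller; this sketch does
    not try to locate the pod's cgroup on its own.
    """
    base = Path(cgroup_dir)
    return pids_headroom(
        (base / "pids.current").read_text(),
        (base / "pids.max").read_text(),
    )
```

      A headroom near zero suggests the workload is a candidate for a higher PodPidsLimit.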

      Additional Information:

      • Opportunity: The standard default kubelet configuration allows only a fixed 4K limit on PIDs per Pod, so workloads that need more PIDs per Pod cannot run on Managed Services. Prospects and customers who run such workloads on self-managed OCP today cannot adopt ROSA because this setting is not configurable.

      References:

      1. Kubelet configuration spec, part of the Machine API: https://docs.openshift.com/container-platform/4.13/rest_api/machine_apis/containerruntimeconfig-machineconfiguration-openshift-io-v1.html#spec-containerruntimeconfig
      2. Kubernetes documentation on per-pod PID limits: https://kubernetes.io/docs/concepts/policy/pid-limiting/

              llyman@redhat.com Lisa Lyman