  1. OpenShift Top Level Product Strategy
  2. OCPPLAN-9764

Behavior detection driven recommenders in VPA


    • Type: Feature
    • Resolution: Done
    • Priority: Major


      Epic Goal

      • Today, VPA recommends CPU and memory requests and limits using a single method: it predicts future usage from the usage percentile observed in a past time window.
      • The goal is to support multiple customized recommenders in VPA that apply statistical methods to detect resource usage patterns and recommend CPU and memory requests and limits for auto-scaling.
      • The default VPA relies purely on historical observations. It fails when resource usage is trending, changes periodically, or has occasional spikes, resulting in significant over-provisioning and many OOM kills for microservices.
      • As an OCP admin, I want to learn the different types of resource usage behavior and apply different algorithms to improve predictions of cluster resource utilization (CPU and memory), which can significantly reduce over-provisioning and OOM kills in auto-scaling.
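
To illustrate why a percentile over a past window lags behind trending usage, here is a minimal sketch. It is not the VPA recommender's actual code: the function names, the nearest-rank percentile, the window size, and the usage numbers are all assumptions chosen for illustration.

```python
import math

def percentile(samples, q):
    """Nearest-rank percentile: the smallest sample that is >= q% of all samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

def percentile_recommendation(usage, window, q=90):
    """Recommend a request from the q-th percentile of the last `window` samples."""
    return percentile(usage[-window:], q)

# Hypothetical memory usage in MiB that grows by 10 MiB per sample.
trending_usage = [100 + 10 * t for t in range(20)]

recommended = percentile_recommendation(trending_usage, window=10, q=90)
next_usage = 100 + 10 * 20  # usage one step after the observed window

# The recommendation is bounded by what was already observed, so on a
# steadily growing series it falls below the next actual usage.
print(recommended, next_usage)
```

On this made-up series the recommendation stays below the next sample, which is the kind of shortfall that would lead to an OOM kill if it were applied as a memory limit.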

      Why is this important?

       

      Scenarios

      1. IBM Watson clusters do not use VPA and are significantly over-provisioning resources.
      2. The majority of IBM Watson workloads exhibit stationary, trending, or periodic resource usage patterns. For these, statistical approaches such as stationarity detection, trend detection, and periodicity detection can predict resource usage significantly better than naive approaches that predict future usage from the percentile usage observed in rolling time windows.
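
A rough sketch of how a recommender might separate stationary from trending usage and extrapolate a trend. This is a crude stand-in based on a least-squares line fit, not the detection algorithm used in the previous work; `linear_fit`, `classify`, `forecast`, and the 10% drift threshold are hypothetical.

```python
def linear_fit(series):
    """Ordinary least-squares fit y = intercept + slope * t for t = 0..n-1."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two samples")
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return slope, y_mean - slope * t_mean

def classify(series, drift_threshold=0.1):
    """Label the series 'trending' when the fitted drift across the window
    exceeds drift_threshold as a fraction of the mean, else 'stationary'."""
    slope, _ = linear_fit(series)
    mean = sum(series) / len(series)
    drift = abs(slope) * (len(series) - 1)
    return "trending" if drift > drift_threshold * abs(mean) else "stationary"

def forecast(series, steps_ahead=1):
    """Extrapolate the fitted line steps_ahead samples past the window."""
    slope, intercept = linear_fit(series)
    return intercept + slope * (len(series) - 1 + steps_ahead)

# Hypothetical usage series in MiB.
trending = [100 + 10 * t for t in range(20)]
flat = [200, 202, 198, 201, 199, 200]

print(classify(trending), forecast(trending))
print(classify(flat))
```

A recommender built this way could keep the percentile method for series labeled stationary and switch to the extrapolated value for series labeled trending. A production detector would more likely rely on an established statistical test than on a fixed threshold.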

      Acceptance Criteria

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • ...

      Dependencies (internal and external)

      1. External: The Kubernetes Auto-scaling community needs to approve our proposed changes to support customized VPA recommenders.
      2. Internal: The OpenShift VPA operator needs to support customized VPA recommenders.

      Previous Work (Optional):

      1. Stationarity/trending detection based resource requests/limits recommenders for container vertical auto-scaling.
      2. Numerical evaluation of the effectiveness of stationarity/trending detection based recommenders.

      Open questions:

      1. Changes needed to support customized VPA recommenders.
      2. The current trending detection algorithm needs to be improved.
      3. The periodicity detection based recommender is missing.
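
For the missing periodicity detection, one common starting point is the autocorrelation of the usage series. The sketch below is an assumption about how such a detector could begin, not a design from this feature; `autocorrelation`, `dominant_period`, and the 0.5 threshold are hypothetical.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((y - mean) ** 2 for y in series)
    if var == 0:
        return 0.0  # a constant series has no periodic structure
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var

def dominant_period(series, min_lag=2, threshold=0.5):
    """Return the lag with the highest autocorrelation above threshold,
    or None when no lag qualifies."""
    best_lag, best_score = None, threshold
    for lag in range(min_lag, len(series) // 2 + 1):
        score = autocorrelation(series, lag)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Hypothetical CPU usage (millicores) repeating every 8 samples.
periodic = [100 + 50 * math.sin(2 * math.pi * t / 8) for t in range(64)]
constant = [5.0] * 32

print(dominant_period(periodic))
print(dominant_period(constant))
```

A trending series also shows high autocorrelation at short lags, so a detector like this would need the trend removed first. With a period in hand, a recommender could base its prediction on samples from the same phase of earlier cycles.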

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>

        1. Statistics_VPA_Problem.png
          817 kB
          Gaurav Singh
        2. VPA.png
          698 kB
          Gaurav Singh

              gausingh@redhat.com Gaurav Singh
              Chen Wang (Inactive)
              Mrunal Patel