  1. OpenShift Edge Enablement
  2. OCPEDGE-795

Confirmation and Prioritization of Resource Saving Potentials in SNO vDUs

    • Epic
    • Resolution: Obsolete
    • Normal
    • SNO
    • Confirmation and Prioritization of Resource Saving Potentials in SNO vDUs
    • Future Sustainability

      OCP/Telco Definition of Done
      Epic Template descriptions and documentation.

      Epic Goal

      • Confirm the potential savings and gains in our measurements by checking them against the previous analysis in Telco.
      • Prioritize by possible savings and escalate Epics where necessary to achieve a better overall footprint.
      • Use data to argue for higher priorities and create escalation paths based on data-driven priorities.
      • Create further Epics in https://issues.redhat.com/browse/OCPSTRAT-1087 for any additional analysis that is required.

      Why is this important?

      • The savings outlined by Telco signal key areas of improvement in communication with the API Server, which usually lie in a few teams' hands. To give these teams better insight into the priority, and into why the consumption is high, we have to provide them with data and insights from our measurements.
      • Manual tracking of the KPIs for 1C will not be enough to drive decisions. We must instead analyze the meaning of the measurements manually and drive an action plan on the created Epics.
      • Wherever the priorities do not align, we will have to escalate or step in by either introducing new Initiatives or escalating existing Initiatives.

      Scenarios

      1. The impact of the ACM klusterlet, as well as of the API Server, CRI-O, and kubelet, needs to be analyzed, as can be seen here.
      2. We need to measure all available health probes in a SNO cluster and keep track of them. We should try to limit these wherever we can.
      3. We need to check why CatalogSources require CPU at all, since technically they should not consume any while idling.
      4. We need to revisit all composable OpenShift features and see whether Telco has integrated all of these efforts.
      5. We need to pick up, as a discussion with Product Management, the potential exclusion of OLM and installing them via ACM instead.
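
      For the health probe scenario, one possible way to take an inventory is to summarize the probes declared in the pod specs. The following is only a minimal sketch and not an agreed measurement methodology: it assumes the pod list has been exported as JSON (for example with `oc get pods -A -o json`), the function name `summarize_probes` is hypothetical, and the per-minute figure is a rough estimate derived from `periodSeconds` alone (it ignores timeouts, failures, and probes on init or ephemeral containers).

```python
from collections import Counter

PROBE_KINDS = ("livenessProbe", "readinessProbe", "startupProbe")
DEFAULT_PERIOD_SECONDS = 10  # Kubernetes default when periodSeconds is unset


def summarize_probes(pod_list):
    """Summarize probes declared in a Kubernetes PodList given as a dict.

    Returns two Counters:
      counts[(namespace, probe_kind)] -> number of configured probes
      per_minute[namespace]           -> estimated steady-state probe runs per minute
    """
    counts = Counter()
    per_minute = Counter()
    for pod in pod_list.get("items", []):
        namespace = pod["metadata"]["namespace"]
        for container in pod["spec"].get("containers", []):
            for kind in PROBE_KINDS:
                probe = container.get(kind)
                if not probe:
                    continue
                counts[(namespace, kind)] += 1
                # Startup probes stop once the container has started, so they
                # are counted but left out of the steady-state estimate.
                if kind != "startupProbe":
                    period = probe.get("periodSeconds", DEFAULT_PERIOD_SECONDS)
                    per_minute[namespace] += 60 / period
    return counts, per_minute
```

      Feeding the parsed JSON into `summarize_probes` and sorting `per_minute` by value gives a first ordering of namespaces by probe frequency, which could then be compared against the actual CPU measurements.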

      Acceptance Criteria

      • There must be a dedicated list of all actions planned by OpenShift Core that target resource consumption on SNO, and this list must be maintained going forward.
      • There must be a dedicated prioritization of all topics that are already being handled or that need escalation.

      Dependencies (internal and external)

      • The base dependency for this is a reproducible measurement methodology with which we can confirm the overall resource saving potentials before acting on follow-up prioritization.

      Open questions

      • How will the list of actions on resource reduction be tracked so it does not get lost within the OpenShift ecosystem again?
      • We need to dive further into how the metrics were collected. Ideally, for this confirmation, the engineer should set up a Telco Metal instance as used in the testing. We should receive this via https://redhat.enterprise.slack.com/archives/C013TS9TQ4F, which will serve as a testing ground until we have a better prepared environment.

      Done Checklist

      • DEV - Action List: <link to meaningful Jira Issues, prioritized and prepared correctly>
      • DEV - Finding Summary: <summary included in ticket, best in presentation format for next action items>

              Unassigned
              rh-ee-jmoller Jakob Moeller (Inactive)