  1. Service Delivery (SD) Strategy
  2. SDSTRAT-40

AMS / OSL / OCM-Resources Security & Resiliency


    • Initiative
    • Resolution: Unresolved
    • OCM
    • 50% To Do, 25% In Progress, 25% Done

      Goal

      What is our purpose in implementing this? What are we enabling by doing this work? Time-box goals to 4-6 months.

      Implement changes to components, deployments, and processes to ensure OCM is highly available and functions securely. This includes a major one-time resolution of gaps and debt, and the operationalization of resiliency so that it remains top-of-mind in the future.

      Benefit Hypothesis:

      What are the benefits (to Red Hat, eventually to customers, to the community, etc.)? Does it improve security, performance, supportability, etc.? Why is this work a priority?

      Overall, this effort should reduce the risk of reputational damage to Red Hat from service outages and incidents.

      After completion, we should:

      • know the risks the service is currently exposed to
      • rely on accurate measurements of and alerts on critical user journey success
      • have a reduced blast radius for breaking changes
      • drive development based on SLO performance

      Resources

      Add any resources (docs, slides, etc.) pertinent to the definition of the work. These might not be known until later. Update as necessary.

      Responsibilities

      Indicate which roles and/or teams will be responsible for contributing to the initiative and generally what they might be expected to do.

      Success Criteria

      Provide some examples of how we will know if we have achieved the goal. What can be measured to determine success? What observable actions or outcomes would indicate success? Criteria should be specific, measurable, achievable, and fit within the time-box.

      • Critical user journeys for AMS, OSL, and OCM-resources are well defined
      • SLOs against critical user journeys are agreed upon, visualized, alerted upon, and regularly reviewed
      • SOPs and error budget policies exist to ensure a breach in an SLO results in the right prioritization of efforts to remediate customer impact based on the severity of the impact
      • All components adhere to the Red Hat Secure Development Lifecycle
        • Including SAST enabled through Konflux build tool migration
      • Known database availability risks are mitigated
      • Known high-traffic resource contention issues are resolved
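
      The SLO and error budget criteria above can be made concrete with a small calculation. The following is a minimal sketch, not an agreed policy: the `SloWindow` type, the function names, the 99.9% target used in the usage note, and the 25% threshold are all hypothetical values chosen for illustration. It shows how the remaining error budget for one critical user journey could be derived from request counts and mapped to a prioritization tier.

```python
from dataclasses import dataclass


@dataclass
class SloWindow:
    """Request counts for one critical user journey over an SLO window (hypothetical type)."""
    total_requests: int
    failed_requests: int
    slo_target: float  # e.g. 0.999 means 99.9% of requests should succeed


def error_budget_remaining(window: SloWindow) -> float:
    """Return the fraction of the error budget still unspent (negative when overspent)."""
    allowed_failures = window.total_requests * (1.0 - window.slo_target)
    if allowed_failures == 0:
        # No traffic or a 100% target: any failure exhausts the budget.
        return 1.0 if window.failed_requests == 0 else float("-inf")
    return 1.0 - window.failed_requests / allowed_failures


def policy_action(remaining: float) -> str:
    """Map remaining budget to an illustrative prioritization tier (thresholds are assumptions)."""
    if remaining <= 0.0:
        return "freeze feature work; prioritize remediation"
    if remaining < 0.25:
        return "prioritize reliability work"
    return "normal development"
```

      For example, with 1,000,000 requests and a 99.9% target, roughly 1,000 failures are allowed in the window; 400 failures leave about 60% of the budget, which this sketch maps to "normal development". The actual targets, windows, and tiers would come from the agreed SLOs and error budget policies.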

      Results

      Add results here once the Initiative is started. Recommend discussions & updates once per quarter in bullets.

            ehimmelr.openshift Eric Himmelreich
            rhn-support-tiwillia Timothy Williams