Uploaded image for project: 'OpenShift Hosted Control Plane'
  1. OpenShift Hosted Control Plane
  2. HOSTEDCP-592

AWS quotas break HC/NodePool workflows transparently

XMLWordPrintable

    • Icon: Story Story
    • Resolution: Done
    • Icon: Critical Critical
    • None
    • None
    • None
    • Hypershift Sprint 18, Hypershift Sprint 19
    • 0
    • 0
    • 0

      Context:

      https://coreos.slack.com/archives/C02LM9FABFW/p1665056420417879

      There are a few cases where AWS quotas might disrupt the HC/NodePool expected workflows:

      • Case 1: When creating a new cluster via cli. The helper to create the infra supporting the upcoming HC might fail.
      • Case 2: During the reconciliation workflow, resources that require aws infrastructure, e.g. Service type loadbalancer might fail.

      For the later we attempted and codified the logic to capture kube controller manager specific events (https://github.com/openshift/hypershift/pull/1135) so hitting quotas would be signalled in the HC.conditions.

      However based on the thread above, it seems we are missing these events and so hitting quotas breaks the HC/NodePool reconciliation workflow without communicating to the user in anyway

      DoD:

      Discuss and signal to consumers meaningfully when aws quotas breaks HC/NodePool reconciliation.

      Note: consider having a catch all condition that captures all "warning" events in the cp namespace.

      https://docs.google.com/document/d/1z77D8Gfj3DDjFtXNc9fv3xQF9JQkFKWDH-lLZqSLOVY/edit

       

            cewong@redhat.com Cesar Wong
            agarcial@redhat.com Alberto Garcia Lamela
            Jie Zhao Jie Zhao
            Votes:
            0 Vote for this issue
            Watchers:
            9 Start watching this issue

              Created:
              Updated:
              Resolved: