  1. OpenShift Specialist Platform Team
  2. SPLAT-2394

[AWS NLB SG]: CCM-upstream CI bug: increase resources of e2e job to remediate frequent OOMKilled events

    • Quality / Stability / Reliability
    • True
    • 2025.08.07: A documentation PR has been updated to add references to the lessons learned from the investigation that led to the resource bump. Awaiting reviewer: https://github.com/kubernetes/cloud-provider-aws/pull/1221
      2025.08.06: Awaiting feedback on the e2e job limit change: https://github.com/kubernetes/test-infra/pull/35274
    • False
    • 5
    • 3
    • None
    • None
    • OpenShift SPLAT - Sprint 275

      User Story:
      As an OpenShift Engineer, I want to report, and propose an investigation into, CI jobs randomly getting OOMKilled by Prow, which impacts feature readiness, so that we can increase velocity and confidence in features proposed to upstream cloud-provider-aws.

       

      Description:
      < Record any background information >

       

      Acceptance Criteria:

      • Open an upstream issue reporting the problem
      • Open a PR proposing an increase to the job resource limit
      • Check for room to optimize the step (maybe use a pre-built kops binary instead of downloading it every time?)
      • Open an upstream PR updating the CI section of the development document to reference the Grafana dashboard

      Other Information:
      < Record anything else that may be helpful to someone else picking up the card >

      issue created by splat-bot

        1. Screenshot From 2025-08-06 21-03-32.png
          176 kB
          Marco Braga
        2. Screenshot From 2025-08-06 21-05-59.png
          177 kB
          Marco Braga
        3. Screenshot From 2025-08-06 21-06-13.png
          184 kB
          Marco Braga

              rhn-support-mrbraga Marco Braga
              rhn-support-mrbraga Marco Braga