ACM-6832 (Red Hat Advanced Cluster Management)

ACM 2.9: client-go updates required to support OCP 4.14

    • cross-squad - client-go updates may be required for OCP 4.14

      OCP/Telco Definition of Done
      <https://docs.google.com/document/d/1TP2Av7zHXz4_fmeX4q9HB0m9cqSZ4F6Jd4AiVoaF_2s/edit#heading=h.gaa58bzbvwde>
      Epic Template descriptions and documentation.
      <https://docs.google.com/document/d/14CUCEg6hQ_jpsFzJtWo29GfFVWmun2Uivrxq3_Fkgdg/edit>
      ACM-wide Product Requirements (Top-level Epics)
      <https://docs.google.com/document/d/1uIp6nS2QZ766UFuZBaC9USs8dW_I5wVdtYF9sUObYKg/edit>

      Epic Goal

      ...

       

      OCP 4.14 installations may have compatibility issues with versions of client-go older than 0.26.4.

       

      Upstream Kubernetes pull request: https://github.com/kubernetes/kubernetes/pull/116603

       

      This issue was found by the ODF team. Linking to their bug (see comment 8): https://bugzilla.redhat.com/show_bug.cgi?id=2228319

       

      • This can happen in environments running Kubernetes 1.27+ where aggregated discovery is enabled, so it has only been seen on OCP 4.14 so far.
      • It may not happen in every environment: it may require metrics to be enabled, and may require something to return a malformed discovery response (the Kubernetes issue mentions metrics as one source of this).
      • The result is that client-go may crash, so controllers that have hit this (VolSync, ODF, and Submariner so far) crash at startup and go into CrashLoopBackOff.
      • One other note: this may be limited to controllers that use client-go < 0.26.4 and also serve metrics. A controller that doesn't serve metrics may not be affected, but I don't know for sure at this point.
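
To help squads audit their repositories, here is a minimal sketch that reads the text of a `go.mod` file and flags a client-go requirement older than 0.26.4. The threshold comes from this ticket. The function names are hypothetical, and the parsing is deliberately simple: it looks only at the `require` entry, ignores `replace` directives, and would misjudge legacy `vN.0.0+incompatible` client-go versions, so treat a result as a hint and not a verdict.

```python
import re

# Threshold taken from this ticket: client-go older than 0.26.4 may be affected.
MIN_SAFE = (0, 26, 4)

# Matches a go.mod requirement line such as "k8s.io/client-go v0.26.3",
# either inside a require ( ... ) block or as a single-line "require ..." entry.
# Assumption: replace directives are not considered.
_CLIENT_GO = re.compile(
    r"^\s*(?:require\s+)?k8s\.io/client-go\s+v(\d+)\.(\d+)\.(\d+)",
    re.MULTILINE,
)


def client_go_version(go_mod_text):
    """Return the required client-go version as a tuple, or None if absent."""
    match = _CLIENT_GO.search(go_mod_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def may_be_affected(go_mod_text):
    """True if go.mod requires a client-go version below the ticket's threshold."""
    version = client_go_version(go_mod_text)
    return version is not None and version < MIN_SAFE
```

For example, `may_be_affected(open("go.mod").read())` could be run in each controller repository. A `True` result only means the version is in the range this ticket describes, not that the controller will crash.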

       

      Since this appears to be OCP 4.14 only, I've created this against ACM 2.8 and the upcoming 2.9.

       

      ACM controllers that ODF has found with this issue so far:
      klusterlet-addon-workmgr-84788454cd-dd2mn 0/1 CrashLoopBackOff 11 (109s ago) 3d18h
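
To look for other affected controllers, a small sketch like the following could pick crash-looping pods out of pod listing text. It assumes the input has the column layout of the line above (NAME, READY, STATUS, RESTARTS, AGE), which looks like `oc get pods` or `kubectl get pods` output. The function name is hypothetical.

```python
def crashlooping_pods(get_pods_output):
    """Return the names of pods whose STATUS column is CrashLoopBackOff."""
    names = []
    for line in get_pods_output.splitlines():
        fields = line.split()
        # Assumed columns: NAME READY STATUS RESTARTS AGE.
        # Only the first three are used, so spaces inside RESTARTS do not matter.
        if len(fields) >= 3 and fields[2] == "CrashLoopBackOff":
            names.append(fields[0])
    return names
```

A pod in CrashLoopBackOff is only a candidate. The pod logs would still need to be checked to confirm that the crash comes from client-go.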

      Why is this important?

      • This will definitely be needed for future OCP 4.14 support
      • This is a blocker for the ODF team as they're testing their upcoming ODF 4.14 release against OCP 4.14 and have a dependency on ACM.

      Scenarios

      ...

      Acceptance Criteria

      ...

      Dependencies (internal and external)

      1. ...

      Previous Work (Optional):

      1. ...

      Open questions:

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub
        Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub
        Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>

            njean@redhat.com Nelson Jean
            tflower@redhat.com Tesshu Flower
            Hui Chen Hui Chen