  1. Red Hat Workload Availability
  2. RHWA-177

Releasing Independent must-gather Image for RHWA


    • Type: Task
    • Resolution: Unresolved
    • Priority: Normal
    • must-gather

      Following previous attempts and team Dragonfly's use of must-gather images, we should aim for one independent must-gather image. See the Slack discussion for more: https://redhat-internal.slack.com/archives/C03M5GKJNBA/p1752579025224009

      Dragonfly must-gather Images

      We have released 4 different must-gather images, and all of them have remained available (pullable) since their release.
      Each image was released as part of another operator's release; thus, its name and version follow that operator (e.g., the NMO must-gather has "node-maintenance" in the container name and v4.10.0 as its version: workload-availability/node-maintenance-must-gather-rhel8:v4.10.0). Even though the containers shipped in the same errata advisory as the operator, they were never installed along with the operator; using our custom must-gather image is the user's responsibility and requires the user's action.
      They are documented in the general OCP docs, under Gathering data about your cluster, as supported must-gather images:

      1. workload-availability/node-maintenance-must-gather-rhel8 (v4.10.0-v5.0.1): The latest OCP version that documents it is 4.12 (supported only for EUS term 2)*
      2. workload-availability/self-node-remediation-must-gather-rhel8 (v0.4.0-v0.5.1): The latest OCP version that documents it is 4.12 (supported only for EUS term 2)*
      3. workload-availability/node-healthcheck-must-gather-rhel8 (v0.5.0-v0.8.2): Documented on OCP 4.13+ with the note "Use this image if your NHC Operator version is earlier than 0.9.0."**, so in practice it is documented as supported for OCP 4.13 only. Since OCP 4.13 is EOL, workload-availability/node-healthcheck-must-gather-rhel8 is also EOL from the documented-support POV.
      4. workload-availability/node-healthcheck-must-gather-rhel9 (v0.9.0): Documented on OCP 4.13+ with the note "Use this image if your NHC Operator version is 0.9.0. or later."**, so it is documented as supported for OCP 4.14+.

       * See OCP support https://access.redhat.com/support/policy/updates/openshift 

       ** NHC v0.9.0 was released to OCP 4.14+
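
      The documented selection rule for the two NHC images (items 3 and 4 above) can be sketched as a small helper. This is only an illustration, not part of any released tooling: the function names are hypothetical, and the registry host registry.redhat.io is an assumption, since the ticket lists only the repository paths. The only CLI feature relied on is the --image flag of oc adm must-gather.

```python
import shlex

# Assumption: registry host. The ticket gives only the repository paths.
REGISTRY = "registry.redhat.io"


def parse_version(text):
    """Turn 'v0.9.0' or '0.8.2' into a tuple of ints that compares correctly."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def nhc_must_gather_image(nhc_version):
    """Pick the NHC must-gather repository following the documented note:
    rhel8 for NHC earlier than 0.9.0, rhel9 for 0.9.0 or later."""
    if parse_version(nhc_version) < (0, 9, 0):
        repo = "workload-availability/node-healthcheck-must-gather-rhel8"
    else:
        repo = "workload-availability/node-healthcheck-must-gather-rhel9"
    return f"{REGISTRY}/{repo}"


def must_gather_command(image, tag):
    """Build the argument list for running must-gather with a custom image."""
    return ["oc", "adm", "must-gather", f"--image={image}:{tag}"]


if __name__ == "__main__":
    # Hypothetical usage: a cluster running NHC 0.9.0 with image tag v0.9.0.
    command = must_gather_command(nhc_must_gather_image("0.9.0"), "v0.9.0")
    print(shlex.join(command))
```

      Comparing versions as integer tuples rather than strings matters here: a plain string comparison would sort "0.10.0" before "0.9.0" and select the wrong image.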

      Combine the must-gather Images into one NHC must-gather

      In ECOPROJECT-1063 we have been working on reducing our maintenance effort by keeping one image for all the operators under RHWA. Because we use CPaaS to build and release containers, we decided to release our upstream must-gather, https://github.com/medik8s/must-gather, together with NHC to avoid the large overhead of releasing it independently.

      Transition timeline: the last must-gather images of SNR and NMO were released on 23 Oct 2023, the first NHC (RHEL8) must-gather was released on 02 May 2023, and the first NHC (RHEL9) must-gather was released on 21 Jan 2025.

      Release the Independent must-gather Image for RHWA

      As of now, all 4 must-gather images above are pullable and available in the catalog, with nothing that tells the customer they are out of support.
      To state that they are not supported, we can deprecate them, publish a customer-visible support page that marks them as unsupported (such as https://access.redhat.com/support/policy/updates/openshift_operators#rolling-stream), or release a new patch release with a warning that the image is no longer supported.

      We are migrating to Konflux, and with Konflux the overhead of setting up the build and release of an independent must-gather image, rhwa-must-gather:v0.1.0, seems much lower. To keep that overhead low, it would be released under new application and component names that are not associated with any operator.
      I believe this would ease maintenance, since we would no longer need to update the must-gather image every time we release one of our operators (such as NHC).
      At the moment, a change in must-gather requires a release of NHC (along with its other 3 containers).

      Open questions

      • Do we want to release it as part of RHWA-152 or only after the migration?
      • Versioning: will it be v0.1.0 and diverge from https://github.com/medik8s/must-gather/tags, or should upstream and downstream be aligned on a new version/tag?
      • How do we handle the supportability/deprecation of the "old" must-gather images?

              rh-ee-slevi Shai Shimon Levi (Inactive)
              oraz@redhat.com Or Raz