OpenShift Bugs / OCPBUGS-62945

[kueue] Surface cert-manager dependency error in Kueue CR status

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • 4.18, 4.19
    • Node / Kueue
    • Quality / Stability / Reliability

      Description of problem:

      When the Kueue operator is installed on a cluster where cert-manager is not present, the installation fails, but this failure is not clearly visible to the user. 
      
      The error is surfaced only in the operator's logs. The Kueue Custom Resource (CR) is not reconciled, so its status field remains empty. This makes troubleshooting difficult because the primary user-facing resource gives no indication of a problem.
      
      To improve visibility and user experience, the Kueue operator should update the status of the Kueue CR to reflect this critical dependency failure. 
      
      The Degraded condition could be used to signal that the operator is not functioning correctly. This would allow other operators and monitoring systems (such as RHOAI) to detect the issue programmatically and propagate it to their top-level CRs, making the root cause immediately apparent to the end user.

      Version-Release number of selected component (if applicable):

          

      How reproducible:

         Always

      Steps to Reproduce:

          1. Install the Kueue Operator without installing cert-manager
          2. Create a Kueue CR
          3. Observe that the Kueue CR is not reconciled (its status stays empty)
          4. Search the Kueue Operator log for errors
          

      Actual results:

      E1010 09:43:42.519220 1 base_controller.go:279] "Unhandled Error" err="KueueOperator reconciliation failed: please make sure that cert-manager is installed on your cluster"
      E1010 09:43:42.536274 1 target_config_reconciler.go:191] please make sure that cert-manager is installed
      

      Expected results:

      status:
        conditions:
          - lastTransitionTime: '2025-10-10T09:47:38Z'
            message: 'please make sure that cert-manager is installed on your cluster'
            reason: MissingDependency
            status: 'True'
            type: Degraded
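
If the status looked like the expected result above, a consumer such as another operator or a monitoring component could detect the problem without reading logs. The following is a hedged sketch using only the Go standard library: it assumes the Kueue CR has already been fetched and serialized as JSON, and `findDegraded`, `condition`, and `kueueObject` are hypothetical names for this example, not part of any existing API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// condition holds the subset of status condition fields used here.
type condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Reason  string `json:"reason"`
	Message string `json:"message"`
}

// kueueObject decodes only the status.conditions part of the CR.
type kueueObject struct {
	Status struct {
		Conditions []condition `json:"conditions"`
	} `json:"status"`
}

// findDegraded returns the Degraded condition if it is present with
// status "True", or nil if the CR does not report degradation.
func findDegraded(raw []byte) (*condition, error) {
	var obj kueueObject
	if err := json.Unmarshal(raw, &obj); err != nil {
		return nil, fmt.Errorf("decoding Kueue CR: %w", err)
	}
	for i := range obj.Status.Conditions {
		c := &obj.Status.Conditions[i]
		if c.Type == "Degraded" && c.Status == "True" {
			return c, nil
		}
	}
	return nil, nil
}

func main() {
	// JSON form of the expected status shown in this report.
	raw := []byte(`{"status":{"conditions":[{
		"lastTransitionTime":"2025-10-10T09:47:38Z",
		"message":"please make sure that cert-manager is installed on your cluster",
		"reason":"MissingDependency",
		"status":"True",
		"type":"Degraded"}]}}`)

	c, err := findDegraded(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if c != nil {
		fmt.Printf("Kueue is degraded (%s): %s\n", c.Reason, c.Message)
	}
}
```

With the current behavior, where the status stays empty, `findDegraded` returns nil, which is the gap this report describes: the consumer cannot distinguish a healthy CR from one that was never reconciled.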
      

      Additional info:

          

              Ryan Phillips (rphillip@redhat.com)
              Luca Burgazzoli (lburgazz@redhat.com)
              Cameron Meadors