OpenShift Bugs / OCPBUGS-54738

CGU second batch fails when one cluster powered off regression

    • Bug
    • Resolution: Done-Errata
    • Undefined
    • 4.16.z, 4.18.z, 4.19
    • TALM Operator
    • Quality / Stability / Reliability
    • False
    • Important
    • Yes
    • Proposed

      Description of problem:

          When a CGU is applied to two clusters with a max concurrency of 1, and the first cluster is expected to fail because it is powered off, the second cluster should still succeed. Instead, .status.clusters.<should succeed>.currentPolicy stays stuck at NonCompliant even though the policy itself is compliant. The TALM pod logs show a panic that repeats during reconcile and appears to be the source of the issue.

      The panic comes from the line of code that sets FirstCompliantAt. The powered-off cluster has no entry in CurrentBatchRemediationProgress, so the value looked up for it is a nil pointer, and setting its FirstCompliantAt field is a nil pointer dereference.

      Version-Release number of selected component (if applicable):

          Latest brew versions of TALM for 4.16.z, 4.18.z, and 4.19

      How reproducible:

          Always. It showed up in 4.16, 4.18, and 4.19 CI after the regression was introduced.

      Steps to Reproduce:

      Covered by automation.

      1. Create a CGU with a max concurrency of 1 and two clusters, where the first cluster is powered off
      2. Wait for the second cluster to complete (it never does, even though all policies are compliant)
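To make the setup in these steps concrete, here is a rough Go sketch of how two clusters with a max concurrency of 1 fall into two separate batches. The helper splitIntoBatches and the cluster names are hypothetical and are not taken from TALM; the sketch only illustrates the ordering assumed by the steps above, with the powered-off cluster alone in the first batch.

```go
package main

import "fmt"

// splitIntoBatches is a hypothetical helper, not TALM's implementation. It
// groups clusters, in order, into batches of at most maxConcurrency clusters.
func splitIntoBatches(clusters []string, maxConcurrency int) [][]string {
	if maxConcurrency < 1 {
		maxConcurrency = 1 // treat invalid values as one cluster per batch
	}
	var batches [][]string
	for start := 0; start < len(clusters); start += maxConcurrency {
		end := start + maxConcurrency
		if end > len(clusters) {
			end = len(clusters)
		}
		batches = append(batches, clusters[start:end])
	}
	return batches
}

func main() {
	// Placeholder cluster names; the first stands for the powered-off cluster.
	batches := splitIntoBatches([]string{"powered-off", "healthy"}, 1)
	fmt.Println(batches) // prints [[powered-off] [healthy]]
}
```

With this layout, the healthy cluster is only reached once the first batch has timed out, which is the point at which the report expects the second batch to proceed.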

      Actual results:

          Neither batch succeeds and there is a panic in the pod logs

      Expected results:

      The first batch fails due to timeout, but the second batch succeeds.

      Additional info:

      Google Drive with all available logs

              saskari@redhat.com Saeid Askari
              rh-ee-klaskosk Kirsten Laskoski
              Kirsten Laskoski