  1. OpenShift GitOps
  2. GITOPS-6500

Customers updating to 1.15.1 hit an "error setting app health" error


    • Before this update, upgrading the OpenShift GitOps operator to v1.15.1 raised a health check error that prevented syncing ACM policies. This update fixes the issue by adding a missing nil check on status.placement for Policy resources.
    • Important

      Description of Problem

      After upgrading the OpenShift GitOps operator from 1.15.0 to 1.15.1, a health check error is raised that prevents syncing ACM policies.
      The customer sees the following error:

      error setting app health: failed to get resource health for "Policy" with name "policy-add-ocp-allowed-prod-registries" in namespace "openshift-gitops": <string>:10: __len undefined stack traceback: <string>:10: in main chunk [G]: ?
      

      Additional Info

      A PR was created that will be merged into GitOps 1.16: https://github.com/argoproj/argo-cd/pull/22057

      Problem Reproduction

      Upgrade from 1.15.0 to 1.15.1 in an environment that uses ACM policies.

      Reproducibility

      not reproduced

      Prerequisites/Environment

      1.15.0

      Steps to Reproduce

      • Install 1.15.0.
      • Deploy policies similar to the customer's usage.
      • Upgrade to 1.15.1.

      Expected Results

      no error

      Actual Results

       error setting app health: failed to get resource health for "Policy" with name "policy-add-ocp-allowed-prod-registries" in namespace "openshift-gitops": <string>:10: __len undefined stack traceback: <string>:10: in main chunk [G]: ?
      

      Problem Analysis

      see thread mentioned in private post

      Root Cause

      A length check is performed on status.placement without first verifying that it is not nil.
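
      As an illustration of the root cause, the sketch below shows what a nil-guarded length check could look like in a custom health check. It is a hypothetical example, not the script from the upstream PR: the health statuses and messages chosen here are assumptions, and it does not evaluate policy compliance. It only demonstrates testing obj.status and obj.status.placement for nil before applying the # (length) operator, which is what raises "__len undefined" on a nil value.

```yaml
spec:
  resourceHealthChecks:
    - group: policy.open-cluster-management.io
      kind: Policy
      check: |
        -- Hypothetical sketch: guard against a missing status.placement
        -- before taking its length. Not the upstream health check.
        hs = {}
        if obj.status == nil or obj.status.placement == nil then
          -- Nothing reported yet; avoid calling # on nil.
          hs.status = "Progressing"
          hs.message = "Waiting for status.placement to be reported"
          return hs
        end
        -- Safe: placement is known to be non-nil at this point.
        local count = #obj.status.placement
        hs.status = "Healthy"
        hs.message = "Policy reports " .. count .. " placement(s)"
        return hs
```

      Because the or operator short-circuits, obj.status.placement is only read once obj.status is known to be non-nil, so neither lookup can fail.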

      Workaround (If Possible)

      This can be worked around by adding the following health check override to the Argo CD instance:

      spec:
        resourceHealthChecks:
          - group: policy.open-cluster-management.io
            kind: Policy
            check: |
              hs = {}
              hs.status = "Healthy"
              hs.message = "Health check overridden"
              return hs 
      

      Fix Approaches

      The PR is going to make it to 1.16

      Acceptance Criteria

      • ...

      Definition of Done

      • Code Complete:
        • All code has been written, reviewed, and approved.
      • Tested:
        • Unit tests have been written and passed.
        • Ensure code coverage is not reduced with the changes.
        • Integration tests have been automated.
        • System tests have been conducted, and all critical bugs have been fixed.
        • Tested and merged on OpenShift either upstream or downstream on a local build.
      • Documentation:
        • User documentation or release notes have been written (if applicable).
      • Build:
        • Code has been successfully built and integrated into the main repository / project.
        • Midstream changes (if applicable) are done, reviewed, approved and merged.
      • Review:
        • Code has been peer-reviewed and meets coding standards.
        • All acceptance criteria defined in the user story have been met.
        • Tested by reviewer on OpenShift.
      • Deployment:
        • The feature has been deployed on OpenShift cluster for testing.

              Assignee: Unassigned
              Reporter: Felix Dewaleyne (rhn-support-fdewaley)