  1. OpenShift GitOps
  2. GITOPS-2039

OpenShift GitOps is failing to report ArgoCD instance problems

    • Introduce Instance/Operator level metrics/monitoring for OpenShift GitOps
    • 5
    • 0% To Do, 0% In Progress, 100% Done
    • GITOPS Sprint 221

      When running Red Hat OpenShift GitOps, it is possible to create additional ArgoCD instances besides the default one in the openshift-gitops namespace by creating further ArgoCD resources.

      The problem is that the OpenShift GitOps Operator does not report any metrics about overall instance health and availability. As a result, malfunctioning instances are hard to detect and fix, which can impact production environments.

      For example, when a ResourceQuota prevents the redis pod from starting, nothing is reported apart from the status of the ArgoCD resource itself.

      $ oc get argocd openshift-gitops -o json | jq '.status'
      {
        "applicationController": "Running",
        "dex": "Running",
        "host": "openshift-gitops-server-project-100.apps.foo.bar.intra",
        "phase": "Available",
        "redis": "Pending",
        "repo": "Running",
        "server": "Running",
        "ssoConfig": "Success"
      }
      

      The output shows "redis": "Pending" while "phase" still reports "Available", and it gives no indication of whether the other ArgoCD functions operate as intended. Meanwhile, the OpenShift GitOps Operator keeps trying to reconcile the resource without success until the ResourceQuota is adjusted.
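
To make the check above concrete, here is a minimal Python sketch that picks out the components of a single `.status` object that are not reported as Running. The list of component keys is an assumption taken from the sample output in this issue, and the function name `unhealthy_components` is hypothetical; it is not part of the Operator.

```python
import json

# Keys in .status that describe a component's state rather than metadata.
# Assumption: this list is derived only from the sample output in this issue.
COMPONENT_KEYS = ("applicationController", "dex", "redis", "repo", "server")


def unhealthy_components(status):
    """Return {component: state} for every component not reported as Running."""
    return {
        key: status[key]
        for key in COMPONENT_KEYS
        if key in status and status[key] != "Running"
    }


# The .status object from the example above.
sample_status = json.loads("""
{
  "applicationController": "Running",
  "dex": "Running",
  "host": "openshift-gitops-server-project-100.apps.foo.bar.intra",
  "phase": "Available",
  "redis": "Pending",
  "repo": "Running",
  "server": "Running",
  "ssoConfig": "Success"
}
""")

print(unhealthy_components(sample_status))  # {'redis': 'Pending'}
```

Note that "phase" is deliberately not consulted, since in the example it reports "Available" even though redis is Pending.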

      Since an OpenShift Container Platform 4 cluster can grow large and host many ArgoCD instances, there needs to be a way to detect problematic states in each ArgoCD instance and report them in a central location, either as an Operator Condition or via metrics, so that an alert is triggered, Cluster Administrators are made aware of the problem, and it can be resolved.
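
As an illustration of what reporting via metrics could look like, the following Python sketch renders the per-component state of several instances as gauge samples in the Prometheus text exposition format. The metric name `argocd_instance_component_running`, its labels, and the function `render_metrics` are all hypothetical and chosen only for this sketch; they are not the metrics the Operator exposes.

```python
# Assumption: component keys as they appear in the ArgoCD .status sample in this issue.
COMPONENTS = ("applicationController", "dex", "redis", "repo", "server")

# Hypothetical metric name, used only for illustration.
METRIC = "argocd_instance_component_running"


def render_metrics(instances):
    """Render one gauge sample per component in Prometheus text format.

    `instances` maps (namespace, name) to the .status dict of an ArgoCD resource.
    A sample is 1 if the component reports Running and 0 otherwise.
    """
    lines = [
        f"# HELP {METRIC} 1 if the component reports Running, else 0.",
        f"# TYPE {METRIC} gauge",
    ]
    for (namespace, name), status in sorted(instances.items()):
        for component in COMPONENTS:
            if component not in status:
                continue  # component not present in this instance's status
            value = 1 if status[component] == "Running" else 0
            lines.append(
                f'{METRIC}{{namespace="{namespace}",name="{name}",'
                f'component="{component}"}} {value}'
            )
    return "\n".join(lines) + "\n"


example = {
    ("openshift-gitops", "openshift-gitops"): {
        "redis": "Pending",
        "server": "Running",
    },
}
print(render_metrics(example), end="")
```

With samples shaped like this, an alert could in principle fire when any series stays at 0 for longer than a chosen interval, which would surface a situation like the Pending redis pod without inspecting each instance by hand.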

      Acceptance Criteria

      • Research how to enable OpenShift Monitoring to watch/monitor instance status (as above).
      • Create a metric.
      • Get it working and document the steps in the User Guide.

              jrao@redhat.com Jaideep Rao
              rhn-support-sreber Simon Reber