  1. OpenShift GitOps
  2. GITOPS-2457

Allow User to enable instance level monitoring/alerts on their workloads

    • Story
    • Resolution: Done
    • Major
    • 1.8.0
    • None
    • Operator
    • None
    • GITOPS Sprint 229, GITOPS Sprint 230

      Story (Required)

      As a cluster admin running multiple Argo CD instances with OpenShift GitOps on my cluster, I would like to enable monitoring/alerts on some (or all) of my instance workloads so that I am alerted if they become unavailable for long periods of time.

      Background (Required)

      Users need to be able to express which workloads they need to be alerted about. This lets the operator know which workloads to create Prometheus rules for once the metrics for all workloads have been made available at a new endpoint.

      Out of scope

      Writing code to create new metrics within the operator

      Approach (Required)

      For this story we must:

      • Create a ServiceMonitor so that Prometheus can watch the operator service for the newly exposed /metrics port.
      • Add a new field to the CR to capture whether monitoring is needed for each individual instance (e.g. `.spec.monitoring.enabled=true`) for Argo CD workloads.
      • For non-core workloads (such as SSO and notifications), the corresponding feature must be enabled in order for the rule to be created.
      • If monitoring is disabled, clean up all resources created for it, including the PrometheusRules and the ServiceMonitor.
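
      The approach above can be read as a small decision procedure. The following is a minimal Python sketch of that logic, not the operator's actual implementation. Only `.spec.monitoring.enabled` comes from this story; the workload names, the `sso` and `notifications` field layout, and the function names are assumptions made for illustration.

```python
# Core Argo CD workloads. These names are hypothetical placeholders.
CORE_WORKLOADS = ["application-controller", "server", "repo-server", "redis"]

# Non-core workloads, each mapped to a predicate that decides whether the
# feature is enabled in the CR spec. The field layout is an assumption.
OPTIONAL_WORKLOADS = {
    "sso": lambda spec: "sso" in spec,
    "notifications": lambda spec: (spec.get("notifications") or {}).get("enabled", False),
}


def workloads_to_alert_on(argocd_cr):
    """Return the workloads of one instance that should get alert rules."""
    spec = argocd_cr.get("spec", {})
    if not (spec.get("monitoring") or {}).get("enabled", False):
        return []  # monitoring is off: no rules for this instance
    selected = list(CORE_WORKLOADS)
    for name, is_enabled in OPTIONAL_WORKLOADS.items():
        if is_enabled(spec):
            selected.append(name)
    return selected


def plan_reconcile(argocd_cr, existing_rules):
    """Return (to_create, to_delete) given the rules that already exist."""
    wanted = set(workloads_to_alert_on(argocd_cr))
    existing = set(existing_rules)
    return sorted(wanted - existing), sorted(existing - wanted)
```

      When monitoring is disabled, `plan_reconcile` returns every existing rule in `to_delete`, which corresponds to the cleanup step in the last bullet.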

      Dependencies

      No dependencies

      Acceptance Criteria (Mandatory)

      The operator must successfully create the ServiceMonitor and the required PrometheusRules. The Argo CD CRD must be updated with the new monitoring fields.
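
      To make the two resource kinds concrete, here is a hedged Python sketch that builds a ServiceMonitor and a PrometheusRule as plain dictionaries. The `monitoring.coreos.com/v1` kinds and their `selector`, `endpoints`, and `groups`/`rules` fields are those of the Prometheus Operator API. The resource names, labels, alert name, severity, default duration, and the metric name passed in are assumptions, since the story does not specify them.

```python
import json


def service_monitor(namespace, service_labels, port_name="metrics"):
    """Build a ServiceMonitor that scrapes /metrics on the selected service."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        # The resource name is a hypothetical placeholder.
        "metadata": {"name": "gitops-operator-metrics", "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": dict(service_labels)},
            "endpoints": [{"port": port_name, "path": "/metrics"}],
        },
    }


def prometheus_rule(instance, namespace, workload, metric, for_duration="5m"):
    """Build a PrometheusRule that fires when `metric` stays at 0."""
    # `metric` and its label names are assumptions; the real metric is
    # defined by the operator's metrics endpoint, not by this sketch.
    expr = f'{metric}{{namespace="{namespace}",workload="{workload}"}} == 0'
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": f"{instance}-{workload}-availability", "namespace": namespace},
        "spec": {
            "groups": [{
                "name": f"{instance}-{workload}",
                "rules": [{
                    "alert": "WorkloadUnavailable",  # hypothetical alert name
                    "expr": expr,
                    "for": for_duration,  # alert only after a sustained outage
                    "labels": {"severity": "warning"},
                    "annotations": {"summary": f"{workload} of {instance} is unavailable"},
                }],
            }],
        },
    }


if __name__ == "__main__":
    rule = prometheus_rule("argocd", "openshift-gitops", "server", "hypothetical_workload_up")
    print(json.dumps(rule, indent=2))
```

      The `for` field is what expresses "unavailable for long periods of time": the expression must hold for the whole duration before the alert fires.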

      INVEST Checklist

      Dependencies identified

      Blockers noted and expected delivery timelines set

      Design is implementable

      Acceptance criteria agreed upon

      Story estimated

      Done Checklist

      • Code is completed, reviewed, documented and checked in
      • Unit and integration test automation has been delivered and is running cleanly in the continuous integration/staging/canary environment
      • Continuous Delivery pipeline(s) are able to proceed with the new code included
      • Customer facing documentation, API docs etc. are produced/updated, reviewed and published
      • Acceptance criteria are met

            jrao@redhat.com Jaideep Rao