OpenShift Request For Enhancement / RFE-7439

Additional Prometheus metrics for repo-level push/pull errors, latency tracking, and system availability monitoring


    • Type: Feature Request
    • Resolution: Unresolved
    • Priority: Critical
    • Quay
    • Product / Portfolio Work

      1. Proposed title of this feature request

      New features for Quay

      2. What is the nature and description of the request?

      The following capabilities should be added to Quay:

      • Last tag pull time (API)
      • Pruning policy to remove tags that were last pulled more than n days ago
      • Expose the following metrics via Prometheus for our alerting and monitoring needs
      S.No | Metric | Details
      1 | Pull and push count per repository | These metrics are available today, but are needed per repository: quay_registry_image_pulls_total, quay_registry_image_pulled_estimated_bytes_total, quay_registry_image_pushes_total, quay_registry_image_pushed_bytes_total
      2 | push_errored_count_per_repo | Failed image pushes per repository and the total count of failed pushes, showing which repositories the pushes failed for.
      3 | pull_errored_count_per_repo | Failed image pulls per repository and the total count of failed pulls, showing which repositories the pulls failed for.
      4 | mirror_errored, mirror_done, mirror_running, mirror_waiting | Failed mirrors per repository and the total count of failed mirrors, showing which repositories the mirrors failed for.
      5 | Latency of image push | The time it takes to upload (push) a container image from a client.
      6 | DB not available | Need the Red Hat team to provide metrics that identify when the PostgreSQL (PG) database is unreachable.
      7 | S3 not available | Need the Red Hat team to provide metrics that identify when S3 storage is unreachable.
      8 | Splunk not available | Need the Red Hat team to provide metrics that identify when Splunk is unreachable.
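To illustrate what rows 2 and 3 are asking for, here is a minimal sketch of error counters that carry a per-repository label and are rendered in the Prometheus text exposition format. It uses only the Python standard library and is not Quay code. The class `RepoErrorCounters` and the label name `repository` are assumptions made for illustration, and the metric names are the ones proposed in this request, not existing Quay metrics.

```python
from collections import Counter

# Metric names as proposed in this request (hypothetical, not existing Quay metrics).
PUSH_ERRORS = "push_errored_count_per_repo"
PULL_ERRORS = "pull_errored_count_per_repo"


class RepoErrorCounters:
    """Counts failed pushes and pulls, keyed by (metric, repository)."""

    def __init__(self):
        self._counts = Counter()

    def record_failure(self, metric, repository):
        """Increment the failure counter for one repository."""
        self._counts[(metric, repository)] += 1

    def total(self, metric):
        """Sum one metric across all repositories."""
        return sum(n for (m, _), n in self._counts.items() if m == metric)

    def render(self):
        """Return all counters in Prometheus text exposition format.

        Assumes repository names contain no quotes or backslashes,
        so label values are written without escaping.
        """
        lines = []
        for metric in sorted({m for m, _ in self._counts}):
            lines.append(f"# TYPE {metric} counter")
            for (m, repo), n in sorted(self._counts.items()):
                if m == metric:
                    lines.append(f'{metric}{{repository="{repo}"}} {n}')
        return "\n".join(lines) + "\n"
```

With a label like this, the "total failed" figure would not need its own metric: a PromQL query such as `sum(push_errored_count_per_repo)` aggregates across repositories, while the unaggregated series shows which repositories failed.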

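The pruning policy requested in the list above depends on the last tag pull time. A rough sketch of the selection logic follows, assuming that a mapping of tag name to last pull timestamp is available. The function `tags_to_prune` and its input shape are hypothetical; the request does not define how Quay would expose or store this data.

```python
from datetime import datetime, timedelta, timezone


def tags_to_prune(last_pulled, max_idle_days, now=None):
    """Return the tags whose last pull is older than max_idle_days.

    last_pulled maps tag name -> timezone-aware datetime of the last pull,
    or None if the tag has never been pulled. Never-pulled tags are kept
    here; whether they should be pruned is a policy decision that the
    request does not specify.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(
        tag
        for tag, pulled_at in last_pulled.items()
        if pulled_at is not None and pulled_at < cutoff
    )
```

For example, with `max_idle_days=30`, a tag last pulled five months ago would be selected for removal, while a tag pulled last week would be kept.
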
      3. Why does the customer need this? (List the business requirements here)

      The customer needs these capabilities to maintain and monitor their enterprise Quay deployment.

      4. List any affected packages or components.

              rhn-coreos-tunwu Tony Wu
              mp.singh Mahendra Singh
              Votes: 0
              Watchers: 5