OpenShift Monitoring / MON-2125

Pull configuration changes from remote endpoint PoC

    • Type: Epic
    • Resolution: Unresolved
    • Priority: Normal
    • CMO
    • remote write config
    • NEW
    • To Do
    • MON-3155 Insights through metric telemetry
    • NEW
    • 0% To Do, 100% In Progress, 0% Done

      Motivation

      We have been discussing a feature to remotely control the telemetry a cluster sends for two years now. Context is in the Links section below.

      We finally want to create a demo proof of concept in order to get organization-level support.

      Epic Goal

      Write code in CMO that regularly queries a remote endpoint (this will be set up in telemeter; the URL can be hard-coded), retrieves a list of remote write configs, and appends them to the local Prometheus setup.

      Communication will be secured by self-signed, hard-coded TLS certificates (possibly mTLS?).

      The main goal is to demonstrate this to the org; this does not have to be production-level code.
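
      To make the goal above concrete, here is a minimal Go sketch of the polling loop. It is not CMO code. The type RemoteWriteSpec, the helpers newPinnedClient, fetchRemoteWrite and mergeRemoteWrite, the JSON response format, the example URLs, and the de-duplication by URL are all assumptions made for illustration. A local httptest TLS server stands in for the telemeter endpoint, and its self-signed certificate plays the role of the hard-coded certificate.

```go
package main

import (
	"context"
	"crypto/tls"
	"crypto/x509"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// RemoteWriteSpec is a hypothetical, trimmed-down stand-in for a Prometheus
// remote write entry; a real config carries many more fields.
type RemoteWriteSpec struct {
	URL string `json:"url"`
}

// newPinnedClient returns an HTTP client that trusts only the certificates in
// pool, which is how a hard-coded self-signed certificate could be pinned.
func newPinnedClient(pool *x509.CertPool) *http.Client {
	return &http.Client{
		Timeout: 10 * time.Second,
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{RootCAs: pool, MinVersion: tls.VersionTLS12},
		},
	}
}

// fetchRemoteWrite GETs the endpoint and decodes a JSON array of specs.
// The JSON array format is an assumption of this sketch.
func fetchRemoteWrite(ctx context.Context, c *http.Client, endpoint string) ([]RemoteWriteSpec, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, endpoint, nil)
	if err != nil {
		return nil, err
	}
	resp, err := c.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	var specs []RemoteWriteSpec
	if err := json.NewDecoder(resp.Body).Decode(&specs); err != nil {
		return nil, err
	}
	return specs, nil
}

// mergeRemoteWrite appends remote entries to a copy of the local ones and
// skips URLs that are already present, so repeated polls do not pile up
// duplicates. It never modifies the local slice.
func mergeRemoteWrite(local, remote []RemoteWriteSpec) []RemoteWriteSpec {
	seen := make(map[string]bool, len(local))
	merged := append([]RemoteWriteSpec(nil), local...)
	for _, rw := range local {
		seen[rw.URL] = true
	}
	for _, rw := range remote {
		if !seen[rw.URL] {
			seen[rw.URL] = true
			merged = append(merged, rw)
		}
	}
	return merged
}

func main() {
	// Stand-in for the telemeter endpoint: a local TLS server with a test certificate.
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode([]RemoteWriteSpec{{URL: "https://telemeter.example/write"}})
	}))
	defer srv.Close()

	pool := x509.NewCertPool()
	pool.AddCert(srv.Certificate())
	client := newPinnedClient(pool)

	local := []RemoteWriteSpec{{URL: "https://user.example/write"}}
	ticker := time.NewTicker(200 * time.Millisecond) // short interval for the demo only
	defer ticker.Stop()
	for i := 0; i < 2; i++ {
		<-ticker.C
		remote, err := fetchRemoteWrite(context.Background(), client, srv.URL)
		if err != nil {
			fmt.Println("poll failed:", err)
			continue
		}
		fmt.Println(mergeRemoteWrite(local, remote))
	}
}
```

      If mTLS is chosen, the client would additionally present its own certificate via the Certificates field of tls.Config. Merging into a copy on each poll, rather than appending in place, keeps the locally configured entries untouched when the remote list changes or the endpoint is unreachable.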

      Why is this important?

      • Some configuration options should not be exposed to users, e.g. the telemetry metrics and the channel via which those metrics are sent. These are currently hard-wired.
      • Red Hat control over some of these settings is desirable, however (see also Scenarios):
        • A more dynamic control over some settings would improve our ability to provide services and debug issues.
        • We could reduce development time for some features, like https://issues.redhat.com/browse/MON-2119
        • We could get early feedback/telemetry for new features or larger changes for faster proofs of concept.

      Scenarios

      1. We want to change the channel through which telemeter gets data from clusters (see https://issues.redhat.com/browse/MON-2119 for more details). Currently we can only change this channel for all clusters that update to a new release. Any estimation error made when assessing the additional load on our endpoints, or similar, can only be fixed through a quick follow-up release. Old releases will not be able to change, which requires maintaining both channels.
      2. We realize that having a new metric in telemeter would be helpful. This can only be rolled out for a new release.
      3. Same as 2, but with a useless metric. Clusters will keep sending this metric until they upgrade to a new version that drops it.

      Links

      1. https://docs.google.com/document/d/1GPKls4njRPRYfpEusfwHaOHKbKp4J1FPHpQAjdUItWw/edit
      2. https://docs.google.com/document/d/1XGk2rLsuvwDplI9hIqRwAMuj9AiCjkIlTmFaEp-ddcA/edit#heading=h.vrpxxfmir85n 

              Unassigned
              Jan Fajerski (jfajersk@redhat.com)
              Junqi Zhao