  1. OpenShift Cloud
  2. OCPCLOUD-1661

Investigate reporting on expected versus observed replicas for MachineSets


    • Type: Spike
    • Resolution: Won't Do
    • Priority: Undefined

      During the investigation of Bug 2104511, the discussion turned to possible monitoring or alerting on expected replicas versus observed replicas for a MachineSet.

      This investigation should examine the possibility of exporting metrics based on the replica count that a MachineSet currently has and the number of Machines that actually exist. Using these metrics, we can start to build profiles of the average times and behaviors of scaling operations.
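As a rough illustration of what such metrics could look like, the sketch below renders an expected and an observed gauge per MachineSet in the Prometheus text exposition format. The metric names `machineset_expected_replicas` and `machineset_observed_machines`, the `MachineSetSnapshot` type, and the `render_replica_metrics` helper are all hypothetical and chosen only for this example; nothing here reflects an existing exporter.

```python
from dataclasses import dataclass


@dataclass
class MachineSetSnapshot:
    """Hypothetical point-in-time view of one MachineSet."""
    name: str                # MachineSet name
    expected_replicas: int   # desired replica count
    observed_machines: int   # Machine objects that actually exist


def _escape(value: str) -> str:
    # Escape backslash, double quote, and newline in a label value.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_replica_metrics(snapshots) -> str:
    """Render both gauges, grouping all samples of a metric under its TYPE line."""
    lines = []
    for metric, attr in (
        ("machineset_expected_replicas", "expected_replicas"),   # assumed name
        ("machineset_observed_machines", "observed_machines"),   # assumed name
    ):
        lines.append(f"# TYPE {metric} gauge")
        for snap in snapshots:
            lines.append(f'{metric}{{name="{_escape(snap.name)}"}} {getattr(snap, attr)}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_replica_metrics([MachineSetSnapshot("worker-a", 3, 2)]), end="")
```

Exporting the two counts as separate gauges, rather than a single precomputed difference, would leave it to queries to derive the gap and to measure how long a scaling operation takes to converge.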

      Another perspective is to create alerting for situations where the observed replicas take a long time to reach the expected count. Although we have error conditions for Machines with no running phase and Machines with no Nodes, this alert could also detect conditions where a Machine object is never created.
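The alerting idea can be sketched as a small tracker that remembers when a mismatch between expected and observed counts first appeared and reports it only after it has persisted for a threshold, similar in spirit to the `for` clause of a Prometheus alerting rule. The `ReplicaGapTracker` class and the ten-minute threshold in the usage lines are assumptions made for illustration, not a proposed design.

```python
class ReplicaGapTracker:
    """Hypothetical helper: flags a MachineSet whose replica gap persists too long."""

    def __init__(self, threshold_seconds: float):
        self.threshold_seconds = threshold_seconds
        # name -> (expected count when the gap started, time the gap started)
        self._gaps = {}

    def observe(self, name: str, expected: int, observed: int, now: float) -> bool:
        """Record one sample; return True if the gap has lasted at least the threshold."""
        if observed == expected:
            # Converged: forget any gap recorded for this MachineSet.
            self._gaps.pop(name, None)
            return False
        previous = self._gaps.get(name)
        if previous is None or previous[0] != expected:
            # New gap, or the target changed: restart the clock.
            self._gaps[name] = (expected, now)
            return False
        return now - previous[1] >= self.threshold_seconds


if __name__ == "__main__":
    tracker = ReplicaGapTracker(threshold_seconds=600)  # assumed 10-minute threshold
    print(tracker.observe("worker-a", expected=3, observed=2, now=0))
    print(tracker.observe("worker-a", expected=3, observed=2, now=600))
```

Because the check only compares counts, it would also fire when a Machine object is never created, which the existing per-Machine conditions cannot see.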

      For background on this issue, see this thread: https://coreos.slack.com/archives/CBZHF4DHC/p1660837393467059

              Assignee: Unassigned
              Reporter: Michael McCune (mimccune@redhat.com)
              Votes: 0
              Watchers: 3
