  1. OpenShift Over the Air
  2. OTA-1175

Address the inconsistency between unavailable/degraded summary and smoothing-over blips


    • Type: Story
    • Resolution: Unresolved
    • Priority: Normal
    • BU Product Work
    • OCPSTRAT-1823 - [GA] 'oc adm upgrade status' command and status API

      OTA-1087 implemented a basic health insights section for oc adm upgrade status and used degraded/unavailable ClusterOperators (COs) as a proof-of-concept source of upgrade health insights. We decided to make the health section less sensitive to temporary CO conditions and to show an insight only if the condition persists longer than a certain duration threshold. This leads to an inconsistency like the following:

      = Control Plane =
      ...
      Operator Status: 	33 Total, 32 Available, 1 Progressing, 4 Degraded
      
      = Update Health =
      <empty because COs have been degraded/unavailable only briefly>
      

      Without knowledge of the underlying decision, a user could wonder why they sometimes see an insight when there is a degraded/unavailable CO and sometimes they do not.
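
The mismatch can be reproduced in a few lines. The following is a minimal Python sketch, not the actual oc implementation: the `OperatorCondition` type, the five-minute `THRESHOLD`, and both function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical threshold; the real value used by the command is not stated here.
THRESHOLD = timedelta(minutes=5)


@dataclass
class OperatorCondition:
    name: str
    degraded: bool
    since: datetime  # when the current condition state started


def summary_degraded_count(operators):
    """Summary line: counts every operator that is degraded right now."""
    return sum(1 for op in operators if op.degraded)


def health_insights(operators, now):
    """Health section: reports only conditions older than THRESHOLD."""
    return [
        f"{op.name} degraded for longer than {THRESHOLD}"
        for op in operators
        if op.degraded and now - op.since > THRESHOLD
    ]


now = datetime(2024, 1, 1, 12, 0)
operators = [
    OperatorCondition("etcd", True, now - timedelta(seconds=30)),  # brief blip
    OperatorCondition("dns", False, now - timedelta(hours=1)),     # healthy
]
print(summary_degraded_count(operators))  # the blip is counted in the summary
print(health_insights(operators, now))    # but it yields no insight
```

The two functions apply different criteria to the same data, which is why the summary can report a degraded operator while the health section stays empty.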

      Some ideas:
      1. Insight messages could mention the threshold ("...unavailable longer than $LIMIT")
      2. Show an Impact=None Level=Info insight with a message saying something like "...is degraded briefly, waiting for it to resolve, no action needed"
      3. Stop smoothing over blips and start reporting immediately
      4. Smooth over blips in the summary (do not count a CO as degraded/unavailable unless it meets the criteria for emitting an insight)
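
As a rough illustration of how ideas 1, 2, and 4 could fit together, here is a hedged Python sketch. It is not taken from the oc code base: `THRESHOLD`, `persists`, `insight_for`, `smoothed_degraded_count`, and the dictionary shape of an insight are all assumptions made for this example.

```python
from datetime import datetime, timedelta

THRESHOLD = timedelta(minutes=5)  # hypothetical; the real limit is not given here


def persists(degraded_since, now):
    """True when the operator has been degraded for longer than THRESHOLD."""
    return degraded_since is not None and now - degraded_since > THRESHOLD


def insight_for(name, degraded_since, now):
    """Ideas 1 and 2: always explain a degraded operator, naming the limit."""
    if degraded_since is None:
        return None  # operator is not degraded, nothing to report
    if persists(degraded_since, now):
        return {"level": "Warning",
                "message": f"{name} is degraded longer than {THRESHOLD}"}
    return {"level": "Info", "impact": "None",
            "message": f"{name} is degraded briefly, waiting for it to resolve, "
                       "no action needed"}


def smoothed_degraded_count(degraded_since_by_name, now):
    """Idea 4: the summary counts only operators that meet the insight criteria."""
    return sum(persists(since, now) for since in degraded_since_by_name.values())


now = datetime(2024, 1, 1, 12, 0)
state = {
    "etcd": now - timedelta(seconds=30),  # brief blip
    "dns": now - timedelta(minutes=20),   # persistent problem
    "console": None,                      # not degraded
}
for name, since in state.items():
    print(name, insight_for(name, since, now))
print("Degraded in summary:", smoothed_degraded_count(state, now))
```

Because the summary and the insights share the single `persists` predicate, they cannot disagree. Setting `THRESHOLD` to `timedelta(0)` in this sketch approximates idea 3, where every degraded operator is reported immediately.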

              Assignee: Unassigned
              Reporter: Petr Muller (afri@afri.cz)