  1. OCP Technical Release Team
  2. TRT-2225

Switch etcd log monitoring to a rate, not a fixed count

    • Type: Story
    • Resolution: Done
    • Priority: Major
    • Quality / Stability / Reliability

      Per OCPBUGS-52968, let's switch the test

      [sig-etcd] etcd should not log excessive took too long messages

      to fail when it is over a rate limit, rather than at a flat count of 10k. We think longer job runs are tripping it far more often on Azure as the test suite grows (though average job run time is not massively different: +10-15m).

      For the rate limit, let's say a 3.5-hour run would allow 10k messages, so 10000 / 210 minutes ≈ 48, or around 50 per minute. If we're over that rate for the job run, fail the test. The monitortest can record its startup time, and it knows it is near the end of the job when the analysis happens.
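
The rate check described above could look roughly like the following. This is a minimal sketch in Python for illustration only; the real monitortest is not shown in this ticket, and the function names and the rounded 50 per minute threshold are assumptions taken from the arithmetic above.

```python
from datetime import datetime, timedelta

# Assumption: the limit is the ticket's rounded figure,
# 10,000 messages / 210 minutes ~= 47.6, rounded up to 50 per minute.
MAX_RATE_PER_MINUTE = 50


def messages_per_minute(count: int, started: datetime, analyzed: datetime) -> float:
    """Return the observed message rate between startup and analysis time."""
    elapsed_minutes = (analyzed - started).total_seconds() / 60
    if elapsed_minutes <= 0:
        raise ValueError("analysis time must be after startup time")
    return count / elapsed_minutes


def exceeds_rate_limit(count: int, started: datetime, analyzed: datetime,
                       limit: float = MAX_RATE_PER_MINUTE) -> bool:
    """Hypothetical pass/fail check: True means the test should fail."""
    return messages_per_minute(count, started, analyzed) > limit


if __name__ == "__main__":
    start = datetime(2025, 1, 1, 0, 0, 0)
    end = start + timedelta(hours=3, minutes=30)
    print(round(messages_per_minute(10_000, start, end), 1))
    print(exceeds_rate_limit(10_000, start, end))
```

With this shape, a 2-hour run that logs 10k messages fails (about 83 per minute), while a 3.5-hour run with the same count passes.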

      Also, let's output the rate in an autodl artifact, so we can pull it into BigQuery and do better analysis on the result.
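
As a rough illustration of recording the rate for later analysis, the sketch below writes one JSON record per job run. It does not reproduce the actual autodl artifact format, which this ticket does not describe; the file name and every field name here are hypothetical.

```python
import json
from pathlib import Path


def write_rate_artifact(artifact_dir: str, count: int, elapsed_minutes: float) -> Path:
    """Write a small JSON record with the observed message rate.

    Assumption: the file name and field names are placeholders, not the
    real autodl schema.
    """
    if elapsed_minutes <= 0:
        raise ValueError("elapsed_minutes must be positive")
    record = {
        "message_count": count,
        "elapsed_minutes": elapsed_minutes,
        "messages_per_minute": count / elapsed_minutes,
    }
    path = Path(artifact_dir) / "etcd-took-too-long-rate.json"  # hypothetical name
    path.write_text(json.dumps(record, indent=2))
    return path
```

Keeping the raw count and elapsed time alongside the computed rate lets a later query recompute the rate under a different threshold.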

      Backport to 4.19, so we can check whether the bug above is a release blocker or not.

              rhn-engineering-dgoodwin Devan Goodwin