OCP Technical Release Team / TRT-1270

Prometheus artifacts are often corrupted


    • Type: Bug
    • Resolution: Done
    • Priority: Major

      Even after the bump to 6 vCPUs on GCP, we still often seem to hit corrupted blocks when we try to load the data in promecieus. In a recent attempt I found that a good 4/5 jobs, even successful ones, on both 4.14 and 4.13, appear to have corrupted Prometheus data.

      We grab this data in the following manner:

      https://github.com/openshift/release/blob/master/ci-operator/step-registry/gather/extra/gather-extra-commands.sh#L238-L239

      Is this a safe operation or are we doing something silly causing these corrupted archives?

      We may want to loop in the monitoring team for help here.

      There could be some correlation with CPU use, but that is unclear. Do AWS archives tend to work better?
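
      To help tell a truncated or unreadable archive apart from blocks that are incomplete inside an otherwise readable archive, something like the following rough sketch might be useful. It assumes the artifact is a (possibly gzip-compressed) tar of the Prometheus data directory, and that each block directory holds a `meta.json`, an `index` file, and a `chunks/` subdirectory. The function name `check_prometheus_archive` is made up for illustration, and the check does not validate the TSDB contents themselves.

```python
import json
import tarfile
import zlib


def check_prometheus_archive(path):
    """Return a list of problems found in a tar of a Prometheus data dir.

    Hypothetical helper: it only checks that the archive can be read end to
    end and that each block directory looks structurally complete.
    """
    problems = []
    files = set()   # normalized paths of all regular files in the archive
    metas = {}      # path of each meta.json -> raw bytes

    try:
        with tarfile.open(path, "r:*") as tar:
            for member in tar:
                if not member.isfile():
                    continue
                name = "/".join(
                    p for p in member.name.split("/") if p not in ("", ".")
                )
                # Read every file fully so truncation or bad compressed
                # data surfaces as an error or a short read.
                data = tar.extractfile(member).read()
                if len(data) != member.size:
                    problems.append(f"{name}: short read")
                files.add(name)
                if name.rsplit("/", 1)[-1] == "meta.json":
                    metas[name] = data
    except (tarfile.TarError, EOFError, OSError, zlib.error) as exc:
        problems.append(f"archive unreadable or truncated: {exc}")

    # Assumption: any directory that holds a meta.json is a TSDB block.
    for meta_name, raw in sorted(metas.items()):
        prefix = meta_name[: -len("meta.json")]  # "" or "<block>/"
        try:
            json.loads(raw)
        except ValueError:
            problems.append(f"{meta_name}: not valid JSON")
        if prefix + "index" not in files:
            problems.append(f"{prefix}index: missing")
        if not any(f.startswith(prefix + "chunks/") for f in files):
            problems.append(f"{prefix}chunks/: no chunk files")

    return problems
```

      Running this over archives from a handful of GCP and AWS jobs and comparing the reported problems could show whether the damage happens while the archive is written or is already present in the blocks being copied.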

              stbenjam Stephen Benjamin
              rhn-engineering-dgoodwin Devan Goodwin
              Ayoub Mrini, Jan Fajerski, Simon Pasquier