Uploaded image for project: 'OCP Technical Release Team'
  1. OCP Technical Release Team
  2. TRT-1608

Hypershift Conformance Monitoring Node 500 Error Pathological Events

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Done
    • Icon: Blocker Blocker
    • None
    • None
    • False
    • None
    • False

      Starting with 4.16.0-0.nightly-2024-04-20-191906 we see hypershift failures due to openshift-monitoring pathological events. Not that featuregate for metrics server was enabled for default. Will test a revert

      : [sig-arch] events should not repeat pathologically for ns/openshift-monitoring
      { 1 events happened too frequently

      event happened 37 times, something is wrong: namespace/openshift-monitoring node/ip-10-0-141-162.ec2.internal pod/metrics-server-fc95746d9-fmq4f hmsg/5a32f618fb - reason/ProbeError Readiness probe error: HTTP probe failed with statuscode: 500 result=reject
      body: [+]ping ok
      [+]log ok
      [+]poststarthook/max-in-flight-filter ok
      [+]poststarthook/storage-object-count-tracker-hook ok
      [-]metric-storage-ready failed: reason withheld
      [+]metric-informer-sync ok
      [+]metadata-informer-sync ok
      [+]shutdown ok
      readyz check failed

            Unassigned Unassigned
            rh-ee-fbabcock Forrest Babcock
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated:
              Resolved: