OCPBUGS-51109 (OpenShift Bugs)

Insights operator regressed due to bad port concat on ipv6


    • Quality / Stability / Reliability
    • Critical

      (Feel free to update this bug's summary to be more specific.)
      Component Readiness has found a potential regression in the following test:

      [sig-cluster-lifecycle] pathological event should not see excessive Back-off restarting failed containers for ns/openshift-insights

      Extreme regression detected.
      Fisher's Exact probability of a regression: 100.00%.
      Test pass rate dropped from 100.00% to 59.21%.
      Overrode base stats using release 4.17

      Sample (being evaluated) Release: 4.19
      Start Time: 2025-02-13T00:00:00Z
      End Time: 2025-02-20T12:00:00Z
      Success Rate: 59.21%
      Successes: 45
      Failures: 31
      Flakes: 0

      Base (historical) Release: 4.17
      Start Time: 2024-09-01T00:00:00Z
      End Time: 2024-10-01T00:00:00Z
      Success Rate: 100.00%
      Successes: 275
      Failures: 0
      Flakes: 0
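
The figures above can be sanity-checked with a short sketch. It assumes the regression probability is derived from a one-sided Fisher's exact test on the 2x2 table of successes and failures, reported as 1 - p; how Component Readiness actually computes it is not shown in this report, so the function names and the method here are illustrative only.

```go
package main

import (
	"fmt"
	"math"
)

// passRate returns the success percentage for a set of runs.
func passRate(successes, failures int) float64 {
	return 100 * float64(successes) / float64(successes+failures)
}

// logChoose returns ln(C(n, k)) via the log-gamma function.
func logChoose(n, k int) float64 {
	a, _ := math.Lgamma(float64(n + 1))
	b, _ := math.Lgamma(float64(k + 1))
	c, _ := math.Lgamma(float64(n - k + 1))
	return a - b - c
}

// fisherOneSided returns the probability of seeing at least sampleFail
// failures in the sample if sample and base shared one failure rate
// (hypergeometric null). Assumed method, not taken from the report.
func fisherOneSided(sampleOK, sampleFail, baseOK, baseFail int) float64 {
	n := sampleOK + sampleFail     // runs in the sample
	total := n + baseOK + baseFail // all runs
	fails := sampleFail + baseFail // all failures
	oks := total - fails           // all successes
	p := 0.0
	for x := sampleFail; x <= fails && x <= n; x++ {
		if n-x > oks {
			continue // not enough successes to fill the rest of the sample
		}
		p += math.Exp(logChoose(fails, x) + logChoose(oks, n-x) - logChoose(total, n))
	}
	return p
}

func main() {
	fmt.Printf("sample pass rate: %.2f%%\n", passRate(45, 31))
	fmt.Printf("base pass rate:   %.2f%%\n", passRate(275, 0))
	p := fisherOneSided(45, 31, 275, 0)
	fmt.Printf("one-sided p-value: %.3g\n", p)
	fmt.Printf("1 - p as a percentage: %.2f%%\n", 100*(1-p))
}
```

With 31 failures in 76 sample runs against 0 failures in 275 base runs, the p-value is vanishingly small, which is consistent with the reported 100.00% once rounded to two decimals.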

      View the test details report for additional context.

      The Insights operator is failing to start on IPv6 jobs which are disconnected from the internet.

      namespace/openshift-insights node/worker-0.ostest.test.metalkube.org pod/insights-runtime-extractor-hxb2d uid/b198d4a9-4080-497b-99d4-455eda361d08 container/kube-rbac-proxy restarted 9 times at:
      non-zero exit at 2025-02-20 07:30:37.81265889 +0000 UTC m=+350.448894783: cause/Error code/1 reason/ContainerExit
      I0220 07:30:36.489734   52004 kube-rbac-proxy.go:532] Reading config file: /etc/kube-rbac-proxy/config.yaml
      I0220 07:30:36.491491   52004 kube-rbac-proxy.go:235] Valid token audiences: 
      I0220 07:30:36.535006   52004 kube-rbac-proxy.go:349] Reading certificate files
      I0220 07:30:36.535531   52004 kube-rbac-proxy.go:397] Starting TCP socket on fd01:0:0:5::7:8000
      I0220 07:30:36.536152   52004 kube-rbac-proxy.go:495] received interrupt, shutting down
      E0220 07:30:36.536241   52004 run.go:72] "command failed" err="failed to run groups: failed to listen on secure address: listen tcp: address fd01:0:0:5::7:8000: too many colons in address"
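
The "too many colons in address" failure in the log above can be reproduced in isolation. This is a minimal sketch using Go's standard `net.SplitHostPort`; it assumes the listener parses its address with the standard library, and `splitErr` is a hypothetical helper, not code from kube-rbac-proxy or the operator.

```go
package main

import (
	"fmt"
	"net"
)

// splitErr returns the error text from net.SplitHostPort, or "" when the
// address parses cleanly. Hypothetical helper for illustration.
func splitErr(addr string) string {
	if _, _, err := net.SplitHostPort(addr); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	// Address exactly as logged: a bare IPv6 literal with ":8000" appended.
	fmt.Println("unbracketed:", splitErr("fd01:0:0:5::7:8000"))
	// Bracketed form of the same host and port.
	fmt.Println("bracketed parses cleanly:", splitErr("[fd01:0:0:5::7]:8000") == "")
}
```

Without brackets the parser cannot tell which colon separates the host from the port, so the address is rejected before any socket is opened.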
      

      Looks like a straightforward IPv6 bug in how the port is appended to the address. stbenjam has fixed issues like this in the past, and this Go linter repo should provide some info on the best fix: https://github.com/stbenjam/no-sprintf-host-port
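
As a hedged illustration of the kind of fix that linter points toward, the sketch below contrasts string formatting with `net.JoinHostPort` from the Go standard library. The function names `naiveAddr` and `listenAddr` are hypothetical, and the actual location of the bad concatenation in the operator is not identified in this report.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// naiveAddr shows the host:port formatting pattern that breaks on IPv6.
func naiveAddr(host string, port int) string {
	return fmt.Sprintf("%s:%d", host, port)
}

// listenAddr builds the address with net.JoinHostPort, which wraps the
// host in brackets when it contains a colon (as IPv6 literals do).
func listenAddr(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	fmt.Println(naiveAddr("fd01:0:0:5::7", 8000))  // fd01:0:0:5::7:8000
	fmt.Println(listenAddr("fd01:0:0:5::7", 8000)) // [fd01:0:0:5::7]:8000
	fmt.Println(listenAddr("10.0.0.7", 8000))      // 10.0.0.7:8000
}
```

`net.JoinHostPort` leaves IPv4 addresses and hostnames unchanged, so the same call is safe on single-stack and dual-stack clusters.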

      Also affects tests

      [sig-cluster-lifecycle] pathological event should not see excessive Back-off restarting failed containers for ns/openshift-insights

      [sig-arch] events should not repeat pathologically for ns/openshift-insights

              jmesnil1@redhat.com Jeff Mesnil
              rhn-engineering-dgoodwin Devan Goodwin
              Baiyang Zhou Baiyang Zhou