Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-8475

TestBoundTokenSignerController causes unrecoverable disruption in e2e-gcp-operator CI job

    XMLWordPrintable

Details

    • Bug
    • Resolution: Done
    • Blocker
    • None
    • 4.13
    • kube-apiserver
    • Critical
    • No
    • Proposed
    • False
    • Hide

      None

      Show
      None
    • N/A
    • Bug Fix

    Description

      The cluster-kube-apiserver-operator CI has been constantly failing for the past week and more specifically the e2e-gcp-operator job because the test cluster ends in a state where a lot of requests start failing with "Unauthorized" errors.

      This caused multiple operators to become degraded and tests to fail.

      https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-kube-apiserver-operator/1450/pull-ci-openshift-cluster-kube-apiserver-operator-master-e2e-gcp-operator/1631333936435040256

      Looking at the failures and a must-gather we were able to capture inside of a test cluster, it turned out that the service account issuer could be the culprit here. Because of that we opened https://issues.redhat.com/browse/API-1549.

      However, it turned that disabling TestServiceAccountIssuer didn't resolve the issue and the cluster was still too unstable for the tests to pass.

      In a separate attempt we also tried disabling TestBoundTokenSignerController and this time the tests were passing. However, the cluster was still very unstable during the e2e run and the kube-apiserver-operator went degraded a couple of times: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-kube-apiserver-operator/1455/pull-ci-openshift-cluster-kube-apiserver-operator-master-e2e-gcp-operator/1632871645171421184/artifacts/e2e-gcp-operator/gather-extra/artifacts/pods/openshift-kube-apiserver-operator_kube-apiserver-operator-5cf9d4569-m2spq_kube-apiserver-operator.log.

      On top of that instead of seeing Unauthorized errors, we are now seeing a lot of connection refused.

      Attachments

        Issue Links

          Activity

            People

              dgrisonn@redhat.com Damien Grisonnet
              dgrisonn@redhat.com Damien Grisonnet
              Rahul Gangwar Rahul Gangwar
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: