Bug
Resolution: Unresolved
Normal
None
4.12.z, 4.11.z, 4.14
Important
No
False
Description of problem:
'OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.skordas-15b.qe.devcluster.openshift.com/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)' after installation on Nutanix
Version-Release number of selected component (if applicable):
4.11.46
How reproducible:
50%. So far 2 of 4 installation attempts have hit the issue.
Steps to Reproduce:
1. Install OCP 4.11.46 on Nutanix.
   Jenkins: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/225973/
   Template used for this installation: https://gitlab.cee.redhat.com/aosqe/flexy-templates/-/blob/master/functionality-testing/aos-4_11/ipi-on-nutanix/versioned-installer-fips-ovn-csi_pvc-ci

   install-config.yaml:

    apiVersion: v1
    controlPlane:
      architecture: amd64
      hyperthreading: Enabled
      name: master
      platform: {}
      replicas: 3
    compute:
    - architecture: amd64
      hyperthreading: Enabled
      name: worker
      platform: {}
      replicas: 2
    metadata:
      name: skordas-15b
    platform:
      nutanix:
        apiVIP: 10.0.132.12
        ingressVIP: 10.0.132.13
        subnetUUIDs:
        - efe26e93-f6cf-4d89-8104-009e85201fa8
        prismCentral:
          username: sgao
          password: HIDDEN
          endpoint:
            address: prismcentral.lts-cluster.nutanix-dev.devcluster.openshift.com
            port: 9440
        prismElements:
        - uuid: 0005d9a4-8e4f-7c33-58d1-e9d0e2d48853
          endpoint:
            address: 10.0.128.159
            port: 9440
    pullSecret: HIDDEN
    networking:
      clusterNetwork:
      - cidr: 10.128.0.0/14
        hostPrefix: 23
      serviceNetwork:
      - 172.30.0.0/16
      machineNetwork:
      - cidr: 10.0.0.0/16
      networkType: OVNKubernetes
    publish: External
    credentialsMode: Manual
    fips: true
    baseDomain: qe.devcluster.openshift.com
    sshKey: SSH-KEY
Actual results:
    $ oc get co authentication -o yaml
    status:
      conditions:
      - lastTransitionTime: "2023-08-15T17:55:33Z"
        message: |-
          APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver ()
          OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()
          OAuthServerRouteEndpointAccessibleControllerDegraded: Get "https://oauth-openshift.apps.skordas-15b.qe.devcluster.openshift.com/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
        reason: APIServerDeployment_UnavailablePod::OAuthServerDeployment_UnavailablePod::OAuthServerRouteEndpointAccessibleController_SyncError
        status: "True"
        type: Degraded
      - lastTransitionTime: "2023-08-15T17:53:14Z"
        message: 'AuthenticatorCertKeyProgressing: All is well'
        reason: AsExpected
        status: "False"
        type: Progressing
      - lastTransitionTime: "2023-08-15T17:53:33Z"
        message: 'OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.skordas-15b.qe.devcluster.openshift.com/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)'
        reason: OAuthServerRouteEndpointAccessibleController_EndpointUnavailable
        status: "False"
        type: Available
      - lastTransitionTime: "2023-08-15T17:28:03Z"
        message: All is well
        reason: AsExpected
        status: "True"
        type: Upgradeable
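For anyone triaging similar installs, the conditions above can be reduced to just the failing ones with a small script. This is only a sketch, not part of the reported environment: it assumes the operator object has been exported as JSON (for example with `oc get co authentication -o json`), and the helper names `unhealthy_conditions` and `summarize` are hypothetical. It treats only `Available` and `Degraded` as health signals; that choice is an assumption of this sketch.

```python
# Sketch: list the unhealthy conditions of a ClusterOperator object.
# Assumption: the input is the parsed JSON form of the object, with the
# same status.conditions layout as the YAML shown in this report.

# Status value considered healthy for each condition type that is checked.
HEALTHY_STATUS = {"Available": "True", "Degraded": "False"}


def unhealthy_conditions(cluster_operator):
    """Return the checked conditions whose status is not the healthy value."""
    conditions = cluster_operator.get("status", {}).get("conditions", [])
    return [
        c for c in conditions
        if c.get("type") in HEALTHY_STATUS
        and c.get("status") != HEALTHY_STATUS[c["type"]]
    ]


def summarize(cluster_operator):
    """Format each unhealthy condition as 'Type=Status (Reason)'."""
    return [
        "{}={} ({})".format(c["type"], c["status"], c.get("reason", ""))
        for c in unhealthy_conditions(cluster_operator)
    ]
```

To use it, load the exported JSON with `json.load` and print the lines returned by `summarize`. For the status in this report it would be expected to list the `Degraded` and `Available` conditions and skip `Progressing` and `Upgradeable`.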
Additional info:
I got this issue trying to gather:

    $ oc adm must-gather
    [must-gather ] OUT Using must-gather plug-in image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6341fd84317f92e74e494a78b8c3a12f576bfcfc4827f4cc7f49da358539eb3
    When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
    ClusterID: fd096799-a755-4fdb-8632-9e8087da3a1e
    ClusterVersion: Stable at "4.11.46"
    ClusterOperators:
        clusteroperator/authentication is not available (OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.skordas-15b.qe.devcluster.openshift.com/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)) because APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver ()
        OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()
        OAuthServerRouteEndpointAccessibleControllerDegraded: Get "https://oauth-openshift.apps.skordas-15b.qe.devcluster.openshift.com/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
        clusteroperator/ingress is degraded because The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)
        clusteroperator/machine-config is degraded because Failed to resync 4.11.46 because: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, error required pool master is not ready, retrying. Status: (total: 3, ready 2, updated: 2, unavailable: 1, degraded: 0)]
        clusteroperator/openshift-apiserver is degraded because APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver ()