Type: Bug
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.12.z, 4.11.z, 4.14
Component/s: Installer / Nutanix
Labels:
- rollback
- upgrade

Severity: Important
Regression: No
Blocked: False
Blocked Reason: None

Description of problem:

After installation on Nutanix, the authentication cluster operator is not available and reports:

'OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.skordas-15b.qe.devcluster.openshift.com/healthz":
      context deadline exceeded (Client.Timeout exceeded while awaiting headers)'
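
To check the route independently of the operator, a probe along these lines can be run from a host that resolves the apps domain. This is only a sketch, not what the authentication operator itself runs: `probe_healthz` and its `opener` parameter are hypothetical names, and certificate verification is disabled on the assumption that the client host does not trust the ingress certificate.

```python
import socket
import ssl
import urllib.error
import urllib.request


def probe_healthz(url, timeout=10.0, opener=urllib.request.urlopen):
    """Issue one GET against a healthz URL and return (ok, detail).

    `opener` defaults to urllib.request.urlopen; it is a parameter only so
    the function can be exercised without network access.
    """
    # Assumption: the ingress certificate is not trusted by this host, so
    # verification is switched off. Use this for diagnostics only.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with opener(url, timeout=timeout, context=ctx) as resp:
            return resp.status == 200, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with a non-2xx status.
        return False, f"HTTP {exc.code}"
    except urllib.error.URLError as exc:
        # Connect-phase timeouts arrive wrapped in URLError.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return False, "timeout"
        return False, f"connection error: {exc.reason}"
    except (socket.timeout, TimeoutError):
        # Read-phase timeouts (no headers received in time) are raised directly.
        return False, "timeout"
    except OSError as exc:
        return False, f"connection error: {exc}"
```

Calling `probe_healthz("https://oauth-openshift.apps.skordas-15b.qe.devcluster.openshift.com/healthz")` should give a timeout result if the client host sees the same behaviour the operator reports.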

Version-Release number of selected component (if applicable):

4.11.46

How reproducible:

50%: 2 of 4 attempts so far.

Steps to Reproduce:

1. Install OCP 4.11.46 on Nutanix 
Jenkins: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/225973/
Template used for this installation: https://gitlab.cee.redhat.com/aosqe/flexy-templates/-/blob/master/functionality-testing/aos-4_11/ipi-on-nutanix/versioned-installer-fips-ovn-csi_pvc-ci

install-config.yaml

 apiVersion: v1
 controlPlane:
   architecture: amd64
   hyperthreading: Enabled
   name: master
   platform: {}
   replicas: 3
 compute:
 - architecture: amd64
   hyperthreading: Enabled
   name: worker
   platform: {}
   replicas: 2
 metadata:
   name: skordas-15b
 platform:
   nutanix:
     apiVIP: 10.0.132.12
     ingressVIP: 10.0.132.13
     subnetUUIDs:
     - efe26e93-f6cf-4d89-8104-009e85201fa8
     prismCentral:
       username: sgao
       password: HIDDEN
       endpoint:
         address: prismcentral.lts-cluster.nutanix-dev.devcluster.openshift.com
         port: 9440
     prismElements:
     - uuid: 0005d9a4-8e4f-7c33-58d1-e9d0e2d48853
       endpoint:
         address: 10.0.128.159
         port: 9440
 pullSecret: HIDDEN
 networking:
   clusterNetwork:
   - cidr: 10.128.0.0/14
     hostPrefix: 23
   serviceNetwork:
   - 172.30.0.0/16
   machineNetwork:
   - cidr: 10.0.0.0/16
   networkType: OVNKubernetes
 publish: External
 credentialsMode: Manual
 fips: true
 baseDomain: qe.devcluster.openshift.com
 sshKey: SSH-KEY
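
One quick sanity check on this configuration is whether both VIPs fall inside the machine network, since an ingress VIP outside it could on its own make the route unreachable. A minimal sketch using the values from the install-config.yaml above; the helper name `vips_outside_machine_network` is made up for illustration and is not part of the installer.

```python
import ipaddress


def vips_outside_machine_network(machine_cidrs, vips):
    """Return the subset of `vips` (name -> address) not covered by any CIDR."""
    networks = [ipaddress.ip_network(cidr) for cidr in machine_cidrs]
    return {
        name: addr
        for name, addr in vips.items()
        if not any(ipaddress.ip_address(addr) in net for net in networks)
    }


# Values copied from the install-config.yaml in this report.
outside = vips_outside_machine_network(
    ["10.0.0.0/16"],
    {"apiVIP": "10.0.132.12", "ingressVIP": "10.0.132.13"},
)
print(outside)  # prints {}
```

For the values in this report the result is empty: both VIPs are inside 10.0.0.0/16, so this particular misconfiguration can be ruled out.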

Actual results:

$ oc get co authentication -o yaml


status:
  conditions:
  - lastTransitionTime: "2023-08-15T17:55:33Z"
    message: |-
      APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver ()
      OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()
      OAuthServerRouteEndpointAccessibleControllerDegraded: Get "https://oauth-openshift.apps.skordas-15b.qe.devcluster.openshift.com/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    reason: APIServerDeployment_UnavailablePod::OAuthServerDeployment_UnavailablePod::OAuthServerRouteEndpointAccessibleController_SyncError
    status: "True"
    type: Degraded
  - lastTransitionTime: "2023-08-15T17:53:14Z"
    message: 'AuthenticatorCertKeyProgressing: All is well'
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2023-08-15T17:53:33Z"
    message: 'OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.skordas-15b.qe.devcluster.openshift.com/healthz":
      context deadline exceeded (Client.Timeout exceeded while awaiting headers)'
    reason: OAuthServerRouteEndpointAccessibleController_EndpointUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2023-08-15T17:28:03Z"
    message: All is well
    reason: AsExpected
    status: "True"
    type: Upgradeable
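
The conditions above can be reduced to the ones that deviate from a healthy operator. A small sketch, assuming the usual expectation of Available=True, Degraded=False and Progressing=False; `unhealthy_conditions` is a hypothetical helper, and the list is copied by hand from the output above (in practice it could be read from `oc get co authentication -o json`).

```python
# Assumed healthy values; Upgradeable is deliberately not checked here.
EXPECTED = {"Available": "True", "Degraded": "False", "Progressing": "False"}


def unhealthy_conditions(conditions):
    """Return (type, status, reason) for each condition that is not as expected."""
    bad = []
    for cond in conditions:
        want = EXPECTED.get(cond["type"])
        if want is not None and cond["status"] != want:
            bad.append((cond["type"], cond["status"], cond["reason"]))
    return bad


# Conditions transcribed from the `oc get co authentication -o yaml` output.
conditions = [
    {"type": "Degraded", "status": "True",
     "reason": "APIServerDeployment_UnavailablePod::"
               "OAuthServerDeployment_UnavailablePod::"
               "OAuthServerRouteEndpointAccessibleController_SyncError"},
    {"type": "Progressing", "status": "False", "reason": "AsExpected"},
    {"type": "Available", "status": "False",
     "reason": "OAuthServerRouteEndpointAccessibleController_EndpointUnavailable"},
    {"type": "Upgradeable", "status": "True", "reason": "AsExpected"},
]

for ctype, status, reason in unhealthy_conditions(conditions):
    print(f"{ctype}={status}: {reason}")
```

With the data from this report, only the Degraded and Available conditions are flagged.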

Additional info:

I also hit this issue while trying to collect a must-gather:

$ oc adm must-gather
[must-gather      ] OUT Using must-gather plug-in image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6341fd84317f92e74e494a78b8c3a12f576bfcfc4827f4cc7f49da358539eb3
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
ClusterID: fd096799-a755-4fdb-8632-9e8087da3a1e
ClusterVersion: Stable at "4.11.46"
ClusterOperators:
	clusteroperator/authentication is not available (OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.skordas-15b.qe.devcluster.openshift.com/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)) because APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver ()
OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()
OAuthServerRouteEndpointAccessibleControllerDegraded: Get "https://oauth-openshift.apps.skordas-15b.qe.devcluster.openshift.com/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
	clusteroperator/ingress is degraded because The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)
	clusteroperator/machine-config is degraded because Failed to resync 4.11.46 because: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, error required pool master is not ready, retrying. Status: (total: 3, ready 2, updated: 2, unavailable: 1, degraded: 0)]
	clusteroperator/openshift-apiserver is degraded because APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver ()

Assignee: Yanhua Li
Reporter: Simon Kordas
QA Contact: Shang Gao
Votes: 0
Watchers: 14
Created: 2023/08/15 8:08 PM
Updated: 2024/06/11 2:30 AM
