Description of problem:
This is similar to HOSTEDCP-672, but here no hostname is defined for the OAuth route in the HostedCluster configuration.
Create a HostedCluster on AWS with "PublicAndPrivate" endpoint access and the following services configuration:

services:
- service: APIServer
  servicePublishingStrategy:
    type: LoadBalancer
- service: OAuthServer
  servicePublishingStrategy:
    type: Route
- service: Konnectivity
  servicePublishingStrategy:
    type: Route
- service: Ignition
  servicePublishingStrategy:
    type: Route

The OAuth route should be externally accessible, but it is not exposed.
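For context, this is roughly where that services list sits in the HostedCluster spec; a minimal sketch where name and namespace are inferred from the control-plane namespace in this report and the region is a placeholder:

apiVersion: hypershift.openshift.io/v1beta1
kind: HostedCluster
metadata:
  name: hc1
  namespace: clusters
spec:
  platform:
    type: AWS
    aws:
      endpointAccess: PublicAndPrivate
      region: us-east-1
  services:
  - service: APIServer
    servicePublishingStrategy:
      type: LoadBalancer
  - service: OAuthServer
    servicePublishingStrategy:
      type: Route
  # Konnectivity and Ignition published via Route as listed above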
There are a couple of issues:
1) The OAuth public route is admitted by the "private" router, but the Route does not get a routerCanonicalHostname in its status. This was resolved earlier in HOSTEDCP-672 but was reintroduced later by refactoring.
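A quick way to confirm the missing field (namespace and route name as in the reproduction below):

ᐅ oc get route oauth -n clusters-hc1 -o jsonpath='{.status.ingress[0].routerCanonicalHostname}'
# prints nothing here; a correctly admitted route would report the canonical hostname of the admitting router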
2) The public route is admitted by the private router even though no DNS record is registered in external-dns (in the hypershift namespace), so the URL does not actually resolve to the private router. The route URL (oauth-clusters-hc1.apps.mgencur-mgmt.mgencur.hypershift.devcluster.openshift.com) can only be routed by the management cluster's router: there is an A record in AWS, "*apps.mgencur-mgmt.mgencur.hypershift.devcluster.openshift.com", which points to the management cluster's default router. The route should only be admitted by the private router when external-dns is used, i.e. when a hostname is defined for the OAuth route in the HostedCluster config.
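For comparison, a sketch of a services entry where a hostname is defined for the OAuth route, so that external-dns owns the record and admission by the private router is expected; the domain here is a placeholder:

- service: OAuthServer
  servicePublishingStrategy:
    type: Route
    route:
      hostname: oauth.hc1.example.hypershift.example.com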
Version-Release number of selected component (if applicable):
4.19
How reproducible:
Always
Steps to Reproduce:
1. Create a HostedCluster with the configuration above (see the CLI sketch below).
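One way to do this, assuming the hypershift CLI; the values are placeholders, and the services section usually has to be adjusted by rendering the manifests and editing them before applying:

ᐅ hypershift create cluster aws \
    --name hc1 \
    --pull-secret ./pull-secret.json \
    --aws-creds ./aws-credentials \
    --base-domain example.hypershift.example.com \
    --region us-east-1 \
    --endpoint-access PublicAndPrivate \
    --render > hc1.yaml
# edit spec.services in hc1.yaml to match the configuration above, then:
ᐅ oc apply -f hc1.yaml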
Actual results:
The console ClusterOperator gets stuck:
ᐅ oc get co console
NAME      VERSION                                     AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
console   4.19.0-0.nightly-multi-2025-05-27-033815   False       True          True       7m33s   DeploymentAvailable: 0 replicas available for console deployment...
The console pod in openshift-console is not running:
ᐅ oc get pods -n openshift-console
NAME                       READY   STATUS    RESTARTS      AGE
console-645598cb65-ln549   0/1     Running   1 (87s ago)   6m28s
console-6c4bf8cc5c-ttxrm   0/1     Running   0             3m50s
It logs the following error:
W0527 10:01:44.477840       1 authoptions.go:112] Flag inactivity-timeout is set to less then 300 seconds and will be ignored!
I0527 10:01:44.518839       1 envvar.go:172] "Feature gate default state" feature="ClientsAllowCBOR" enabled=false
I0527 10:01:44.518862       1 envvar.go:172] "Feature gate default state" feature="ClientsPreferCBOR" enabled=false
I0527 10:01:44.518869       1 envvar.go:172] "Feature gate default state" feature="InformerResourceVersion" enabled=false
I0527 10:01:44.518876       1 envvar.go:172] "Feature gate default state" feature="WatchListClient" enabled=false
F0527 10:06:44.519958       1 authoptions.go:332] Error initializing authenticator: failed to construct OAuth endpoint cache: failed to setup an async cache - caching func returned error: context deadline exceeded
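The console discovers the OAuth endpoint from the guest cluster's OAuth metadata, which points at the affected route. One way to see what the console is trying to reach (the guest kubeconfig path here is an assumption):

ᐅ oc --kubeconfig hc1-guest.kubeconfig get --raw /.well-known/oauth-authorization-server
# the "issuer" field points at the oauth route host shown above, which then returns 503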
The console fails to access the OAuth route; checking this manually:
ᐅ curl -v -k https://oauth-clusters-hc1.apps.mgencur-mgmt.mgencur.hypershift.devcluster.openshift.com
* Host oauth-clusters-hc1.apps.mgencur-mgmt.mgencur.hypershift.devcluster.openshift.com:443 was resolved.
* IPv6: (none)
* IPv4: 44.208.177.198, 52.45.137.74
*   Trying 44.208.177.198:443...
* Connected to oauth-clusters-hc1.apps.mgencur-mgmt.mgencur.hypershift.devcluster.openshift.com (44.208.177.198) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / x25519 / RSASSA-PSS
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
*  subject: O=openshift; CN=openshift-ingress
*  start date: May 29 10:32:56 2025 GMT
*  expire date: May 29 10:32:56 2026 GMT
*  issuer: OU=openshift; CN=root-ca
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
*   Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/1.x
> GET / HTTP/1.1
> Host: oauth-clusters-hc1.apps.mgencur-mgmt.mgencur.hypershift.devcluster.openshift.com
> User-Agent: curl/8.6.0
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* HTTP 1.0, assume close after body
< HTTP/1.0 503 Service Unavailable
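Resolving the route host shows where the traffic goes; it is expected to return the same addresses curl resolved above, i.e. the management cluster's default ingress rather than the private router:

ᐅ dig +short oauth-clusters-hc1.apps.mgencur-mgmt.mgencur.hypershift.devcluster.openshift.com
# expected: 44.208.177.198 and 52.45.137.74, the management cluster's default router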
The route looks like this:
ᐅ oc get route oauth -oyaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  annotations:
    openshift.io/host.generated: "true"
  creationTimestamp: "2025-05-29T11:01:10Z"
  labels:
    hypershift.openshift.io/hosted-control-plane: clusters-hc1
  name: oauth
  namespace: clusters-hc1
  ownerReferences:
  - apiVersion: hypershift.openshift.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: HostedControlPlane
    name: hc1
    uid: 3b4bd993-c269-4a46-a36b-909aaed2a1af
  resourceVersion: "16805"
  uid: 19457574-f5b0-4626-854f-2b2219a67dfa
spec:
  host: oauth-clusters-hc1.apps.mgencur-mgmt.mgencur.hypershift.devcluster.openshift.com
  tls:
    insecureEdgeTerminationPolicy: None
    termination: passthrough
  to:
    kind: Service
    name: oauth-openshift
    weight: 100
  wildcardPolicy: None
status:
  ingress:
  - conditions:
    - lastTransitionTime: "2025-05-29T11:01:35Z"
      status: "True"
      type: Admitted
    host: oauth-clusters-hc1.apps.mgencur-mgmt.mgencur.hypershift.devcluster.openshift.com
    routerName: router
    wildcardPolicy: None
The service is there:
ᐅ oc get svc oauth-openshift -oyaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2025-05-29T11:01:10Z"
  name: oauth-openshift
  namespace: clusters-hc1
  ownerReferences:
  - apiVersion: hypershift.openshift.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: HostedControlPlane
    name: hc1
    uid: 3b4bd993-c269-4a46-a36b-909aaed2a1af
  resourceVersion: "16565"
  uid: 491e8c75-bf05-4737-8a58-6f608b73f302
spec:
  clusterIP: 172.31.190.108
  clusterIPs:
  - 172.31.190.108
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: PreferDualStack
  ports:
  - port: 6443
    protocol: TCP
    targetPort: 6443
  selector:
    app: oauth-openshift
    hypershift.openshift.io/control-plane-component: oauth-openshift
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
Accessing the Service directly from a sibling Pod works (see below); it only fails when going through the Route. Calling curl from a sibling Pod reaches the backend:
/home/curl_user $ curl -k https://172.31.190.108:6443
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {},
  "code": 403
}
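For reference, a sketch of how such a check can be run from a throwaway pod in the control-plane namespace (pod name and image are arbitrary; the Service DNS name is equivalent to the clusterIP used above):

ᐅ oc run curl-test -n clusters-hc1 --rm -it --restart=Never \
    --image=curlimages/curl -- \
    curl -k https://oauth-openshift.clusters-hc1.svc:6443
# expect the same 403 "system:anonymous" response, confirming the backend Service is reachable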
Expected results:
The hosted cluster starts successfully and the console ClusterOperator becomes Available.
Additional info:
- blocks: OCPBUGS-61407 - OAuth Route not working using PublicAndPrivate access endpoint and no hostname defined (status: POST)
- is cloned by: OCPBUGS-61407 - OAuth Route not working using PublicAndPrivate access endpoint and no hostname defined (status: POST)