[cloud-user@jiwei-0930-02-rhel8-mirror ~]$ mkdir work2
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$ cp install-config.yaml work2
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$ ./openshift-install create manifests --dir work2
INFO Credentials loaded from gcloud CLI defaults
INFO Consuming Install Config from target directory
INFO Manifests created in: work2/manifests and work2/openshift
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$ <...manually create the required credentials and then copy the manifests to the dir (see the note at the end of this log)...>
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$ ls work2/manifests/ -lrt
total 108
-rw-r-----. 1 cloud-user cloud-user 4345 Sep 30 06:03 openshift-config-secret-pull-secret.yaml
-rw-r-----. 1 cloud-user cloud-user 4110 Sep 30 06:03 machine-config-server-tls-secret.yaml
-rw-r-----. 1 cloud-user cloud-user 1304 Sep 30 06:03 kube-system-configmap-root-ca.yaml
-rw-r-----. 1 cloud-user cloud-user  118 Sep 30 06:03 kube-cloud-config.yaml
-rw-r-----. 1 cloud-user cloud-user  200 Sep 30 06:03 cvo-overrides.yaml
-rw-r-----. 1 cloud-user cloud-user  171 Sep 30 06:03 cluster-scheduler-02-config.yml
-rw-r-----. 1 cloud-user cloud-user  142 Sep 30 06:03 cluster-proxy-01-config.yaml
-rw-r-----. 1 cloud-user cloud-user  273 Sep 30 06:03 cluster-network-02-config.yml
-rw-r-----. 1 cloud-user cloud-user 9607 Sep 30 06:03 cluster-network-01-crd.yml
-rw-r-----. 1 cloud-user cloud-user  218 Sep 30 06:03 cluster-ingress-02-config.yml
-rw-r-----. 1 cloud-user cloud-user  684 Sep 30 06:03 cluster-infrastructure-02-config.yml
-rw-r-----. 1 cloud-user cloud-user  280 Sep 30 06:03 cluster-dns-02-config.yml
-rw-r-----. 1 cloud-user cloud-user 1823 Sep 30 06:03 cluster-config.yaml
-rw-r-----. 1 cloud-user cloud-user  520 Sep 30 06:03 cloud-provider-config.yaml
-rw-r-----. 1 cloud-user cloud-user  175 Sep 30 06:03 cloud-controller-uid-config.yml
-rw-rw-r--. 1 cloud-user cloud-user 3331 Sep 30 06:26 99_openshift-cloud-controller-manager_gcp-ccm-cloud-credentials-secret.yaml
-rw-rw-r--. 1 cloud-user cloud-user 3345 Sep 30 06:26 99_openshift-cloud-credential-operator_cloud-credential-operator-gcp-ro-creds-secret.yaml
-rw-rw-r--. 1 cloud-user cloud-user 3330 Sep 30 06:26 99_openshift-cloud-network-config-controller_cloud-credentials-secret.yaml
-rw-rw-r--. 1 cloud-user cloud-user 3323 Sep 30 06:26 99_openshift-cluster-api_capg-manager-bootstrap-credentials-secret.yaml
-rw-rw-r--. 1 cloud-user cloud-user 3321 Sep 30 06:26 99_openshift-cluster-csi-drivers_gcp-pd-cloud-credentials-secret.yaml
-rw-rw-r--. 1 cloud-user cloud-user 3327 Sep 30 06:26 99_openshift-image-registry_installer-cloud-credentials-secret.yaml
-rw-rw-r--. 1 cloud-user cloud-user 3315 Sep 30 06:26 99_openshift-ingress-operator_cloud-credentials-secret.yaml
-rw-rw-r--. 1 cloud-user cloud-user 3314 Sep 30 06:26 99_openshift-machine-api_gcp-cloud-credentials-secret.yaml
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$ ./openshift-install create cluster --dir work2
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Consuming Worker Machines from target directory
INFO Consuming Common Manifests from target directory
INFO Consuming Master Machines from target directory
INFO Consuming Openshift Manifests from target directory
INFO Credentials loaded from gcloud CLI defaults
INFO Creating infrastructure resources...
INFO Waiting up to 20m0s (until 6:49AM) for the Kubernetes API at https://api.jiwei-0930-03.qe-shared-vpc.qe.gcp.devcluster.openshift.com:6443...
INFO API v1.24.0+8c7c967 up
INFO Waiting up to 30m0s (until 7:01AM) for bootstrapping to complete...
INFO Pulling VM console logs
INFO Pulling debug logs from the bootstrap machine
ERROR Cluster operator authentication Degraded is True with IngressStateEndpoints_MissingSubsets::OAuthClientsController_SyncError::OAuthServerDeployment_PreconditionNotFulfilled::OAuthServerRouteEndpointAccessibleController_SyncError::OAuthServerServiceEndpointAccessibleController_SyncError::OAuthServerServiceEndpointsEndpointAccessibleController_SyncError::WellKnownReadyController_SyncError: IngressStateEndpointsDegraded: No subsets found for the endpoints of oauth-server
ERROR OAuthClientsControllerDegraded: no ingress for host oauth-openshift.apps.jiwei-0930-03.qe-shared-vpc.qe.gcp.devcluster.openshift.com in route oauth-openshift in namespace openshift-authentication
ERROR OAuthServerDeploymentDegraded: waiting for the oauth-openshift route to contain an admitted ingress: no admitted ingress for route oauth-openshift in namespace openshift-authentication
ERROR OAuthServerDeploymentDegraded:
ERROR OAuthServerRouteEndpointAccessibleControllerDegraded: route "openshift-authentication/oauth-openshift": status does not have a valid host address
ERROR OAuthServerServiceEndpointAccessibleControllerDegraded: Get "https://172.30.172.249:443/healthz": dial tcp 172.30.172.249:443: connect: connection refused
ERROR OAuthServerServiceEndpointsEndpointAccessibleControllerDegraded: oauth service endpoints are not ready
ERROR WellKnownReadyControllerDegraded: failed to get oauth metadata from openshift-config-managed/oauth-openshift ConfigMap: configmap "oauth-openshift" not found (check authentication operator, it is supposed to create this)
ERROR Cluster operator authentication Available is False with APIServices_Error::OAuthServerDeployment_PreconditionNotFulfilled::OAuthServerServiceEndpointAccessibleController_EndpointUnavailable::OAuthServerServiceEndpointsEndpointAccessibleController_ResourceNotFound::ReadyIngressNodes_NoReadyIngressNodes::WellKnown_NotReady: APIServicesAvailable: "user.openshift.io.v1" is not ready: an attempt failed with statusCode = 503, err = the server is currently unable to handle the request
ERROR OAuthServerServiceEndpointAccessibleControllerAvailable: Get "https://172.30.172.249:443/healthz": dial tcp 172.30.172.249:443: connect: connection refused
ERROR OAuthServerServiceEndpointsEndpointAccessibleControllerAvailable: endpoints "oauth-openshift" not found
ERROR ReadyIngressNodesAvailable: Authentication requires functional ingress which requires at least one schedulable and ready node. Got 0 worker nodes, 3 master nodes, 0 custom target nodes (none are schedulable or ready for ingress pods).
ERROR WellKnownAvailable: The well-known endpoint is not yet available: failed to get oauth metadata from openshift-config-managed/oauth-openshift ConfigMap: configmap "oauth-openshift" not found (check authentication operator, it is supposed to create this)
INFO Cluster operator baremetal Disabled is True with UnsupportedPlatform: Nothing to do on this Platform
INFO Cluster operator cloud-controller-manager TrustedCABundleControllerControllerAvailable is True with AsExpected: Trusted CA Bundle Controller works as expected
INFO Cluster operator cloud-controller-manager TrustedCABundleControllerControllerDegraded is False with AsExpected: Trusted CA Bundle Controller works as expected
INFO Cluster operator cloud-controller-manager CloudConfigControllerAvailable is True with AsExpected: Cloud Config Controller works as expected
INFO Cluster operator cloud-controller-manager CloudConfigControllerDegraded is False with AsExpected: Cloud Config Controller works as expected
ERROR Cluster operator cluster-autoscaler Degraded is True with MissingDependency: machine-api not ready
ERROR Cluster operator console Degraded is True with DefaultRouteSync_FailedAdmitDefaultRoute::RouteHealth_RouteNotAdmitted::SyncLoopRefresh_FailedIngress: DefaultRouteSyncDegraded: no ingress for host downloads-openshift-console.apps.jiwei-0930-03.qe-shared-vpc.qe.gcp.devcluster.openshift.com in route downloads in namespace openshift-console
ERROR RouteHealthDegraded: console route is not admitted
ERROR SyncLoopRefreshDegraded: no ingress for host console-openshift-console.apps.jiwei-0930-03.qe-shared-vpc.qe.gcp.devcluster.openshift.com in route console in namespace openshift-console
ERROR Cluster operator console Available is False with RouteHealth_RouteNotAdmitted: RouteHealthAvailable: console route is not admitted
INFO Cluster operator etcd RecentBackup is Unknown with ControllerStarted: The etcd backup controller is starting, and will decide if recent backups are available or if a backup is required
ERROR Cluster operator image-registry Available is False with NoReplicasAvailable: Available: The deployment does not have available replicas
ERROR NodeCADaemonAvailable: The daemon set node-ca has available replicas
ERROR ImagePrunerAvailable: Pruner CronJob has been created
INFO Cluster operator image-registry Progressing is True with DeploymentNotCompleted: Progressing: The deployment has not completed
INFO NodeCADaemonProgressing: The daemon set node-ca is deployed
ERROR Cluster operator image-registry Degraded is True with Unavailable: Degraded: The deployment does not have available replicas
ERROR Cluster operator ingress Available is False with IngressUnavailable: The "default" ingress controller reports Available=False: IngressControllerUnavailable: One or more status conditions indicate unavailable: DeploymentAvailable=False (DeploymentUnavailable: The deployment has Available status condition set to False (reason: MinimumReplicasUnavailable) with message: Deployment does not have minimum availability.)
INFO Cluster operator ingress Progressing is True with Reconciling: ingresscontroller "default" is progressing: IngressControllerProgressing: One or more status conditions indicate progressing: DeploymentRollingOut=True (DeploymentRollingOut: Waiting for router deployment rollout to finish: 0 of 2 updated replica(s) are available...
INFO ).
INFO Not all ingress controllers are available.
ERROR Cluster operator ingress Degraded is True with IngressDegraded: The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: PodsScheduled=False (PodsNotScheduled: Some pods are not scheduled: Pod "router-default-6758bbcfb7-jbtsv" cannot be scheduled: 0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling. Pod "router-default-6758bbcfb7-8nx72" cannot be scheduled: 0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling. Make sure you have sufficient worker nodes.), DeploymentAvailable=False (DeploymentUnavailable: The deployment has Available status condition set to False (reason: MinimumReplicasUnavailable) with message: Deployment does not have minimum availability.), DeploymentReplicasMinAvailable=False (DeploymentMinimumReplicasNotMet: 0/2 of replicas are available, max unavailable is 1), CanaryChecksSucceeding=Unknown (CanaryRouteNotAdmitted: Canary route is not admitted by the default ingress controller)
INFO Cluster operator insights ClusterTransferAvailable is False with NoClusterTransfer: no available cluster transfer
INFO Cluster operator insights Disabled is False with AsExpected:
INFO Cluster operator insights SCAAvailable is True with Updated: SCA certs successfully updated in the etc-pki-entitlement secret
INFO Cluster operator kube-apiserver Progressing is True with NodeInstaller: NodeInstallerProgressing: 1 nodes are at revision 5; 2 nodes are at revision 7; 0 nodes have achieved new revision 8
ERROR Cluster operator kube-controller-manager Degraded is True with GarbageCollector_Error: GarbageCollectorDegraded: error fetching rules: Get "https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules": dial tcp: lookup thanos-querier.openshift-monitoring.svc on 172.30.0.10:53: no such host
INFO Cluster operator kube-controller-manager Progressing is True with NodeInstaller: NodeInstallerProgressing: 1 nodes are at revision 6; 2 nodes are at revision 7
INFO Cluster operator kube-scheduler Progressing is True with NodeInstaller: NodeInstallerProgressing: 2 nodes are at revision 6; 1 nodes are at revision 7
INFO Cluster operator machine-api Progressing is True with SyncingResources: Progressing towards operator: 4.12.0-0.nightly-2022-09-28-204419
ERROR Cluster operator machine-api Degraded is True with SyncingFailed: Failed when progressing towards operator: 4.12.0-0.nightly-2022-09-28-204419 because minimum worker replica count (2) not yet met: current running replicas 0, waiting for [jiwei-0930-03-rrhmn-worker-a-4b5n4 jiwei-0930-03-rrhmn-worker-b-bjzkw]
ERROR Cluster operator machine-api Available is False with Initializing: Operator is initializing
ERROR Cluster operator monitoring Available is False with UpdatingPrometheusOperatorFailed: Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
ERROR Cluster operator monitoring Degraded is True with UpdatingPrometheusOperatorFailed: Failed to rollout the stack.
Error: updating prometheus operator: reconciling Prometheus Operator Admission Webhook Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-operator-admission-webhook: got 2 unavailable replicas
INFO Cluster operator monitoring Progressing is True with RollOutInProgress: Rolling out the stack.
INFO Cluster operator network ManagementStateDegraded is False with :
INFO Cluster operator network Progressing is True with Deploying: Deployment "/openshift-network-diagnostics/network-check-source" is waiting for other operators to become ready
ERROR Bootstrap failed to complete: timed out waiting for the condition
ERROR Failed to wait for bootstrapping to complete. This error usually happens when there is a problem with control plane hosts that prevents the control plane operators from creating the control plane.
INFO Bootstrap gather logs captured here "/home/cloud-user/work2/log-bundle-20220930070100.tar.gz"
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$ export KUBECONFIG=work2/auth/kubeconfig
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$ ./oc adm must-gather
......
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$ ./oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          52m     Unable to apply 4.12.0-0.nightly-2022-09-28-204419: some cluster operators are not available
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$ ./oc get nodes
NAME                                                              STATUS   ROLES                  AGE   VERSION
jiwei-0930-03-rrhmn-master-0.c.openshift-qe-shared-vpc.internal   Ready    control-plane,master   47m   v1.24.0+8c7c967
jiwei-0930-03-rrhmn-master-1.c.openshift-qe-shared-vpc.internal   Ready    control-plane,master   47m   v1.24.0+8c7c967
jiwei-0930-03-rrhmn-master-2.c.openshift-qe-shared-vpc.internal   Ready    control-plane,master   47m   v1.24.0+8c7c967
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$ ./oc get machines -n openshift-machine-api -o wide
NAME                                 PHASE         TYPE            REGION     ZONE         AGE   NODE                                                              PROVIDERID                                                                    STATE
jiwei-0930-03-rrhmn-master-0         Running       n2-standard-4   us-west1   us-west1-a   51m   jiwei-0930-03-rrhmn-master-0.c.openshift-qe-shared-vpc.internal   gce://openshift-qe-shared-vpc/us-west1-a/jiwei-0930-03-rrhmn-master-0         RUNNING
jiwei-0930-03-rrhmn-master-1         Running       n2-standard-4   us-west1   us-west1-b   51m   jiwei-0930-03-rrhmn-master-1.c.openshift-qe-shared-vpc.internal   gce://openshift-qe-shared-vpc/us-west1-b/jiwei-0930-03-rrhmn-master-1         RUNNING
jiwei-0930-03-rrhmn-master-2         Running       n2-standard-4   us-west1   us-west1-c   51m   jiwei-0930-03-rrhmn-master-2.c.openshift-qe-shared-vpc.internal   gce://openshift-qe-shared-vpc/us-west1-c/jiwei-0930-03-rrhmn-master-2         RUNNING
jiwei-0930-03-rrhmn-worker-a-4b5n4   Provisioned   n2-standard-4   us-west1   us-west1-a   43m                                                                     gce://openshift-qe-shared-vpc/us-west1-a/jiwei-0930-03-rrhmn-worker-a-4b5n4   RUNNING
jiwei-0930-03-rrhmn-worker-b-bjzkw   Provisioned   n2-standard-4   us-west1   us-west1-b   43m                                                                     gce://openshift-qe-shared-vpc/us-west1-b/jiwei-0930-03-rrhmn-worker-b-bjzkw   RUNNING
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$ ./oc get co
NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.12.0-0.nightly-2022-09-28-204419   False       False         True       45m     OAuthServerServiceEndpointAccessibleControllerAvailable: Get "https://172.30.172.249:443/healthz": dial tcp 172.30.172.249:443: connect: connection refused...
baremetal                                  4.12.0-0.nightly-2022-09-28-204419   True        False         False      44m
cloud-controller-manager                   4.12.0-0.nightly-2022-09-28-204419   True        False         False      48m
cloud-credential                           4.12.0-0.nightly-2022-09-28-204419   True        False         False      44m
cluster-autoscaler                                                              True        False         True       44m     machine-api not ready
config-operator                            4.12.0-0.nightly-2022-09-28-204419   True        False         False      45m
console                                    4.12.0-0.nightly-2022-09-28-204419   False       False         True       29m     RouteHealthAvailable: console route is not admitted
control-plane-machine-set                  4.12.0-0.nightly-2022-09-28-204419   True        False         False      44m
csi-snapshot-controller                    4.12.0-0.nightly-2022-09-28-204419   True        False         False      45m
dns                                        4.12.0-0.nightly-2022-09-28-204419   True        False         False      44m
etcd                                       4.12.0-0.nightly-2022-09-28-204419   True        False         False      42m
image-registry                                                                  False       True          True       31m     Available: The deployment does not have available replicas...
ingress                                                                         False       True          True       32m     The "default" ingress controller reports Available=False: IngressControllerUnavailable: One or more status conditions indicate unavailable: DeploymentAvailable=False (DeploymentUnavailable: The deployment has Available status condition set to False (reason: MinimumReplicasUnavailable) with message: Deployment does not have minimum availability.)
insights                                   4.12.0-0.nightly-2022-09-28-204419   True        False         False      39m
kube-apiserver                             4.12.0-0.nightly-2022-09-28-204419   True        False         False      39m
kube-controller-manager                    4.12.0-0.nightly-2022-09-28-204419   True        False         True       42m     GarbageCollectorDegraded: error fetching rules: Get "https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules": dial tcp: lookup thanos-querier.openshift-monitoring.svc on 172.30.0.10:53: no such host
kube-scheduler                             4.12.0-0.nightly-2022-09-28-204419   True        False         False      41m
kube-storage-version-migrator              4.12.0-0.nightly-2022-09-28-204419   True        False         False      45m
machine-api                                                                     False       True          True       44m     Operator is initializing
machine-approver                           4.12.0-0.nightly-2022-09-28-204419   True        False         False      44m
machine-config                             4.12.0-0.nightly-2022-09-28-204419   True        False         False      44m
marketplace                                4.12.0-0.nightly-2022-09-28-204419   True        False         False      44m
monitoring                                                                      False       True          True       29m     Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
network                                    4.12.0-0.nightly-2022-09-28-204419   True        True          False      48m     Deployment "/openshift-network-diagnostics/network-check-source" is waiting for other operators to become ready
node-tuning                                4.12.0-0.nightly-2022-09-28-204419   True        False         False      44m
openshift-apiserver                        4.12.0-0.nightly-2022-09-28-204419   False       False         False      19m     APIServicesAvailable: "apps.openshift.io.v1" is not ready: an attempt failed with statusCode = 503, err = the server is currently unable to handle the request...
openshift-controller-manager               4.12.0-0.nightly-2022-09-28-204419   True        False         False      41m
openshift-samples                          4.12.0-0.nightly-2022-09-28-204419   True        False         False      31m
operator-lifecycle-manager                 4.12.0-0.nightly-2022-09-28-204419   True        False         False      45m
operator-lifecycle-manager-catalog         4.12.0-0.nightly-2022-09-28-204419   True        False         False      45m
operator-lifecycle-manager-packageserver   4.12.0-0.nightly-2022-09-28-204419   True        False         False      32m
service-ca                                 4.12.0-0.nightly-2022-09-28-204419   True        False         False      45m
storage                                    4.12.0-0.nightly-2022-09-28-204419   True        False         False      45m
[cloud-user@jiwei-0930-02-rhel8-mirror ~]$
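
Note on the "<...manually create the required credentials...>" step near the top of this log: the 99_*-secret.yaml files listed under work2/manifests/ are the cloud-credential Secret manifests that have to be supplied before running "create cluster" when the cluster is not minting its own GCP credentials (typically credentialsMode: Manual). The actual contents used in this session are not shown in the log; as a minimal sketch, each of those files is an ordinary Kubernetes Secret whose name and namespace match the consuming component and whose data carries a base64-encoded GCP service account key under the service_account.json key. For example, 99_openshift-machine-api_gcp-cloud-credentials-secret.yaml would have roughly this shape (the base64 value is a placeholder, not the real key):

    apiVersion: v1
    kind: Secret
    metadata:
      name: gcp-cloud-credentials
      namespace: openshift-machine-api
    data:
      service_account.json: <base64-encoded-service-account-key>

The other 99_*-secret.yaml manifests follow the same pattern, with the Secret name and namespace adjusted per component as reflected in their file names.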