Resolution: Cannot Reproduce
Description of problem:
Installation fails as cluster operators are not stable.

Reprinting Cluster State:
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
ClusterID: 807269fa-0d64-48f9-817b-00d3799f67eb
ClusterVersion: Installing "4.14.0-0.nightly-2023-09-20-033502" for 3 hours: Unable to apply 4.14.0-0.nightly-2023-09-20-033502: some cluster operators are not available
ClusterOperators:
clusteroperator/config-operator is not upgradeable because FeatureGatesUpgradeable: "TechPreviewNoUpgrade" does not allow updates
clusteroperator/image-registry is degraded because Degraded: Registry deployment has timed out progressing: ReplicaSet "image-registry-7d8667cdf7" has timed out progressing.
clusteroperator/ingress is degraded because The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: DeploymentReplicasAllAvailable=False (DeploymentReplicasNotAvailable: 1/2 of replicas are available)
clusteroperator/kube-apiserver is not upgradeable because FeatureGatesUpgradeable: "TechPreviewNoUpgrade" does not allow updates
clusteroperator/kube-controller-manager is degraded because GarbageCollectorDegraded: error fetching rules: Get "https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules": dial tcp: lookup thanos-querier.openshift-monitoring.svc on no such host
clusteroperator/machine-config is not upgradeable because One or more machine config pools are updating, please see `oc get mcp` for further details
clusteroperator/monitoring is not available (reconciling Prometheus Operator Admission Webhook Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-operator-admission-webhook: context deadline exceeded) because reconciling Prometheus Operator Admission Webhook Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-operator-admission-webhook: context deadline exceeded
clusteroperator/storage is not available (SHARESCSIDriverOperatorCRAvailable: SharedResourcesDriverNodeServiceControllerAvailable: Waiting for the DaemonSet to deploy the CSI Node Service) because GCPPDCSIDriverOperatorCRDegraded: All is well SHARESCSIDriverOperatorCRDegraded: All is well
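For triage, the cluster state summary can be grouped by operator state to separate the expected "not upgradeable" entries (a consequence of TechPreviewNoUpgrade) from the operators that are actually degraded or unavailable. The following is only a rough sketch, assuming the summary is available as text with one clusteroperator entry per line; the function name group_by_state and the SAMPLE string (a few entries copied from this report) are illustrative and not part of any OpenShift tooling.

```python
import re
from collections import defaultdict

# Matches summary entries of the form
# "clusteroperator/<name> is <state> ...", where <state> is one of the
# three states that appear in this report.
ENTRY_RE = re.compile(
    r"^clusteroperator/(?P<name>[\w-]+) is "
    r"(?P<state>not upgradeable|not available|degraded)\b"
)


def group_by_state(summary: str) -> dict[str, list[str]]:
    """Group operator names by reported state (hypothetical helper)."""
    groups: dict[str, list[str]] = defaultdict(list)
    for line in summary.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            groups[match.group("state")].append(match.group("name"))
    return dict(groups)


# A few entries copied from the summary in this report.
SAMPLE = """\
clusteroperator/config-operator is not upgradeable because FeatureGatesUpgradeable: "TechPreviewNoUpgrade" does not allow updates
clusteroperator/image-registry is degraded because Degraded: Registry deployment has timed out progressing: ReplicaSet "image-registry-7d8667cdf7" has timed out progressing.
clusteroperator/kube-apiserver is not upgradeable because FeatureGatesUpgradeable: "TechPreviewNoUpgrade" does not allow updates
clusteroperator/storage is not available (SHARESCSIDriverOperatorCRAvailable: SharedResourcesDriverNodeServiceControllerAvailable: Waiting for the DaemonSet to deploy the CSI Node Service) because GCPPDCSIDriverOperatorCRDegraded: All is well
"""

if __name__ == "__main__":
    for state, names in group_by_state(SAMPLE).items():
        print(f"{state}: {', '.join(names)}")
```

Lines that do not start with "clusteroperator/" (ClusterID, ClusterVersion, and so on) are skipped, so the full summary text can be passed in unchanged.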
Version-Release number of selected component (if applicable):
4.14.0-0.nightly-2023-09-20-033502
How reproducible:
1 of 2 attempts failed
Steps to Reproduce:
1. Install a GCP cluster with the latest build
Actual results:
Cluster install fails
Expected results:
Cluster install should succeed every time
Additional info:
The cluster is created with feature_set: "TechPreviewNoUpgrade"
Relates to: OCPBUGS-19568 [gcp] installation with "featureSet: TechPreviewNoUpgrade" failed, possibly due to nodes getting taint - "node.kubernetes.io/network-unavailable" (Closed)