Type: Bug
Resolution: Cannot Reproduce
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.14
Component/s: Installer / openshift-installer
Labels:
None

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
None
Regression:
No

Target Backport Versions:
None
Target Version:

4.14.0
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

Installation fails because some cluster operators do not become stable.

Reprinting Cluster State:
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
ClusterID: 807269fa-0d64-48f9-817b-00d3799f67eb
ClusterVersion: Installing "4.14.0-0.nightly-2023-09-20-033502" for 3 hours: Unable to apply 4.14.0-0.nightly-2023-09-20-033502: some cluster operators are not available
ClusterOperators:
    clusteroperator/config-operator is not upgradeable because FeatureGatesUpgradeable: "TechPreviewNoUpgrade" does not allow updates
    clusteroperator/image-registry is degraded because Degraded: Registry deployment has timed out progressing: ReplicaSet "image-registry-7d8667cdf7" has timed out progressing.
    clusteroperator/ingress is degraded because The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: DeploymentReplicasAllAvailable=False (DeploymentReplicasNotAvailable: 1/2 of replicas are available)
    clusteroperator/kube-apiserver is not upgradeable because FeatureGatesUpgradeable: "TechPreviewNoUpgrade" does not allow updates
    clusteroperator/kube-controller-manager is degraded because GarbageCollectorDegraded: error fetching rules: Get "https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules": dial tcp: lookup thanos-querier.openshift-monitoring.svc on 172.30.0.10:53: no such host
    clusteroperator/machine-config is not upgradeable because One or more machine config pools are updating, please see `oc get mcp` for further details
    clusteroperator/monitoring is not available (reconciling Prometheus Operator Admission Webhook Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-operator-admission-webhook: context deadline exceeded) because reconciling Prometheus Operator Admission Webhook Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-operator-admission-webhook: context deadline exceeded
    clusteroperator/storage is not available (SHARESCSIDriverOperatorCRAvailable: SharedResourcesDriverNodeServiceControllerAvailable: Waiting for the DaemonSet to deploy the CSI Node Service) because GCPPDCSIDriverOperatorCRDegraded: All is well
SHARESCSIDriverOperatorCRDegraded: All is well
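
For triage, it can help to group the operators in the summary above by the state the installer reported, since the "not upgradeable" entries cite the TechPreviewNoUpgrade feature set or updating machine config pools rather than a failure, while the "degraded" and "not available" entries are the ones blocking the install. The following is a minimal Python sketch, assuming the summary text has been saved to a string. The names `LINE_RE`, `group_operators`, and `SAMPLE` are hypothetical and not part of the installer; the sample lines are copied from this report.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches summary lines such as:
#   clusteroperator/ingress is degraded because <reason>
#   clusteroperator/monitoring is not available (<detail>) because <reason>
# The line format is an assumption inferred from the output in this report.
LINE_RE = re.compile(
    r"^\s*clusteroperator/(?P<name>\S+) is "
    r"(?P<state>degraded|not available|not upgradeable)"
    r"(?: \(.*\))? because (?P<reason>.*)$"
)

# A few lines copied verbatim from the cluster state in this issue.
SAMPLE = '''ClusterOperators:
    clusteroperator/config-operator is not upgradeable because FeatureGatesUpgradeable: "TechPreviewNoUpgrade" does not allow updates
    clusteroperator/image-registry is degraded because Degraded: Registry deployment has timed out progressing: ReplicaSet "image-registry-7d8667cdf7" has timed out progressing.
    clusteroperator/machine-config is not upgradeable because One or more machine config pools are updating, please see `oc get mcp` for further details
    clusteroperator/storage is not available (SHARESCSIDriverOperatorCRAvailable: SharedResourcesDriverNodeServiceControllerAvailable: Waiting for the DaemonSet to deploy the CSI Node Service) because GCPPDCSIDriverOperatorCRDegraded: All is well
SHARESCSIDriverOperatorCRDegraded: All is well
'''


def group_operators(summary: str) -> Dict[str, List[str]]:
    """Group operator names by reported state; non-matching lines are skipped."""
    groups = defaultdict(list)
    for line in summary.splitlines():
        match = LINE_RE.match(line)
        if match:
            groups[match.group("state")].append(match.group("name"))
    return dict(groups)


if __name__ == "__main__":
    for state, names in group_operators(SAMPLE).items():
        print(f"{state}: {', '.join(names)}")
```

Continuation lines, such as the second line of the storage message, do not start with `clusteroperator/` and are ignored by this sketch.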

Version-Release number of selected component (if applicable):

4.14

How reproducible:

1 of 2 attempts failed

Steps to Reproduce:

1. Install a GCP cluster with the latest build

Actual results:

Cluster install fails

Expected results:

Cluster install should succeed every time

Additional info:

The cluster is created with feature_set: "TechPreviewNoUpgrade"

relates to

OCPBUGS-19568 [gcp] installation with "featureSet: TechPreviewNoUpgrade" failed, possibly due to nodes getting taint - "node.kubernetes.io/network-unavailable"

Closed

Assignee: Brent Barbachem

Reporter: Arti Sood

Need Info From: None

Contributors: None

QA Contact: Jianli Wei

Doc Contact: None

Votes: 0

Watchers: 4

Created: 2023/09/21 3:19 PM

Updated: 2025/07/25 11:44 AM

Resolved: 2023/10/03 12:30 PM
