Type: Bug
Resolution: Won't Do
Priority: Minor
Fix Version/s: None
Affects Version/s: 4.12
Component/s: Installer / openshift-installer
Labels:
None

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

Multiple cluster operators are reported as degraded during installation on bare metal, specifically when m6g.metal instances are used for the worker nodes.

Version-Release number of selected component (if applicable):

4.12.0-0.nightly-arm64-2022-11-06-054834

How reproducible:

Create a cluster using m6gd.metal instances for the master nodes and m6g.metal instances for the worker nodes. The installation fails, and the installer reports the errors listed under "Additional info".

Steps to Reproduce:

1. 
2.
3.

Actual results:

Installation fails

Expected results:

Installation succeeds

Additional info:

11-07 17:27:37.103  level=error msg=Cluster operator ingress Degraded is True with IngressDegraded: The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: PodsScheduled=False (PodsNotScheduled: Some pods are not scheduled: Pod "router-default-774d47b77f-6pvqn" cannot be scheduled: 0/4 nodes are available: 1 node(s) didn't match pod anti-affinity rules, 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/4 nodes are available: 1 node(s) didn't match pod anti-affinity rules, 3 Preemption is not helpful for scheduling. Make sure you have sufficient worker nodes.)
...
11-07 17:27:37.103  level=info msg=Cluster operator insights UploadDegraded is True with NotAuthorized: Reporting was not allowed: your Red Hat account is not enabled for remote support or your token has expired: UHC services authentication failed
11-07 17:27:37.103  level=info
11-07 17:27:37.103  level=error msg=Cluster operator kube-controller-manager Degraded is True with GarbageCollector_Error: GarbageCollectorDegraded: error fetching rules: Get "https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules": dial tcp: lookup thanos-querier.openshift-monitoring.svc on 172.30.0.10:53: no such host
11-07 17:27:37.104  level=info msg=Cluster operator machine-api Progressing is True with SyncingResources: Progressing towards operator: 4.12.0-0.nightly-arm64-2022-11-06-054834
11-07 17:27:37.104  level=error msg=Cluster operator machine-api Degraded is True with SyncingFailed: Failed when progressing towards operator: 4.12.0-0.nightly-arm64-2022-11-06-054834 because minimum worker replica count (2) not yet met: current running replicas 1, waiting for [sv-m6g-bm-trial2-ccz68-worker-us-east-2b-jn89g sv-m6g-bm-trial2-ccz68-worker-us-east-2c-c5vsb]
11-07 17:27:37.104  level=error msg=Cluster operator machine-api Available is False with Initializing: Operator is initializing
11-07 17:27:37.104  level=error msg=Cluster operator monitoring Available is False with UpdatingPrometheusOperatorFailed: reconciling Prometheus Operator Admission Webhook Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-operator-admission-webhook: got 1 unavailable replicas
11-07 17:27:37.104  level=error msg=Cluster operator monitoring Degraded is True with UpdatingPrometheusOperatorFailed: reconciling Prometheus Operator Admission Webhook Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-operator-admission-webhook: got 1 unavailable replicas
11-07 17:27:37.104  level=info msg=Cluster operator monitoring Progressing is True with RollOutInProgress: Rolling out the stack.
11-07 17:27:37.104  level=info msg=Cluster operator network ManagementStateDegraded is False with : 
11-07 17:27:37.104  level=error msg=Cluster initialization failed because one or more operators are not functioning properly.
11-07 17:27:37.104  level=error msg=The cluster should be accessible for troubleshooting as detailed in the documentation linked below,
11-07 17:27:37.104  level=error msg=https://docs.openshift.com/container-platform/latest/support/troubleshooting/troubleshooting-installations.html
11-07 17:27:37.104  level=error msg=The 'wait-for install-complete' subcommand can then be used to continue the installation
11-07 17:27:37.104  level=error msg=failed to initialize the cluster: Cluster operators machine-api, monitoring are not available
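
To make the excerpt above easier to triage, the following is a minimal sketch that groups the `level=error` "Cluster operator ..." lines by operator. It is not part of the installer; the function name `failing_operators` and the regular expression are assumptions based only on the line format visible in this report, and the sample lines are copied from the log above.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the excerpt in this report:
#   level=<level> msg=Cluster operator <name> <Condition> is <True|False> with <Reason>: <message>
LINE_RE = re.compile(
    r"level=(?P<level>\w+) msg=Cluster operator (?P<operator>\S+) "
    r"(?P<condition>\w+) is (?P<status>True|False) with (?P<reason>[^:]*):"
)


def failing_operators(log_text):
    """Return {operator: [(condition, status, reason), ...]} for level=error lines.

    Hypothetical helper for reading installer output; lines that do not match
    the assumed format (or are logged at another level) are ignored.
    """
    result = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match and match.group("level") == "error":
            result[match.group("operator")].append(
                (
                    match.group("condition"),
                    match.group("status"),
                    match.group("reason").strip(),
                )
            )
    return dict(result)


# Lines copied verbatim from the log excerpt in this report.
SAMPLE = """\
11-07 17:27:37.104  level=error msg=Cluster operator machine-api Degraded is True with SyncingFailed: Failed when progressing towards operator: 4.12.0-0.nightly-arm64-2022-11-06-054834 because minimum worker replica count (2) not yet met: current running replicas 1, waiting for [sv-m6g-bm-trial2-ccz68-worker-us-east-2b-jn89g sv-m6g-bm-trial2-ccz68-worker-us-east-2c-c5vsb]
11-07 17:27:37.104  level=error msg=Cluster operator machine-api Available is False with Initializing: Operator is initializing
11-07 17:27:37.104  level=info msg=Cluster operator monitoring Progressing is True with RollOutInProgress: Rolling out the stack.
"""

if __name__ == "__main__":
    for operator, conditions in failing_operators(SAMPLE).items():
        for condition, status, reason in conditions:
            print(f"{operator}: {condition}={status} ({reason})")
```

Run against the full excerpt, this kind of summary puts the machine-api message about the unmet worker replica count next to the ingress and monitoring scheduling failures, which may help when deciding which operator to investigate first.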

Attachments:
log-bundle-20221107182214.tar.gz (9.16 MB, 2022/11/08 4:58 PM)
openshift_install.rtf (544 kB, 2022/11/07 11:42 PM)

Assignee: Rafael Fonseca dos Santos

Reporter: Sharada Vetsa

QA Contact: Alessandro Di Stefano

Need Info From: None

Votes: 0

Watchers: 7

Created: 2022/11/07 11:39 PM

Updated: 2025/07/28 11:33 PM

Resolved: 2023/03/10 8:09 PM
