OpenShift Bugs / OCPBUGS-27335

Installation fails with 1 master and 2 workers as the console deployment set the number of replicas based on the InfrastructureTopology rather than the ControlPlaneTopology

    • Type: Bug
    • Resolution: Done-Errata
    • Priority: Undefined
    • Fix Version: 4.16.0
    • Affects Version: 4.12.z
    • Component: Management Console
    • Severity: Moderate
    • Release Note Text: N/A
    • Release Note Type: Release Note Not Required

      Description of problem:

      The node selector for the console deployment requires it to be scheduled on the master nodes, while its replica count is derived from the infrastructureTopology, which primarily tracks the worker setup.

      When an OpenShift cluster is installed with a single master node and multiple workers, the console deployment therefore requests 2 replicas, because infrastructureTopology is set to HighlyAvailable even though controlPlaneTopology is set to SingleReplica as expected.
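
      A minimal sketch of the intended selection logic (illustrative only; the package layout and helper name are hypothetical, not the actual console-operator code):

        package main

        import (
            configv1 "github.com/openshift/api/config/v1"
        )

        // desiredReplicas follows controlPlaneTopology, because the console
        // deployment's node selector pins its pods to the control-plane nodes.
        func desiredReplicas(infra *configv1.Infrastructure) int32 {
            if infra.Status.ControlPlaneTopology == configv1.SingleReplicaTopologyMode {
                return 1
            }
            return 2
        }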
      
      

      Version-Release number of selected component (if applicable):

      4.16

      How reproducible:

      Always    

      Steps to Reproduce:

          1. Install an OpenShift cluster with 1 master and 2 workers (see the illustrative install-config fragment below)
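
      An illustrative install-config.yaml fragment for this topology (baseDomain, cluster name, and pull secret are placeholders, not values from this report):

        apiVersion: v1
        baseDomain: example.com
        metadata:
          name: test-cluster
        controlPlane:
          name: master
          replicas: 1    # single master -> controlPlaneTopology: SingleReplica
        compute:
        - name: worker
          replicas: 2    # two workers -> infrastructureTopology: HighlyAvailable
        platform:
          aws:
            region: us-east-2
        pullSecret: '...'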
      

      Actual results:

      The installation fails because the replica count for the console deployment is set to 2:
      
        apiVersion: config.openshift.io/v1
        kind: Infrastructure
        metadata:
          creationTimestamp: "2024-01-18T08:34:47Z"
          generation: 1
          name: cluster
          resourceVersion: "517"
          uid: d89e60b4-2d9c-4867-a2f8-6e80207dc6b8
        spec:
          cloudConfig:
            key: config
            name: cloud-provider-config
          platformSpec:
            aws: {}
            type: AWS
        status:
          apiServerInternalURI: https://api-int.adstefa-a12.qe.devcluster.openshift.com:6443
          apiServerURL: https://api.adstefa-a12.qe.devcluster.openshift.com:6443
          controlPlaneTopology: SingleReplica
          cpuPartitioning: None
          etcdDiscoveryDomain: ""
          infrastructureName: adstefa-a12-6wlvm
          infrastructureTopology: HighlyAvailable
          platform: AWS
          platformStatus:
            aws:
              region: us-east-2
            type: AWS
      
      
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        annotations:
         .... 
        creationTimestamp: "2024-01-18T08:54:23Z"
        generation: 3
        labels:
          app: console
          component: ui
        name: console
        namespace: openshift-console
      spec:
        progressDeadlineSeconds: 600
        replicas: 2
      
      
      

      Expected results:

      The replica count is set to 1, tracking the controlPlaneTopology value instead of the infrastructureTopology.
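
      One way to verify the expected behavior (illustrative commands, in the same style as those used in the comments below):

        oc get infrastructure cluster -ojsonpath='{.status.controlPlaneTopology}'
        # SingleReplica
        oc get deployment console -n openshift-console -ojsonpath='{.spec.replicas}'
        # 1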

      Additional info:

          

      Comments:

            Errata Tool added a comment -

            Since the problem described in this issue should be resolved in a recent advisory, it has been closed.

            For information on the advisory (Critical: OpenShift Container Platform 4.16.0 bug fix and security update), and where to find the updated files, follow the link below.

            If the solution does not work for you, open a new bug report.
            https://access.redhat.com/errata/RHSA-2024:0041

            Jakub Hadvig added a comment -

            Setting the 'Affects Version' to 4.12.z since this issue affects previous versions as well.

            https://issues.redhat.com/browse/OCPBUGS-31502

            Yanping Zhang added a comment -

            Tested with payload 4.16.0-0.ci-2024-01-31-151542.
            1. Launch a normal cluster with 1 master node and 2 worker nodes. Check the console operator, deployment, and pods: the replica count is set to 1 and the console pod count is 1.
            2. Launch a HyperShift cluster and configure the hosted cluster with one worker node. Check the console operator, deployment, and pods: the replica count is set to 1 and the console pod count is 1.
            The bug is fixed.

            Alessandro Di Stefano added a comment - edited

            Hi yanpzhan1, can you send the result of the following in the case of HyperShift (in the guest/hosted cluster)?

            oc get infrastructure -o yaml
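
            For reference, a HyperShift hosted cluster typically reports its control plane as external; an illustrative status fragment (not captured from this cluster) would look like:

              status:
                controlPlaneTopology: External
                infrastructureTopology: SingleReplica   # with a single worker node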

            Yanping Zhang added a comment -

            Using payload 4.16.0-0.nightly-2024-01-21-154905 to launch a HyperShift cluster, with the hosted cluster configured with only one worker node. Checking on the hosted cluster, the console operator is abnormal, and the console deployment has replicas "2", but only one pod is in Running status.

            # oc get node
            NAME                          STATUS   ROLES    AGE   VERSION
            ip-10-0-142-37.ec2.internal   Ready    worker   61m   v1.29.0+f629574
            # oc get co console
            NAME      VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
            console   4.16.0-0.nightly-2024-01-21-154905   True        True          False      60m     SyncLoopRefreshProgressing: Working toward version 4.16.0-0.nightly-2024-01-21-154905, 1 replicas available
            # oc get node
            NAME                          STATUS   ROLES    AGE   VERSION
            ip-10-0-142-37.ec2.internal   Ready    worker   63m   v1.29.0+f629574
            # oc -n openshift-console get pod
            NAME                         READY   STATUS    RESTARTS   AGE
            console-544dc8c7d-klhc9      0/1     Pending   0          60m
            console-54bdd888cf-4wsc9     1/1     Running   0          61m
            console-5cdb687998-qg8p9     0/1     Pending   0          61m
            downloads-5bcd554dbf-nxvmx   1/1     Running   0          61m
            downloads-5bcd554dbf-wv4c4   0/1     Pending   0          61m
            # oc get deployment console -n openshift-console -ojsonpath='{.spec.replicas}'
            2
            # oc get deployment downloads -n openshift-console -ojsonpath='{.spec.replicas}'
            2
            # oc -n openshift-console get pod
            NAME                         READY   STATUS    RESTARTS   AGE
            console-544dc8c7d-klhc9      0/1     Pending   0          74m
            console-54bdd888cf-4wsc9     1/1     Running   0          75m
            console-5cdb687998-qg8p9     0/1     Pending   0          75m
            downloads-5bcd554dbf-nxvmx   1/1     Running   0          75m
            downloads-5bcd554dbf-wv4c4   0/1     Pending   0          75m
            [root@MiWiFi-RB03-srv ~]# oc get deploy -n openshift-console
            NAME        READY   UP-TO-DATE   AVAILABLE   AGE
            console     1/2     1            1           76m
            downloads   1/2     2            1           76m
            

            The fix in the PR is not suitable for hosted clusters, since console pods are deployed on worker nodes in this kind of cluster.
            rhn-support-adistefa could you help to consider the fix for the case when the cluster is a hosted cluster?
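
            A sketch of the refinement being asked for here (illustrative only, not the merged fix; the helper name is hypothetical and it reuses the configv1 import from the sketch in the description):

              // On HyperShift hosted clusters controlPlaneTopology is External and
              // console pods run on worker nodes, so fall back to infrastructureTopology.
              func desiredReplicasHosted(infra *configv1.Infrastructure) int32 {
                  topology := infra.Status.ControlPlaneTopology
                  if topology == configv1.ExternalTopologyMode {
                      topology = infra.Status.InfrastructureTopology
                  }
                  if topology == configv1.SingleReplicaTopologyMode {
                      return 1
                  }
                  return 2
              }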


            Yanping Zhang added a comment - edited

            Using payload 4.16.0-0.nightly-2024-01-21-154905 to launch a normal cluster with 1 master node and 2 worker nodes, the cluster is launched successfully. Checking the console operator, deployment, and pods: the replica count is set to 1 and the pod count is 1. They work as expected:
            $ oc get co console
            NAME      VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
            console   4.16.0-0.nightly-2024-01-21-154905   True        False         False      27m
            $ oc get deployment console -n openshift-console -ojsonpath='{.spec.replicas}'
            1
            $ oc get pod -n openshift-console
            NAME                        READY   STATUS    RESTARTS   AGE
            console-5ffbb8644-lpnt5     1/1     Running   0          25m
            downloads-7597c9f7c-62sxs   1/1     Running   0          33m

            Alessandro Di Stefano added a comment -

            Hi jhadvig@redhat.com, are you ok with backporting the fix for a few releases?

            OpenShift Jira Bot added a comment -

            Looks like this bug is far enough along in the workflow that a code fix is ready. Customers and support need to know the backport plan. Please complete the "Target Backport Versions" field to indicate which version(s) will receive the fix.

              Assignee: Alessandro Di Stefano (rhn-support-adistefa)
              Reporter: Alessandro Di Stefano (rhn-support-adistefa)
              QA Contact: Yanping Zhang
              Votes: 0
              Watchers: 8