Type: Bug
Resolution: Done-Errata
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.14.z
Component/s: Performance Addon Operator
Labels:
- perfscale-telco-5g

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
None
Regression:
No

Target Backport Versions:

4.14.z
Target Version:

4.14.z
Release Blocker:
None
Sprint:
None

RH Private Keywords:

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Test Coverage:

-

PX Priority Data:
PX Impact Score:

Release Note Status:
Done
Release Note Type:
Bug Fix
Release Note Text:
Previously, after you performed an EUS-to-EUS update on your {product-title} cluster that involved pausing and unpausing the machine config pool, two reboot operations occurred after the unpause operation. The additional reboot was not expected and was caused by the performance profile controller reconciling against an older `MachineConfig` that is listed in the `MachineConfigPool`. With this release, the performance profile controller reconciles against the latest `MachineConfig` that is listed in the `MachineConfigPool`, so the extra reboot does not occur. (link:https://issues.redhat.com/browse/OCPBUGS-32980[*OCPBUGS-32980*])

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

When a PerformanceProfile is applied before a minor version upgrade, and the worker MCP is paused and then resumed once the cluster is at the target version, the worker nodes go through two reboots and multiple rendered worker MachineConfigs are created. With a default upgrade (no PerformanceProfile), only the expected single reboot is observed.

Version-Release number of selected component (if applicable):

How reproducible:

    100%

Steps to Reproduce:

    1. Create a PerformanceProfile on the pre-upgrade 4.14 release.
    2. Pause the worker MCP.
    3. Upgrade to the target version.
    4. Resume the worker MCP.
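
The steps above can be sketched as a shell script. This is a minimal sketch and is not taken from the report: the file name performance-profile.yaml, the TARGET_VERSION and PROFILE_FILE variables, and the OC override are assumptions for illustration. A real EUS-to-EUS update may need extra steps, such as changing the update channel and waiting for the control plane to finish before resuming the pool.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction steps (hypothetical helper names and variables).
# OC defaults to the real `oc` client; set OC=echo for a dry run that only
# prints the commands instead of running them.
set -euo pipefail

OC="${OC:-oc}"
PROFILE_FILE="${PROFILE_FILE:-performance-profile.yaml}"  # assumed file name

# Pause ("true") or resume ("false") the worker MachineConfigPool.
set_worker_mcp_paused() {
  "$OC" patch mcp/worker --type merge --patch "{\"spec\":{\"paused\":$1}}"
}

reproduce() {
  # The target version is not stated in the report, so it must be supplied.
  : "${TARGET_VERSION:?set TARGET_VERSION to the upgrade target}"

  "$OC" apply -f "$PROFILE_FILE"             # 1. create the PerformanceProfile
  set_worker_mcp_paused true                 # 2. pause the worker MCP
  "$OC" adm upgrade --to="$TARGET_VERSION"   # 3. start the upgrade
  # In practice, wait here until the upgrade has completed.
  set_worker_mcp_paused false                # 4. resume the worker MCP
}
```

Calling `OC=echo TARGET_VERSION=<target-version> reproduce` prints the four commands without touching a cluster. After resuming the pool on a real cluster, `oc get mcp worker` and `oc get mc` can be used to watch the pool roll out and to see how many rendered worker MachineConfigs were created.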

Actual results:

    Worker nodes go through two reboots.

Expected results:

    One reboot

Additional info:

apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: perf-profile-2m-worker
spec:
  cpu:
    reserved: 0-3
    isolated: 4-63
  workloadHints:
    realTime: false
  hugepages:
    defaultHugepagesSize: "2M"
    pages:
    - size: "2M"
      count: 24000
      node: 0
    - size: "2M"
      count: 24000
      node: 1
  realTimeKernel:
    enabled: false
  numa:
    topologyPolicy: "best-effort"
  net:
    userLevelNetworking: false
  nodeSelector:
    node-role.kubernetes.io/worker: ""

depends on

OCPBUGS-32978 [release-4.15]Extra reboot with performance profile on 4.14 when mcp worker resumes with upgrade

Closed

is duplicated by

OCPBUGS-32966 [BM] OCP 4.14 Rendered creation delay

Closed

links to

openshift/cluster-node-tuning-operator#1053: OCPBUGS-32980: [release-4.14]:fix extra-reboot on upgrade with paused mcp worker

RHBA-2024:2789 OpenShift Container Platform 4.14.z bug fix update

Assignee: Vitaly Grinberg

Reporter: Dave Wilson

QA Contact: Niranjan Mallapadi Raghavendra Rao

Need Info From: None

Votes: 0

Watchers: 8

Created: 2024/04/25 2:12 PM

Updated: 2025/09/13 2:23 PM

Resolved: 2024/05/16 9:51 AM
