Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: None
Affects Version/s: 4.15.0
Component/s: Node Tuning Operator
Labels: None
Severity: Important
Regression: No
Release Blocker: Proposed
Blocked: False
Blocked Reason: None
Latest Status Summary:
20/11: The main issue (node not ready) should be solved by the fix for the blocking OCP issue. The profile degradation will probably have to wait for the RHEL bug linked to this issue.

Description of problem:

When deploying CNF workers with a PerformanceProfile applied, the profile fails to apply and the worker node never becomes Ready.

Version-Release number of selected component (if applicable):

current master (4.15)

How reproducible:

apiVersion: v1
items:
- apiVersion: performance.openshift.io/v2
  kind: PerformanceProfile
  metadata:
    creationTimestamp: "2023-10-25T18:48:39Z"
    finalizers:
    - foreground-deletion
    generation: 1
    name: cnf-performanceprofile
    resourceVersion: "32192"
    uid: 93d463ac-6412-4a9a-ac81-5a4dba4c9730
  spec:
    additionalKernelArgs:
    - nmi_watchdog=0
    - audit=0
    - mce=off
    - processor.max_cstate=1
    - idle=poll
    - intel_idle.max_cstate=0
    - amd_iommu=on
    cpu:
      isolated: 2-7
      reserved: 0-1
    globallyDisableIrqLoadBalancing: true
    hugepages:
      defaultHugepagesSize: 1G
      pages:
      - count: 4
        node: 0
        size: 1G
    nodeSelector:
      node-role.kubernetes.io/worker: ""
    realTimeKernel:
      enabled: false

Steps to Reproduce:

1. Deploy a cluster with 3 masters and zero workers.
2. Create a PerformanceProfile as above.
3. Scale the workers to 1.

Actual results:

The worker node never becomes Ready. The cluster-node-tuning-operator fails to apply the PerformanceProfile with this error:

I1025 18:56:02.899138       1 controller.go:820] created MachineConfig 50-nto-worker with kernel parameters: [skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 nohz=on rcu_nocbs=2-7 tuned.non_isolcpus=00000003 systemd.cpu_affinity=0,1 intel_iommu=on iommu=pt isolcpus=managed_irq,2-7 nohz_full=2-7 tsc=reliable nosoftlockup nmi_watchdog=0 mce=off skew_tick=1 rcutree.kthread_prio=11 default_hugepagesz=1G nmi_watchdog=0 audit=0 mce=off processor.max_cstate=1 idle=poll intel_idle.max_cstate=0 amd_iommu=on intel_pstate=disable]
I1025 18:56:02.900768       1 status.go:306] 1/4 Profiles failed to be applied
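
For readers cross-checking the generated kernel parameters against the profile, the following is a minimal sketch of how a CPU list such as `reserved: 0-1` maps to a hexadecimal mask and a comma-separated list. The values `tuned.non_isolcpus=00000003` and `systemd.cpu_affinity=0,1` in the log above appear consistent with that mapping. The helper names `parse_cpu_list` and `cpu_mask` are hypothetical and are not taken from the Node Tuning Operator source; the 8-digit width is an assumption based on the logged value.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as "0-1" or "0,2-3" into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cpu_mask(cpus, width=8):
    """Render a set of CPU ids as a zero-padded hex bitmask (bit n = CPU n)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "0{}x".format(width))


# Values copied from the PerformanceProfile in this report.
reserved = parse_cpu_list("0-1")
isolated = parse_cpu_list("2-7")

print(cpu_mask(reserved))                       # mask of the reserved CPUs
print(",".join(str(c) for c in sorted(reserved)))  # list form of the same set
print(reserved.isdisjoint(isolated))            # the two sets should not overlap
```

This only illustrates how the logged values relate to the profile; it says nothing about why the profile failed to apply.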



crio fails with this error:

level=error msg="Container creation error: time=\"2023-10-25T19:10:06Z\" level=error msg=\"runc create failed: unable to start container process: unable to apply cgroup configuration: failed to write \\\"0-7\\\": write /sys/fs/cgroup/cpuset/system.slice/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podef062c82f679a1ef42be9dd35e115e2f.slice/crio-c8300b86016fe34637684039adf9eef3d4f04a411c5d13ba4c464c91589551ee.scope/cpuset.cpus: permission denied\"\n" id=44a49756-5a89-4c16-acea-56961a74f1ab name=/runtime.v1.RuntimeService/CreateContainer
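
When triaging many such messages in the attached crio log, it may help to pull out the value being written, the cgroup file, and the reason. The sketch below assumes the message shape quoted above. The pattern, the helper names, and the shortened sample path are illustrative assumptions, not part of CRI-O or runc.

```python
import re

# Hypothetical pattern for the "failed to write ... : write <path>: <reason>"
# fragment; the escaped quotes around the value are skipped with [\\"]*.
PATTERN = re.compile(
    r'failed to write [\\"]*(?P<value>[0-9,-]+)[\\"]*: '
    r'write (?P<path>/\S+): (?P<reason>[a-z ]+)'
)


def parse_cgroup_write_error(message):
    """Return value, path and reason from a cgroup write error, or None."""
    match = PATTERN.search(message)
    return match.groupdict() if match else None


def expand(spec):
    """Expand a CPU list such as "0-7" into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        low, _, high = part.partition("-")
        cpus.update(range(int(low), int(high or low) + 1))
    return cpus


# Shortened, hypothetical path; the real one is in the log line above.
sample = r'failed to write \\\"0-7\\\": write /sys/fs/cgroup/cpuset/example.scope/cpuset.cpus: permission denied\"\n"'

info = parse_cgroup_write_error(sample)
print(info["value"], info["path"], info["reason"])

# The written value covers both CPU sets from the PerformanceProfile.
print(expand(info["value"]) == expand("0-1") | expand("2-7"))
```

The comparison only shows that `0-7` spans both the reserved (`0-1`) and isolated (`2-7`) sets from the profile; it does not by itself explain the permission error.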

Expected results:

The PerformanceProfile should be applied, as it was before this started failing, and the worker node should become Ready.

Additional info:

Example of CI job: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_sriov-network-operator/844/pull-ci-openshift-sriov-network-operator-master-e2e-openstack-nfv/1717172583696175104

Attachments:
- crio (909 kB, 2023/10/25 7:27 PM)
- kubelet (2.26 MB, 2023/10/25 7:27 PM)
- openshift-cluster-node-tuning-operator-cluster-node-tuning-operator-6dc4b49c87-hmnr5-1698261378018441869.log (12 kB, 2023/10/25 7:27 PM)

is blocked by: OCPBUGS-20492 crun not respecting cpu-quota:disable (or cpu-load-balancing:disable) annotations correctly (Closed)

is related to: RHEL-11342 tuned.utils.commands: Writing to file '/sys/block/dm-2/queue/read_ahead_kb' error: '[Errno 2] No such file or directory: '/sys/block/dm-2/queue/read_ahead_kb'' (Closed)

Assignee: Yanir Quinn
Reporter: Emilien Macchi
QA Contact: Mallapadi Niranjan
Votes: 0
Watchers: 10
Created: 2023/10/25 7:24 PM
Updated: 2023/12/18 2:16 PM
Resolved: 2023/12/05 11:23 AM
