-
Bug
-
Resolution: Unresolved
-
Critical
-
None
-
4.18.z, 4.19.z, 4.20
-
None
-
Quality / Stability / Reliability
-
False
-
-
3
-
Important
-
None
-
None
-
Rejected
-
None
-
In Progress
-
Release Note Not Required
-
None
-
None
-
None
-
None
-
None
Description of problem:
OCP 4.18+ | Node Tuning Operator is marked as degraded during IPI wait-for-install process
Version-Release number of selected component (if applicable):
Seen in recent OCP 4.18, 4.19, and 4.20 builds. Most occurrences are in nightly builds, but we have also seen it in an RC (4.19.0 rc.2) and in a GA release (4.18.14).
How reproducible:
Intermittent: the issue does not appear in every deployment.
Steps to Reproduce:
1. Deploy OCP using the IPI installer on bare-metal nodes (3 master and 4 worker nodes).
2. Wait for the bootstrap check to pass, using:
   /usr/local/bin/openshift-install --dir /home/kni/clusterconfigs --log-level debug wait-for bootstrap-complete
3. Run the install check, using:
   /usr/local/bin/openshift-install --dir /home/kni/clusterconfigs --log-level debug wait-for install-complete
   This check does not pass, and the cluster reports issues related to Node Tuning Operator degradation.
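The two checks in the steps above can be scripted so that a failing run is easy to spot. The following is a minimal sketch, not part of the original report: the helper names `wait_for_cmd` and `run_checks` are hypothetical, and the installer path and cluster directory are simply the ones quoted in the steps.

```python
import subprocess

# Paths taken from the reproduction steps; adjust for other environments.
INSTALLER = "/usr/local/bin/openshift-install"
CLUSTER_DIR = "/home/kni/clusterconfigs"
TARGETS = ("bootstrap-complete", "install-complete")


def wait_for_cmd(target, installer=INSTALLER, cluster_dir=CLUSTER_DIR):
    """Build the argument list for one 'wait-for' check (hypothetical helper)."""
    if target not in TARGETS:
        raise ValueError(f"unsupported wait-for target: {target}")
    return [installer, "--dir", cluster_dir,
            "--log-level", "debug", "wait-for", target]


def run_checks():
    """Run both checks in order; return the first target that fails, or None."""
    for target in TARGETS:
        result = subprocess.run(wait_for_cmd(target))
        if result.returncode != 0:
            return target
    return None
```

In the failing deployments described here, `run_checks()` would be expected to return `"install-complete"`, since the bootstrap check passes and only the install check fails.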
Actual results:
We see different NTO issues depending on the case. All of them involve the default profile that is installed when the operator is created (no custom Tuned profile has been applied). We have captured these two cases:
1) NTO stays in Progressing status indefinitely, waiting for 1/7 profiles to be applied. This log message is printed in the openshift-install output:
   level=debug msg=Cluster Operator node-tuning is Progressing=True LastTransitionTime=2025-05-15 00:02:53 -0500 CDT DurationSinceTransition=2334s Reason=ProfileProgressing Message=Waiting for 1/7 Profiles to be applied
2) In other cases, NTO directly reports Degraded status, saying that some profiles have a bootcmdline conflict (which is reminiscent of OCPBUGS-47729):
   level=error msg=Cluster operator node-tuning Degraded is True with ProfileConflict: 2/7 Profiles with bootcmdline conflict
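The two log messages quoted above have slightly different shapes (a debug line with `Reason=`/`Message=` fields and an error line of the form "... is True with Reason: message"). When triaging many job logs, it can help to extract the condition, reason, and profile counts automatically. The sketch below is illustrative only: the regular expressions are derived solely from the two lines quoted in this report, not from any documented openshift-install log format, and `parse_operator_line` is a hypothetical helper name.

```python
import re

# Matches "1/7 Profiles" or "2/7 Profiles" inside the condition message.
PROFILE_COUNT = re.compile(r"(\d+)/(\d+) Profiles")

# Debug form: "Cluster Operator <name> is <Cond>=<Status> ... Reason=<r> Message=<m>"
DEBUG_LINE = re.compile(
    r"Cluster Operator (?P<operator>\S+) is (?P<condition>\w+)=(?P<status>\w+)"
    r".*?Reason=(?P<reason>\S+) Message=(?P<message>.*)$"
)

# Error form: "Cluster operator <name> <Cond> is <Status> with <reason>: <m>"
ERROR_LINE = re.compile(
    r"Cluster operator (?P<operator>\S+) (?P<condition>\w+) is (?P<status>\w+)"
    r" with (?P<reason>\w+): (?P<message>.*)$"
)


def parse_operator_line(line):
    """Return a dict describing the operator condition, or None if no match."""
    for pattern in (DEBUG_LINE, ERROR_LINE):
        match = pattern.search(line)
        if match:
            info = match.groupdict()
            count = PROFILE_COUNT.search(info["message"])
            # 'affected' and 'total' are the two numbers in "N/M Profiles".
            info["affected"] = int(count.group(1)) if count else None
            info["total"] = int(count.group(2)) if count else None
            return info
    return None
```

Applied to the two messages in this report, the helper yields `Progressing`/`ProfileProgressing` with 1 of 7 profiles for the first case and `Degraded`/`ProfileConflict` with 2 of 7 profiles for the second.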
Expected results:
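NTO should install correctly, and the wait-for install-complete check should pass without the node-tuning cluster operator staying in Progressing status or reporting Degraded.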
Additional info:
We launched all deployments with Distributed-CI (DCI). These are the jobs where the issue has been observed, for each of the cases reported above:
1) NTO in progressing status - appeared in 4.18-19
- OpenShift 4.18 nightly 2025-05-14 07:59 - https://www.distributed-ci.io/jobs/8b584162-a714-48fa-b548-b6778c85373a/jobStates?sort=date&task=723dfec0-b79a-4152-9920-83ba336ec4ed
- OpenShift 4.18 nightly 2025-05-15 13:07 - https://www.distributed-ci.io/jobs/ed91b74d-f60a-4289-a40e-57eedac7a5f1/jobStates?sort=date&task=6c6d40b0-260c-4cbc-9f36-42ec034c1517
- OpenShift 4.19.0 rc.2 - https://www.distributed-ci.io/jobs/83f3586f-4439-4f46-80c1-8bb05ae18537/jobStates?sort=date&task=54b1d013-17cf-449f-ac43-57757a18c579
- OpenShift 4.18.14 - https://www.distributed-ci.io/jobs/bd4b6e8f-b694-4056-ba66-16b817a359e5/jobStates?sort=date&task=ca45c507-7e5b-45c9-b4fc-82f70962a9fa
2) NTO in degraded status with bootcmdline conflict issue - appeared in 4.19-20
- [must-gather available] OpenShift 4.20 nightly 2025-05-15 18:06 - https://www.distributed-ci.io/jobs/698236c0-6622-453a-b399-df446478daff/jobStates?sort=date&task=8a5c4a85-6021-4c1f-a87f-3ceeb369e80e
- OpenShift 4.19 nightly 2025-05-18 11:00 - https://www.distributed-ci.io/jobs/16049a57-2544-4e2d-b4d6-5dcd65f2d608/jobStates?sort=date&task=813853e1-129c-4c7b-85be-4c28087d75e1
- [must-gather available] OpenShift 4.20 nightly 2025-05-18 20:09 - https://www.distributed-ci.io/jobs/0a475951-5a7f-40d6-a78c-f3fc44fb9462/jobStates?sort=date&task=8b4c070b-f7eb-4a0f-a555-660cbd11a370
is caused by:
- RHEL-88238 Please add option which will wait for udev settle during startup (Release Pending)
is related to:
- OCPBUGS-47729 OCP 4.17+ | Node Tuning Operator got degraded when creating a PerformanceProfile with "Profiles with bootcmdline conflict" error message (Verified)
- OCPBUGS-56767 Node Tuning operator fails to start on one or two nodes (Closed)
- OCPBUGS-56528 tuned pods fail : OSError: Error while polling fd: 4 (Closed)
relates to:
- OCPBUGS-57470 OCP 4.18.17+ Tuned profile is marked as degraded on reboot (New)
- OCPBUGS-41934 tuned profile got degraded after node reboot (POST)