Type: Bug
Resolution: Done
Priority: Normal
Fix Version/s: 4.14
Affects Version/s: 4.14
Component/s: Node / Kubelet
Labels:
- pre-merge-verify-node
- triaged

Regression:
No
Sprint:
OCPNODE Sprint 240 (Blue)
sprint_count:
1
Blocked:
False
Blocked Reason:
None
Target Version:

4.14.0

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Description of problem:

I created a cluster with _workerLatencyProfile: LowUpdateSlowReaction_, then changed the profile to MediumUpdateAverageReaction by following the linked documentation and the test case document below. After the switch, I waited for KubeControllerManager and KubeAPIServer to finish progressing and report complete, and noticed that nodeStatusUpdateFrequency in /etc/kubernetes/kubelet.conf did not change as expected.

https://docs.google.com/document/d/19dPIE4WFxVc3ldu-hNoXiOkjBCQrHC6I7wfyaUyTDqw/edit#heading=h.kf4qxogy9r6

Version-Release number of selected component (if applicable):

4.14.0-0.nightly-2023-07-31-181848

How reproducible:

100%

Steps to Reproduce:

1. Create a cluster with the LowUpdateSlowReaction manifest. Example: https://docs.google.com/document/d/19dPIE4WFxVc3ldu-hNoXiOkjBCQrHC6I7wfyaUyTDqw/edit#heading=h.22najgyaj9lh
2. Validate the component values for the low update profile:

$ oc debug node/<worker-node-name>
sh-4.4# chroot /host
sh-4.4# cat /etc/kubernetes/kubelet.conf | grep nodeStatusUpdateFrequency
  "nodeStatusUpdateFrequency": "1m0s",
$ oc get KubeControllerManager -o yaml | grep -A 1 node-monitor
        node-monitor-grace-period:
        - 5m0s
$ oc get KubeAPIServer -o yaml | grep -A 1 default-
        default-not-ready-toleration-seconds:
        - "60"
        default-unreachable-toleration-seconds:
        - "60"
3. Run *oc edit nodes.config/cluster* and set:
spec:
  workerLatencyProfile: MediumUpdateAverageReaction
4. Wait for the components to complete, checking with:

oc get KubeControllerManager -o yaml | grep -i workerlatency -A 5 -B 5
and 
oc get KubeAPIServer -o yaml | grep -i workerlatency -A 5 -B 5

5. Validate the component values for the medium profile. The error appears at this step.

Actual results:

% oc get KubeControllerManager -o yaml | grep -A 1 node-monitor
        node-monitor-grace-period:
        - 2m0s
% oc get KubeAPIServer -o yaml | grep -A 1 default-
        default-not-ready-toleration-seconds:
        - "60"
        default-unreachable-toleration-seconds:
        - "60"
sh-5.1# cat /etc/kubernetes/kubelet.conf | grep nodeStatusUpdateFrequency 
  "nodeStatusUpdateFrequency": "1m0s",

Expected results:

$ oc debug node/<worker-node-name>
sh-4.4# chroot /host
sh-4.4# cat /etc/kubernetes/kubelet.conf | grep nodeStatusUpdateFrequency
  "nodeStatusUpdateFrequency": "20s",
$ oc get KubeControllerManager -o yaml | grep -A 1 node-monitor
        node-monitor-grace-period:
        - 2m0s
$ oc get KubeAPIServer -o yaml | grep -A 1 default-
        default-not-ready-toleration-seconds:
        - "60"
        default-unreachable-toleration-seconds:
        - "60"
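
The kubelet check above can be scripted so that a mismatch is reported explicitly. This is only a minimal sketch: the function names (kubelet_status_frequency, expected_frequency, check_profile) are hypothetical helpers, and the expected values per profile are taken from this report rather than from an authoritative source.

```shell
#!/bin/sh
# Hypothetical helpers; expected values below come from this bug report only.

# Print the nodeStatusUpdateFrequency value found in kubelet.conf content on stdin.
kubelet_status_frequency() {
    sed -n 's/.*"nodeStatusUpdateFrequency": *"\([^"]*\)".*/\1/p'
}

# Map a worker latency profile name to the kubelet value this report expects.
expected_frequency() {
    case "$1" in
        LowUpdateSlowReaction) echo "1m0s" ;;
        MediumUpdateAverageReaction) echo "20s" ;;
        *) return 1 ;;
    esac
}

# Read kubelet.conf content on stdin and compare it with the given profile.
# Returns 0 on match, 1 on mismatch, 2 for a profile not listed above.
check_profile() {
    profile=$1
    actual=$(kubelet_status_frequency)
    expected=$(expected_frequency "$profile") || {
        echo "unknown profile: $profile"
        return 2
    }
    if [ "$actual" = "$expected" ]; then
        echo "OK: $actual"
    else
        echo "MISMATCH: expected $expected, got $actual"
        return 1
    fi
}
```

One possible way to use it against a live node, assuming the file location shown in this report:

oc debug node/<worker-node-name> -- chroot /host cat /etc/kubernetes/kubelet.conf | check_profile MediumUpdateAverageReaction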

Additional info:

The documentation states that worker nodes are disabled while the change is being applied, but I never saw that occur.

links to

openshift/machine-config-operator#3846: OCPBUGS-17433: Sync featuregate controller during the node config controller sync

Assignee: Sai Ramesh Vanka

Reporter: Paige Patton

QA Contact: Sunil Choudhary

Votes: 0

Watchers: 4

Created: 2023/08/07 7:02 PM

Updated: 2023/11/08 8:31 PM

Resolved: 2023/08/17 3:55 PM
