Type: Bug
Resolution: Won't Do
Version: 4.11
Category: Quality / Stability / Reliability
Note:
1. If you are dealing with Machines or MachineSet objects, please select the "Cloud Compute" component under the same product.
2. If you are dealing with kubelet, kubeletconfigs, or container runtime configs, please select the "Node" component under the same product.
Description of problem:
After deploying OCP in a ppc64le environment, I ran:
[root@rdr-sri-7cfc-tok04-bastion-0 ~]# oc get MachineConfigPool worker
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
worker rendered-worker-627087f67ace055b92ca4c26488fdeb7 False True False 3 2 2 0 18h
I expect READYMACHINECOUNT to be 3, but it is 2.
Then I stopped kubelet.service on one of the worker nodes. The count became:
[root@rdr-sri-7cfc-tok04-bastion-0 ~]# oc get MachineConfigPool worker
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
worker rendered-worker-627087f67ace055b92ca4c26488fdeb7 False True False 3 1 2 0 18h
READYMACHINECOUNT became 1. When I started the service again, the count went back to 2 (never 3).
Version-Release number of MCO (Machine Config Operator) (if applicable): 4.11
Platform (AWS, VSphere, Metal, etc.): IBM Power
Are you certain that the root cause of the issue being reported is the MCO (Machine Config Operator)?
(Y/N/Not sure): Yes, as the issue is visible directly in the MachineConfigPool status.
How reproducible:
Did you catch this issue by running a Jenkins job? If yes, please list:
1. Jenkins job:
2. Profile:
Steps to Reproduce:
1. As described above; a minimal reproduction sketch follows this list.
Actual results:
READYMACHINECOUNT for the worker pool stays at 2 of 3 machines.
Expected results:
READYMACHINECOUNT reaches 3.
Additional info:
1. Please consider attaching a must-gather archive (via oc adm must-gather). Please review must-gather contents for sensitive information before attaching any must-gathers to a Bugzilla report. You may also mark the bug private if you wish.
2. If a must-gather is unavailable, please provide the output of the following (a collection sketch follows this list):
$ oc get co machine-config -o yaml
$ oc get mcp (and oc describe mcp/${degraded_pool} if pools are degraded)
$ oc get mc
$ oc get pod -n openshift-machine-config-operator
$ oc get node -o wide
3. If a node is not accessible via the API, please provide console/journal/kubelet logs from the problematic node.
4. Are there RHEL nodes in the cluster? If yes, please upload the full Ansible logs or the Jenkins job.