Environment:
Baremetal MNO + ODF
Description of problem:
The customer is testing environment upgrades:
- 4.16.44 > 4.17.37 > 4.18.22: no issues found
- 4.16.44 > 4.17.37 > 4.18.24: in about 50% of attempts, kubelet fails to start on a master node, possibly because ocp-tuned-one-shot.service repeatedly fails:
Oct 03 11:45:04 master2 systemd[1]: ocp-tuned-one-shot.service: Failed with result 'exit-code'.
Oct 03 11:45:04 master2 systemd[1]: Failed to start TuneD service from NTO image.
This looks close to the scenario described in KCS#7128296, which unfortunately has no linked case or bug.
How reproducible:
About 50% of the time
Steps to Reproduce:
Upgrade from 4.16.44 > 4.17.37 > 4.18.24
Actual results:
In about 50% of attempts, kubelet fails to start on master nodes
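To compare failing and passing upgrade runs, the journal excerpt above could be scanned for systemd unit failures. The following is only a minimal sketch: it assumes the journal text has already been collected from the node (for example with `journalctl -u ocp-tuned-one-shot.service`), and the helper name `count_unit_failures` is hypothetical, not part of any OpenShift tooling.

```python
import re
from collections import Counter

# Matches systemd's "<unit>: Failed with result '<result>'." journal message,
# as seen in the excerpt from master2.
FAILED_RE = re.compile(
    r"systemd\[\d+\]: (?P<unit>\S+): Failed with result '(?P<result>[^']+)'"
)


def count_unit_failures(journal_lines):
    """Count (unit, result) pairs reported as failed in journal text lines.

    Hypothetical triage helper; lines that do not match are ignored.
    """
    failures = Counter()
    for line in journal_lines:
        match = FAILED_RE.search(line)
        if match:
            failures[(match["unit"], match["result"])] += 1
    return failures


# The two journal lines quoted in this report.
sample = [
    "Oct 03 11:45:04 master2 systemd[1]: ocp-tuned-one-shot.service: "
    "Failed with result 'exit-code'.",
    "Oct 03 11:45:04 master2 systemd[1]: Failed to start TuneD service "
    "from NTO image.",
]

for (unit, result), count in count_unit_failures(sample).items():
    print(f"{unit}: {count} failure(s) with result {result}")
```

Running this against the full journal of each master node would show whether ocp-tuned-one-shot.service fails only on the node where kubelet does not start, or on all of them.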
Issue links:
- blocks: OCPBUGS-63334 kubelet fails to start on master node during upgrade (Closed)
- is cloned by: OCPBUGS-63334 kubelet fails to start on master node during upgrade (Closed)
- is related to: OCPBUGS-63244 PerformanceProfile config is generating extra rendered-config after cluster is upgraded (New)
- is related to: OCPBUGS-59958 Autosizing causes control plane node to reboot twice during upgrade (ASSIGNED)