-
Bug
-
Resolution: Can't Do
-
Major
-
None
-
4.15.z
-
Quality / Stability / Reliability
-
False
-
-
None
-
Important
Description of problem:
A worker node went into the NotReady state. The kubelet and CRI-O services were inactive, and the nodeip-configuration service was in a failed state due to a syntax error. A similar issue is already known for OpenShift 4.7 (https://bugzilla.redhat.com/show_bug.cgi?id=1894477), and the customer is now observing it in OpenShift 4.15. Attempts to restart the kubelet and CRI-O services did not help: both remained stuck, and rebooting the node did not resolve the issue either. On further investigation, I found that manually executing the script "./usr/local/bin/configure-ip-forwarding.sh" caused the kubelet and CRI-O services to become active, after which the node transitioned to the Ready state. After checking `cat /etc/systemd/system/nodeip-configuration.service`, I suspect the root cause is the failure of the nodeip-configuration service, possibly combined with problems in the IP forwarding setup.
Version-Release number of selected component (if applicable):
4.15.43
Actual results:
The worker node remains in the NotReady state, and the kubelet and CRI-O services are inactive. The nodeip-configuration service fails to start, and the node recovers only after configure-ip-forwarding.sh is executed manually.
Expected results:
The worker node should automatically go to Ready state without manual intervention, and the kubelet and CRI-O services should be active upon boot without requiring a manual run of configure-ip-forwarding.sh.
Additional info:
Node IP configuration service (nodeip-configuration) fails to start, causing kubelet and CRI-O services to be inactive.
~~~
Mar 27 09:34:11 li1vchdcpwrk6p.qnb.bnk systemd[1]: Starting Writes IP address configuration so that kubelet and crio services select a valid node IP...
Mar 27 09:34:11 li1vchdcpwrk6p.qnb.bnk bash[14737]: /bin/bash: -c: line 1: syntax error near unexpected token `done'
Mar 27 09:34:11 li1vchdcpwrk6p.qnb.bnk bash[14737]: /bin/bash: -c: line 1: ` until /usr/bin/podman run --rm --authfile /var/lib/kubelet/config.json --net=host --security-opt label=disable --volume /etc/systemd/system:/etc/systemd/system --volume /run/nodeip-configuration:/run/nodeip-configuration quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d069765e098835ae1a98164c70425c17c3995c28873bf5e292889a100caf3ea5 node-ip set --platform VSphere --user-managed-lb --retry-on-failure do sleep 5; done'
Mar 27 09:34:11 li1vchdcpwrk6p.qnb.bnk systemd[1]: nodeip-configuration.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Mar 27 09:34:11 li1vchdcpwrk6p.qnb.bnk systemd[1]: nodeip-configuration.service: Failed with result 'exit-code'.
Mar 27 09:34:11 li1vchdcpwrk6p.qnb.bnk systemd[1]: Failed to start Writes IP address configuration so that kubelet and crio services select a valid node IP.
~~~
The issue has been noted in OpenShift 4.7 as well (see https://bugzilla.redhat.com/show_bug.cgi?id=1894477). Executing ./usr/local/bin/configure-ip-forwarding.sh manually resolves the issue temporarily.
~~~
cat etc/systemd/system/nodeip-configuration.service
sleep 5; \
done"
ExecStart=/bin/systemctl daemon-reload
ExecStartPre=/bin/mkdir -p /run/nodeip-configuration
ExecStartPost=+/usr/local/bin/configure-ip-forwarding.sh
StandardOutput=journal+console
StandardError=journal+console
~~~
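
The command string that bash reports in the journal ends in `--retry-on-failure do sleep 5; done`, with no `;` or newline before `do`. In bash, `do` is only treated as a keyword at the start of a command, so here it would be passed as an argument and the later `done` would be unexpected. That matches the logged error, but it is an inference from the log only, since the unit file excerpt does not show that part of the command. The sketch below reproduces the parse behaviour with `bash -n` (syntax check only, nothing is executed). `/bin/true` is a hypothetical stand-in for the podman invocation and `check_syntax` is a helper name introduced here for illustration.

```shell
# Syntax-check a command string the way "bash -c" would parse it, without running it.
# check_syntax is a hypothetical helper, not part of any OpenShift tooling.
check_syntax() {
    if bash -n -c "$1" 2>/dev/null; then
        echo "ok"
    else
        echo "syntax error"
    fi
}

# Same shape as the logged command: no ';' before 'do'.
# /bin/true stands in for the podman invocation (assumption for illustration).
broken_result=$(check_syntax 'until /bin/true --retry-on-failure do sleep 5; done')

# Same loop with the separator in place before 'do'.
fixed_result=$(check_syntax 'until /bin/true --retry-on-failure; do sleep 5; done')

echo "without separator: $broken_result"
echo "with separator:    $fixed_result"
```

If the first form reports a syntax error and the second does not on the affected node's bash, it would be worth checking whether the rendered unit file is missing a `; \` line continuation directly before `do`.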