We have discovered that in RHEL 9 the ordering of systemd dependencies for configure-ovs, nodeip-configuration, or a similar service has changed. These services now appear to run before systemd-user-sessions.service, so if one of them fails when the machine starts, we cannot log in to the RHCOS node at all.
The outline of what happens is, more or less:
- early in the boot process, systemd-tmp* creates a lock file saying "only root user can login"
- late in the boot process, systemd-user-sessions.service runs and is responsible for allowing everyone else to log in (notably the "core" user)
- if anything goes wrong and systemd-user-sessions.service doesn't start, the lock file stays in place and only "root" can get SSH access to the machine
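The gate described in this outline can be illustrated with a small check. This is only a sketch: it assumes the lock file is `/run/nologin`, which is where stock systemd places it (the report itself does not name the path), and `login_gate_state` is a hypothetical helper, not part of any RHCOS tooling.

```python
from pathlib import Path


def login_gate_state(root: Path = Path("/")) -> str:
    """Report whether non-root logins are still blocked by the boot-time lock file.

    Assumption: the lock file lives at <root>/run/nologin, the stock systemd
    location. The `root` parameter exists only so the check can be pointed at
    a mounted disk image or a test directory.
    """
    nologin = root / "run" / "nologin"
    if nologin.exists():
        # Lock file still present: systemd-user-sessions.service has not
        # completed, so only root would be allowed to log in.
        return "blocked"
    return "open"
```

On a node stuck in the state described here, `login_gate_state()` would be expected to return `"blocked"`, though by definition it could only be run there from a root context such as the console.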
Until now we had never observed the issue, even when configure-ovs was unhealthy. However, something changed in 4.13: as long as configure-ovs has not finished successfully, we cannot run `ssh core@<node>`. Given that we don't allow root access, in those scenarios we are locked out and cannot perform any investigation.
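To check the suspected ordering on a given node, one could inspect a unit's `Before=` property. The sketch below parses the space-separated output of `systemctl show --property=Before --value <unit>`. Both helper names are hypothetical, and the check only sees direct ordering edges, not ordering implied through intermediate targets, so a negative result does not rule the problem out.

```python
import subprocess

USER_SESSIONS = "systemd-user-sessions.service"


def is_ordered_before(before_value: str, target: str = USER_SESSIONS) -> bool:
    """Return True if `target` appears in a unit's space-separated Before= list."""
    return target in before_value.split()


def read_before(unit: str) -> str:
    """Fetch the Before= property of `unit` via systemctl (needs a systemd host)."""
    result = subprocess.run(
        ["systemctl", "show", "--property=Before", "--value", unit],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

On a healthy node of the same release, `is_ordered_before(read_before("nodeip-configuration.service"))` would show whether that unit is directly ordered before systemd-user-sessions.service.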
In the particular scenario I was debugging, nodeip-configuration.service was failing because it could not correctly detect the Node IP from the VIPs. It tried to select an empty IP as the Node IP and therefore returned a non-zero exit code. The network was up: I could ping the machine and reach the SSH port (the machine had multiple NICs, making it effectively impossible to lose the network). But because I could never SSH in as core and the root user is locked, I was not able to collect any logs.
- is cloned by
  - OCPBUGS-14357 [4.13] configure-ovs blocks ssh access to the node when unhealthy (Closed)
- is depended on by
  - OCPBUGS-11388 Hang after reboot with IPv6 with "Populates resolv.conf according to on-prem IPI needs" (Closed)
  - OCPBUGS-14357 [4.13] configure-ovs blocks ssh access to the node when unhealthy (Closed)
- is duplicated by
  - OCPBUGS-11411 Unknown issue with systemd-pcrphase.service blocks SSH access to the system (Closed)
- is related to
  - OCPBUGS-11411 Unknown issue with systemd-pcrphase.service blocks SSH access to the system (Closed)
- links to
  - RHEA-2023:5006 rpm