-
Bug
-
Resolution: Done-Errata
-
Normal
-
4.13, 4.12.z
-
-
-
Low
-
No
-
ShiftStack Sprint 249, ShiftStack Sprint 250, ShiftStack Sprint 251, ShiftStack Sprint 252, ShiftStack Sprint 253
-
5
-
False
-
-
-
Bug Fix
-
Done
Description of problem:
While deploying a cluster with OVNKubernetes or applying a cloud-provider-config change, all OCP nodes end up with a failed unit:
$ oc debug -q node/ostest-h9vbm-master-0 -- chroot /host sudo systemctl list-units --failed
  UNIT                       LOAD   ACTIVE SUB    DESCRIPTION
● afterburn-hostname.service loaded failed failed Afterburn Hostname

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
1 loaded units listed.

$ oc debug -q node/ostest-h9vbm-master-0 -- chroot /host sudo systemctl status afterburn-hostname
× afterburn-hostname.service - Afterburn Hostname
     Loaded: loaded (/etc/systemd/system/afterburn-hostname.service; enabled; preset: disabled)
     Active: failed (Result: exit-code) since Tue 2023-04-18 11:48:35 UTC; 2h 26min ago
   Main PID: 1309 (code=exited, status=123)
        CPU: 148ms

Apr 18 11:48:35 ostest-h9vbm-master-0 openstack-afterburn-hostname[1314]: 1: maximum number of retries (10) reached
Apr 18 11:48:35 ostest-h9vbm-master-0 openstack-afterburn-hostname[1314]: 2: failed to fetch
Apr 18 11:48:35 ostest-h9vbm-master-0 openstack-afterburn-hostname[1314]: 3: error sending request for url (http://169.254.169.254/latest/meta-data/hostname): error trying to connect: tcp connect error: Network is unreachable (os error 101)
Apr 18 11:48:35 ostest-h9vbm-master-0 openstack-afterburn-hostname[1314]: 4: error trying to connect: tcp connect error: Network is unreachable (os error 101)
Apr 18 11:48:35 ostest-h9vbm-master-0 openstack-afterburn-hostname[1314]: 5: tcp connect error: Network is unreachable (os error 101)
Apr 18 11:48:35 ostest-h9vbm-master-0 openstack-afterburn-hostname[1314]: 6: Network is unreachable (os error 101)
Apr 18 11:48:35 ostest-h9vbm-master-0 hostnamectl[2494]: Too few arguments.
Apr 18 11:48:35 ostest-h9vbm-master-0 systemd[1]: afterburn-hostname.service: Main process exited, code=exited, status=123/n/a
Apr 18 11:48:35 ostest-h9vbm-master-0 systemd[1]: afterburn-hostname.service: Failed with result 'exit-code'.
Apr 18 11:48:35 ostest-h9vbm-master-0 systemd[1]: Failed to start Afterburn Hostname.

$ oc debug -q node/ostest-h9vbm-worker-0-fkxdr -- chroot /host sudo systemctl list-units --failed
  UNIT                       LOAD   ACTIVE SUB    DESCRIPTION
● afterburn-hostname.service loaded failed failed Afterburn Hostname

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
1 loaded units listed.
Once the installation or the config change has finished, restarting the service resolves the issue:
$ oc debug -q node/ostest-h9vbm-worker-0-fkxdr -- chroot /host sudo systemctl restart afterburn-hostname
$ oc debug -q node/ostest-h9vbm-worker-0-fkxdr -- chroot /host sudo systemctl status afterburn-hostname
○ afterburn-hostname.service - Afterburn Hostname
     Loaded: loaded (/etc/systemd/system/afterburn-hostname.service; enabled; preset: disabled)
     Active: inactive (dead) since Tue 2023-04-18 14:14:40 UTC; 9s ago
    Process: 171875 ExecStart=/usr/local/bin/openstack-afterburn-hostname (code=exited, status=0/SUCCESS)
   Main PID: 171875 (code=exited, status=0/SUCCESS)
        CPU: 119ms

Apr 18 14:14:32 ostest-h9vbm-worker-0-fkxdr systemd[1]: Starting Afterburn Hostname...
Apr 18 14:14:39 ostest-h9vbm-worker-0-fkxdr openstack-afterburn-hostname[171876]: Apr 18 14:14:39.521 WARN failed to locate config-drive, using the metadata service API instead
Apr 18 14:14:39 ostest-h9vbm-worker-0-fkxdr openstack-afterburn-hostname[171876]: Apr 18 14:14:39.583 INFO Fetching http://169.254.169.254/latest/meta-data/hostname: Attempt #1
Apr 18 14:14:40 ostest-h9vbm-worker-0-fkxdr openstack-afterburn-hostname[171876]: Apr 18 14:14:40.237 INFO Fetch successful
Apr 18 14:14:40 ostest-h9vbm-worker-0-fkxdr openstack-afterburn-hostname[171876]: Apr 18 14:14:40.237 INFO wrote hostname ostest-h9vbm-worker-0-fkxdr to /dev/stdout
Apr 18 14:14:40 ostest-h9vbm-worker-0-fkxdr systemd[1]: afterburn-hostname.service: Deactivated successfully.
Apr 18 14:14:40 ostest-h9vbm-worker-0-fkxdr systemd[1]: Finished Afterburn Hostname.
error: non-zero exit code from debug container

[stack@undercloud-0 ~]$ oc debug -q node/ostest-h9vbm-master-0 -- chroot /host sudo systemctl status afterburn-hostname
× afterburn-hostname.service - Afterburn Hostname
     Loaded: loaded (/etc/systemd/system/afterburn-hostname.service; enabled; preset: disabled)
     Active: failed (Result: exit-code) since Tue 2023-04-18 11:48:35 UTC; 2h 26min ago
   Main PID: 1309 (code=exited, status=123)
        CPU: 148ms
Version-Release number of selected component (if applicable):
Observed on 4.13.0-0.nightly-2023-04-13-171034 and 4.12.13
How reproducible:
Always
Additional info:
Increasing the number of retries, or spreading them over a longer period, may resolve this. With OVN-K the network appears to take longer to become ready, so with the current configuration the retries (10, per the log above) are exhausted before the network is up. A must-gather link is provided in a private comment.
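To illustrate the suggestion, here is a minimal sketch of retrying with exponential backoff in bash. It is not the shipped openstack-afterburn-hostname script: the function name retry_with_backoff, its parameters, and the use of curl in the example comment are assumptions made for illustration. Only the metadata URL comes from the logs above. The "Too few arguments" message from hostnamectl suggests it was invoked without a hostname after the fetch failed, so the example also guards against an empty value.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only; not part of the actual service.

# retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY MAX_DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have been made.
# The sleep between attempts starts at INITIAL_DELAY seconds and doubles
# after each failure, capped at MAX_DELAY seconds.
retry_with_backoff() {
    local max_attempts=$1 delay=$2 max_delay=$3
    shift 3
    local attempt=1
    while true; do
        if "$@"; then
            return 0
        fi
        if (( attempt >= max_attempts )); then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$(( attempt + 1 ))
        delay=$(( delay * 2 ))
        if (( delay > max_delay )); then
            delay=$max_delay
        fi
    done
}

# Example (not executed here): fetch the hostname from the metadata URL seen
# in the logs, and only call hostnamectl when a non-empty value came back.
#   name=$(retry_with_backoff 10 2 60 \
#       curl -fsS http://169.254.169.254/latest/meta-data/hostname) \
#       && [ -n "$name" ] && hostnamectl set-hostname "$name"
```

With the example values (10 attempts, delays of 2, 4, 8, 16, 32 and then 60 seconds), the total wait before giving up is a little over six minutes. Whether that is long enough for the OVN-K network to come up would need to be checked against the must-gather data.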
- links to
-
RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update