- Bug
- Resolution: Done-Errata
- Normal
- rhel-9.5
- NetworkManager-1.47.5-1.el9
- None
- Moderate
- 1
- rhel-sst-network-management
- ssg_networking
- 10
- 2
- False
- No
- NMT - RHEL-9.5 DTM 8
- Pass
- None
- x86_64
- None
What were you trying to do that didn't work?
On a prow job such as https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.16-e2e-gcp-ovn-upgrade/1770310694600708096 we are seeing:
: Node process segfaulted 0s
{ nodes/ci-op-8zbhh82n-f9945-lvzfz-master-0/journal-previous.gz:Mar 20 06:14:07.554815 ci-op-8zbhh82n-f9945-lvzfz-master-0 kernel: NetworkManager[1192]: segfault at 1 ip 00005617e33ec719 sp 00007ffe03abdc70 error 4 in NetworkManager[5617e32ef000+273000] likely on CPU 5 (core 2, socket 0)
upon node reboot.
Node logs are available in that Prow job's artifacts: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.16-e2e-gcp-ovn-upgrade/1770310694600708096/artifacts/e2e-gcp-ovn-upgrade/gather-extra/artifacts/nodes/
Sample log for master-0:
Mar 20 06:14:07.535458 ci-op-8zbhh82n-f9945-lvzfz-master-0 systemd[1]: Started machine-config-daemon: Node will reboot into config rendered-master-30de036365d23d0bfd70e28276592c9c.
Mar 20 06:14:07.539590 ci-op-8zbhh82n-f9945-lvzfz-master-0 root[322940]: machine-config-daemon[308499]: reboot successful
Mar 20 06:14:07.549869 ci-op-8zbhh82n-f9945-lvzfz-master-0 systemd-logind[980]: The system will reboot now!
Mar 20 06:14:07.553685 ci-op-8zbhh82n-f9945-lvzfz-master-0 systemd-logind[980]: System is rebooting.
Mar 20 06:14:07.554815 ci-op-8zbhh82n-f9945-lvzfz-master-0 kernel: NetworkManager[1192]: segfault at 1 ip 00005617e33ec719 sp 00007ffe03abdc70 error 4 in NetworkManager[5617e32ef000+273000] likely on CPU 5 (core 2, socket 0)
Mar 20 06:14:07.554905 ci-op-8zbhh82n-f9945-lvzfz-master-0 kernel: Code: a1 24 00 5b 48 89 ef 5d 41 5c 48 8b 40 30 ff e0 90 f3 0f 1e fa 55 48 85 f6 0f 84 82 00 00 00 48 89 f5 e8 5a b4 f1 ff 48 89 c6 <48> 8b 45 00 48 85 c0 74 05 48 3b 30 74 0c 48 89 ef e8 a1 8f f0 ff
Mar 20 06:14:07.588310 ci-op-8zbhh82n-f9945-lvzfz-master-0 systemd[1]: machine-config-daemon-reboot.service: Deactivated successfully.
Mar 20 06:14:07.588624 ci-op-8zbhh82n-f9945-lvzfz-master-0 systemd[1]: Stopped machine-config-daemon: Node will reboot into config rendered-master-30de036365d23d0bfd70e28276592c9c.
Mar 20 06:14:07.590597 ci-op-8zbhh82n-f9945-lvzfz-master-0 systemd[1]: Stopping crio-conmon-7f350033cb3265beff7c7bb193639a4d1efb9da9a8dd1f8bd5d6bfcae3124ef5.scope...
Please provide the package NVR for which bug is seen:
NetworkManager-1-1.47.2-1.el9-x86_64
How reproducible:
That job is 1 of 8 that ran at the same time. All 8 jobs hit the problem, and in each job all 6 nodes were affected, so it seems reproducible.
Steps to reproduce
- This appears to happen during an upgrade (this is the second reboot in that node's log).
Expected results
No segfault.
Actual results
Segfault. Unfortunately, there are no core dumps.
Since the segfault happens while the node is rebooting, it seems inconsequential, but our test looks for segfaults, so it fails.
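With no core dump available, the kernel segfault line is the only crash data we have. The following is a minimal sketch (not part of the CI test) that parses that line, computes the faulting instruction's offset from the mapping start the kernel printed, and decodes the x86 page-fault error bits. The function name `parse_segfault` and the returned keys are made up for illustration. Turning the offset into a symbol would also need the mapping's file offset and debuginfo for the exact NetworkManager build, which this sketch does not attempt.

```python
import re

# Matches the x86 kernel segfault message as it appears in the journal above.
# All numeric fields in this message are printed in hexadecimal.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return a dict describing a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        # Offset of the faulting instruction from the start of the mapping.
        "offset": ip - base,
        # x86 page-fault error code bits.
        "page_present": bool(err & 1),  # 0 means the page was not mapped
        "write": bool(err & 2),         # 0 means the access was a read
        "user_mode": bool(err & 4),     # 1 means the fault came from user space
    }


if __name__ == "__main__":
    sample = (
        "Mar 20 06:14:07.554815 ci-op-8zbhh82n-f9945-lvzfz-master-0 kernel: "
        "NetworkManager[1192]: segfault at 1 ip 00005617e33ec719 "
        "sp 00007ffe03abdc70 error 4 in NetworkManager[5617e32ef000+273000] "
        "likely on CPU 5 (core 2, socket 0)"
    )
    info = parse_segfault(sample)
    print(info["comm"], hex(info["offset"]), info)  # offset is 0xfd719
```

For the line in this report, that gives offset 0xfd719 and a user-mode read of an unmapped page at address 1, which looks like a dereference of an invalid, near-NULL pointer.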
- links to: RHBA-2024:129004 NetworkManager update