Bug
Resolution: Unresolved
Normal
rhel-9.4
Important
rhel-sst-network-management
ssg_networking
Red Hat Enterprise Linux
NMT SST - Future releases
x86_64
Issue :
ModemManager invalidates the Sierra MC7304 modem. When a large file is downloaded from the board computer, the modem on the board computer is marked invalid after 10 consecutive QMI timeouts. The issue is not seen during upload.
ModemManager[1399]: <error> [modem0] port cdc-wdm3 timed out 10 consecutive times, marking modem as invalid
After the modem is marked invalid, the modem and its bearers are removed from the bus.
$ mmcli -m 0
error: couldn't find modem
ModemManager[1398]: <error> [modem0] port cdc-wdm3 timed out 10 consecutive times, marking modem as invalid
ModemManager[1398]: <debug> [/dev/cdc-wdm3] number of consecutive timeouts: 10
ModemManager[1398]: <debug> [/dev/cdc-wdm3] transaction 0x29 aborted, but message is not abortable
ModemManager[1398]: <warn> [modem0/bearer3] reloading stats failed: QMI operation failed: Transaction timed out
ModemManager[1398]: <debug> [modem0/bearer1] removing from bus
ModemManager[1398]: <debug> [modem0/bearer3] removing from bus
ModemManager[1398]: <debug> [device /sys/devices/pci0000:00/0000:00:14.0/usb2/2-4] unexported modem from path '/org/freedesktop/ModemManager1/Modem/0'
ModemManager[1398]: <debug> [modem0/wwp0s20u4i10/net] port now disconnected
This behavior disrupts the transfer.
Modem information :
Hardware | manufacturer: Sierra Wireless, Incorporated
         | model: MC7304
         | firmware revision: SWI9X15C_06.03.32.02 r26426 CNSHZ-AR-BUILD 2015/01/16 01:32:41
         | h/w revision: 1.0
         | supported: gsm-umts, lte
         | current: gsm-umts, lte
         | equipment id: 356853056551122
Observation & Analysis :
- The issue is seen with the latest ModemManager-1.20.2-1.el9.
- The issue is not seen with the older ModemManager-1.18.2-3.el9.
- Both versions were also tested on RHEL 8, and the same pattern is seen.
- QMI timeout warnings are also seen with ModemManager <= 1.18.2, but the transfer completes without issue.
- A modem from another vendor, Telit LEPCIC4EU13T130H00, does not show the issue with ModemManager-1.20.2-1.el9.
To understand how the modem behaves when the timeout limit is disabled, I created a patch (attached as 0001-modem-disable-QMI-timeouts.patch, also shown below) and built a test RPM with it in Brew. The results are good: the modem is no longer removed from the bus, and the transfer completes fine according to the customer.
$ cat 0001-modem-disable-QMI-timeouts.patch
From 30c7b85b0677f17bcd38325f16c8f5813694e540 Mon Sep 17 00:00:00 2001
From: Abhishek Rawal <arawal@redhat.com>
Date: Wed, 11 Sep 2024 23:00:33 +0530
Subject: [PATCH] modem: disable QMI timeouts

---
 src/mm-broadband-modem-qmi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mm-broadband-modem-qmi.c b/src/mm-broadband-modem-qmi.c
index 98868fa..66b92e9 100644
--- a/src/mm-broadband-modem-qmi.c
+++ b/src/mm-broadband-modem-qmi.c
@@ -13516,6 +13516,7 @@ mm_broadband_modem_qmi_new (const gchar *device,
                          MM_BASE_MODEM_PLUGIN, plugin,
                          MM_BASE_MODEM_VENDOR_ID, vendor_id,
                          MM_BASE_MODEM_PRODUCT_ID, product_id,
+                         MM_BASE_MODEM_MAX_TIMEOUTS, 0,
                          /* QMI bearer supports NET only */
                          MM_BASE_MODEM_DATA_NET_SUPPORTED, TRUE,
                          MM_BASE_MODEM_DATA_TTY_SUPPORTED, FALSE,
-- 
2.46.0
Test rpms can be found at : https://download.eng.bos.redhat.com/brewroot/work/tasks/4707/64144707/
However, we do not think this is the right solution, because it disables the timeout limit for all QMI modems, while the other modem (Telit) works fine with the limit in place.
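For readers unfamiliar with the mechanism, the consecutive-timeout limit that the patch sets to 0 can be pictured roughly as follows. This is a simplified sketch and not ModemManager source code: the type `port_state` and the functions `port_state_init` and `port_state_record` are hypothetical names made up for illustration. It assumes that a limit of 0 means "never invalidate", which matches the behavior observed with the test RPM, and that any successful transaction resets the counter, which is what "consecutive" in the log message suggests.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a per-port consecutive-timeout limit.
 * Illustration only; this is not ModemManager code. */
typedef struct {
    unsigned int max_timeouts;         /* 0 is assumed to disable the limit */
    unsigned int consecutive_timeouts; /* timeouts since the last success */
    bool         valid;                /* false once the limit is reached */
} port_state;

static void
port_state_init (port_state *p, unsigned int max_timeouts)
{
    p->max_timeouts = max_timeouts;
    p->consecutive_timeouts = 0;
    p->valid = true;
}

/* Record the outcome of one transaction.
 * Returns true while the modem is still considered valid. */
static bool
port_state_record (port_state *p, bool timed_out)
{
    if (!p->valid)
        return false;

    if (!timed_out) {
        /* Any successful reply breaks the run of consecutive timeouts. */
        p->consecutive_timeouts = 0;
        return true;
    }

    p->consecutive_timeouts++;

    /* With a non-zero limit, reaching it marks the modem invalid. */
    if (p->max_timeouts > 0 && p->consecutive_timeouts >= p->max_timeouts)
        p->valid = false;

    return p->valid;
}
```

In this model, `port_state_init (&p, 10)` followed by ten timed-out transactions in a row leaves `p.valid` false, which corresponds to the "timed out 10 consecutive times, marking modem as invalid" message. With `port_state_init (&p, 0)` the port stays valid however many timeouts occur, which is why the change affects every QMI modem and not only the MC7304.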
Impact :
There are 1800 board computers that use the Sierra modem in question. The customer loses the modem on about 2 devices per day. Systems are unable to reliably send telemetry to their backend, which might delay ticket sales, passenger WiFi, and collection of passenger counting data.
Steps to reproduce :
The customer reproduces the issue easily by performing a large download.
Attachments :
- sosreport-eddie00849-03911368-2024-08-26-mqtgldd.tar.xz
- ModemManager-test-success.log.xz
- sosreport-MM-1-18.tar.xz
sosreport-eddie00849-03911368-2024-08-26-mqtgldd.tar.xz contains ModemManager debug logs of the issue on ModemManager-1.20.2-1.el9. ModemManager-test-success.log.xz is from testing with the patched RPM (timeouts disabled). sosreport-MM-1-18.tar.xz contains the logs for ModemManager v1.18.
The patch we supplied works because it disables the timeout limit for QMI modems; however, I do not think it is the correct solution. We request your insights and help in providing the right solution or answer to the customer.
- Why is the modem timing out in the first place?
- Is it possible to expose a user-level tunable in mmcli so that the timeout limit can be adjusted to suit different modems?