-
Bug
-
Resolution: Done-Errata
-
Undefined
-
rhel-8.10
-
None
-
lftp-4.8.4-4.el8_10
-
Yes
-
Moderate
-
1
-
rhel-net-perf
-
ssg_core_services
-
14
-
5
-
False
-
False
-
-
None
-
N&P-25_0
-
Pass
-
Automated
-
Unspecified
-
Unspecified
-
Unspecified
-
-
x86_64
-
None
The customer is seeing an unexpected failure. We have yet to positively identify
which component is failing, but the failure appears when the customer
connects to a TLS-protected vsftpd server to upload a file with either lftp
or curl. The customer sees this with RHEL 8 on both sides of the connection.
I've been able to reproduce what I suspect is the same situation with a
RHEL 9 client and a RHEL 8 server. Success matrix shared below.
The customer initially thought it might be the data they were trying to
send, and they isolated a test case that reliably fails. However, in
looking at the issue and consulting with my peers, I wondered if perhaps it
wasn't the data but how much data there was, and that seems to be the case.
Here's a test with a random file of the same size as what the customer
sends:
[root@rhel9 ~]# dd if=/dev/urandom of=mockscrubbed bs=44087 count=1
1+0 records in
1+0 records out
44087 bytes (44 kB, 43 KiB) copied, 0.000928356 s, 47.5 MB/s
[root@rhel9 ~]# lftp -e "put mockscrubbed; quit" -u 'testing,testing' rhel8
55406 bytes transferred in 15 seconds (3.6 KiB/s)
[root@rhel9 ~]# lftp -e "put mockscrubbed; quit" -u 'testing,testing' rhel8
44087 bytes transferred
[root@rhel9 ~]# lftp -e "put mockscrubbed; quit" -u 'testing,testing' rhel8
55406 bytes transferred in 15 seconds (3.6 KiB/s)
[root@rhel9 ~]# lftp -e "put mockscrubbed; quit" -u 'testing,testing' rhel8
`mockscrubbed' at 44087 (100%) 4b/s eta:0s [Delaying before reconnect: 7]
Note that some attempts work the first time - those that show the expected
44087 bytes transferred - but some reconnect after an initial failure.
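To put a number on how often the first attempt fails, the upload can be repeated and each run classified by the byte count lftp reports. This is a minimal sketch, assuming (as in the transcript above) that a clean run reports exactly the file size and a run that reconnected reports more. The helper names `classify_transfer` and `run_trials` are hypothetical, the host and credentials are the test values used above, and it is an assumption that lftp prints the same summary line when its output is captured rather than shown on a terminal.

```shell
#!/usr/bin/env bash
# Classify one line of lftp output against the expected file size.
# Assumption: a run that reconnected reports more bytes than the file holds,
# as in the "55406 bytes transferred" lines above.
classify_transfer() {
    local expected line reported
    expected=$1
    line=$2
    # Pull the leading byte count out of "<N> bytes transferred ...".
    reported=$(printf '%s\n' "$line" |
        sed -n 's/^\([0-9][0-9]*\) bytes transferred.*/\1/p')
    if [ -z "$reported" ]; then
        echo "unknown"
    elif [ "$reported" -eq "$expected" ]; then
        echo "clean"
    else
        echo "retried"
    fi
}

# Repeat the upload and tally the outcomes. Defined only; nothing is sent
# until run_trials is called explicitly.
run_trials() {
    local file trials size out i
    file=$1
    trials=$2
    size=$(wc -c < "$file")
    i=0
    while [ "$i" -lt "$trials" ]; do
        # Assumption: the last line of output is the transfer summary.
        out=$(lftp -e "put $file; quit" -u 'testing,testing' rhel8 2>&1 | tail -n 1)
        classify_transfer "$size" "$out"
        i=$((i + 1))
    done | sort | uniq -c
}
```

Calling `run_trials mockscrubbed 20` would then print a count per outcome, which makes it easier to compare client and server combinations than eyeballing individual runs.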
A test with curl that didn't fail:
[root@rhel9 ~]# curl --ftp-ssl --ssl-reqd -T mockscrubbed ftp://testing:testing@rhel8: ( notsecret )
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 44087    0     0  100 44087      0   742k --:--:-- --:--:-- --:--:--  742k
I can't reproduce the exact error the customer sees, insofar as I can't get
curl to fail on RHEL at all, and I can only get lftp to fail from RHEL 9
(client) to RHEL 8 (server).
Debian 12 curl fails:
$ curl --ftp-ssl --ssl-reqd -T mockscrubbed ftp://testing:testing@rhel8: ( notsecret )
Warning: --ssl is an insecure option, consider --ssl-reqd instead
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 44087    0     0  100 44087      0   205k --:--:-- --:--:-- --:--:--  205k
curl: (18) server did not report OK, got 426
Here is a small grid of successes and failures, all using lftp:
Client -> Server, results
RHEL 7 -> RHEL 7, no issues
RHEL 8 -> RHEL 7, no issues
RHEL 8 -> RHEL 7, no issues
Debian 12 -> RHEL 7, no issues
Ubuntu 24.04 -> RHEL 7, no issues
RHEL 7 -> RHEL 8, no issues
RHEL 8 -> RHEL 8, no issues
RHEL 9 -> RHEL 8, many retries, eventual success if left long enough
Debian 12 -> RHEL 8, many retries, eventual success if left long enough
Ubuntu 24.04 -> RHEL 8, many retries, eventual success if left long enough
RHEL 7 -> RHEL 9, no issues
RHEL 8 -> RHEL 9, no issues
RHEL 9 -> RHEL 9, no issues
Debian 12 -> RHEL 9, many retries, eventual success if left long enough
Ubuntu 24.04 -> RHEL 9, no issues
That we see errors across platforms and tools suggests to me that the
issue lives with either the OpenSSL client libraries, or with vsftpd or the
OpenSSL libraries it uses. vsftpd on RHEL 8 is the only common ground that
fails reliably. sbroz noted an issue with lftp that sounded awfully similar
to me, especially given that file size matters and we're seeing this with
small files, but if this is the issue, then a distinct but equivalent fix
would be needed for curl:
https://github.com/lavv17/lftp/pull/596
I'm going to test that locally soon, but I haven't had a chance as yet.
File sizes matter! Once I noticed that I didn't need the client data, just
a file of equivalent size, I started trying to find the range of sizes that
fail. A quick binary search puts the floor between 43276 bytes (generally
works) and 43277 bytes (generally fails), although I do not trust these
numbers to be consistent over time. On the larger end, 110,000 bytes fails,
but less consistently - possibly 5% of the time, eyeballed. As the file size
decreases toward the floor, failures become more frequent; as it increases,
they become less frequent. For instance, I've only seen a single failure
with a file size of 150,000 bytes.
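The size search described above could be scripted roughly as follows. This is a sketch under stated assumptions, not the procedure actually used: `upload_fails` and `find_floor` are hypothetical names, the probe treats any run whose lftp summary does not report exactly the file size as a failure, and the search assumes the lower bound works, the upper bound fails, and behaviour is monotonic near the floor. Because failures are intermittent, a single probe per size is noisy, so in practice each size would need several attempts.

```shell
#!/usr/bin/env bash
# Hypothetical probe: returns 0 if an upload of SIZE bytes FAILS on its
# first attempt, 1 if it goes through cleanly.
# Assumption: a clean run ends with "<SIZE> bytes transferred".
upload_fails() {
    local size tmp out
    size=$1
    tmp=$(mktemp)
    dd if=/dev/urandom of="$tmp" bs="$size" count=1 2>/dev/null
    out=$(lftp -e "put $tmp; quit" -u 'testing,testing' rhel8 2>&1 | tail -n 1)
    rm -f "$tmp"
    case $out in
        "$size bytes transferred"*) return 1 ;;  # clean first attempt
        *) return 0 ;;                           # retried or errored
    esac
}

# Binary search for the smallest failing size.
# Invariant: LO is a size that works, HI is a size that fails.
find_floor() {
    local lo hi mid
    lo=$1
    hi=$2
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(((lo + hi) / 2))
        if upload_fails "$mid"; then
            hi=$mid   # mid fails: the floor is at or below mid
        else
            lo=$mid   # mid works: the floor is above mid
        fi
    done
    echo "$hi"
}
```

For example, `find_floor 1 100000` would print the smallest size at which the probe reported a failure, given that a 1-byte upload works and a 100000-byte upload fails at the time of the run.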
Where exactly this bug is has yet to be isolated, but it's impacting a
customer workload significantly, so we'd like to ask Engineering to explore
it in parallel to Support Delivery.
- account is impacted by
-
RHEL-99571 RHEL-88955 fix caused evident regression
-
- Closed
-
- links to
-
RHBA-2025:149838 lftp update