-
Bug
-
Resolution: Done-Errata
-
Normal
-
rhel-8.9.0
-
openssh-8.0p1-25.el8_10
-
None
-
Moderate
-
2
-
rhel-sst-security-crypto
-
ssg_security
-
None
-
False
-
-
None
-
Red Hat Enterprise Linux
-
Crypto24Q2, Crypto24Q3
-
Pass
-
None
-
-
All
-
None
What were you trying to do that didn't work?
When an ssh connection uses multiplexing (ControlPath, ControlMaster=auto, ...) and the master connection times out a few milliseconds before mux_client_hello_exchange() is called, ssh is killed by SIGPIPE.
This is a bug because multiplexing is supposed to fall back to a normal connection, as noted in the ssh_config(5) manpage:
ControlMaster [...] These sessions will try to reuse the master instance's network connection rather than initiating new ones, but will fall back to connecting normally if the control socket does not exist, or is not listening.
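The documented fallback can be pictured with a minimal sketch. This is not the OpenSSH implementation: try_control_socket() is a hypothetical helper that only illustrates "connect to the control socket, otherwise let the caller connect normally". A failed connect() (for example ENOENT when the socket does not exist, or ECONNREFUSED when nothing is listening) is reported as -1 instead of terminating the process.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper, not OpenSSH code: try to connect to a control
 * socket at `path`. Returns a connected fd, or -1 with errno set when
 * the caller should fall back to a normal connection.
 */
static int
try_control_socket(const char *path)
{
	struct sockaddr_un addr;
	int fd, saved_errno;

	if (strlen(path) >= sizeof(addr.sun_path)) {
		errno = ENAMETOOLONG;
		return -1;
	}
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);	/* length checked above */

	if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
		return -1;
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
		saved_errno = errno;	/* close() may overwrite errno */
		close(fd);
		errno = saved_errno;
		return -1;		/* caller falls back */
	}
	return fd;
}
```

The point of the sketch is that every failure on the control socket path is an ordinary return value; the bug reported here breaks that expectation because a signal ends the process before any fallback can run.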
ssh dies because SIGPIPE is not yet ignored when mux_client_hello_exchange() and the underlying mux_client_write_packet() execute, so the write() at line 1513 raises the signal and kills ssh:
1488 static int
1489 mux_client_write_packet(int fd, struct sshbuf *m)
1490 {
 :
1513 		len = write(fd, ptr + have, need - have);
 :
A quick fix is to ignore SIGPIPE for the duration of the hello write:
Original code:
1575 static int
1576 mux_client_hello_exchange(int fd)
1577 {
 :
1589 	if (mux_client_write_packet(fd, m) != 0) {
1590 		debug_f("write packet: %s", strerror(errno));
1591 		goto out;
1592 	}
 :
Modified code:
1575 static int
1576 mux_client_hello_exchange(int fd)
1577 {
 :
1581 	sshsig_t old_sigpipe;
 :
1590 	old_sigpipe = ssh_signal(SIGPIPE, SIG_IGN);
1591 	r = mux_client_write_packet(fd, m);
1592 	ssh_signal(SIGPIPE, old_sigpipe);
1593 	if (r != 0) {
1594 		debug_f("write packet: %s", strerror(errno));
1595 		goto out;
1596 	}
 :
EDIT: Upstream fixed this in commit 96faa0de6c673a2ce84736eba37fc9fb723d9e5c.
Please provide the package NVR for which bug is seen:
openssh-clients-8.7p1-38.el9
openssh-clients-8.0p1-19.el8_9.2
How reproducible:
Often, when the master connection closes quickly
Steps to reproduce
- Execute the following command in a loop
$ while :; do echo; date +%s.%N; /usr/bin/ssh -o ControlPath=/tmp/%r@%h:%p -o ControlPersist=2 -o ControlMaster=auto localhost hostname || { echo Exited with: $? ; break ; }; sleep 1.6s ; done
The sleep delay may need adjusting depending on the hardware; the idea is for the new connection to start just as the master connection times out.
Expected results
The connection to the system never fails with exit code 141 (128 + SIGPIPE).
Actual results
1716372217.124957684
vm-ssh9

1716372219.174926179
muxclient: master hello exchange failed
vm-ssh9

1716372221.187231796
muxclient: master hello exchange failed
vm-ssh9

1716372223.184413163
muxclient: master hello exchange failed
vm-ssh9

1716372225.178581753
Exited with: 141
Other reproducer using a systemtap script
- In a terminal, start the following script
# stap -g -v -e 'probe process("/usr/bin/ssh").statement("*@mux.c:1513") { raise(%{SIGPIPE%}); exit() }'
- In another terminal, execute the following command in a loop
$ while :; do echo; date +%s.%N; /usr/bin/ssh -o ControlPath=/tmp/%r@%h:%p -o ControlPersist=20 -o ControlMaster=auto localhost hostname || { echo Exited with: $? ; break ; }; sleep 1.6s ; done
Here ControlPersist can be larger because the script injects SIGPIPE directly at the write() location, which shows that SIGPIPE is not ignored at the time of the call.
- is cloned by: RHEL-37748 ssh with multiplexing can fail to connect to remote system when connection just timed out (Closed)
- links to: RHBA-2024:135996 openssh update