Type: Bug
Resolution: Duplicate
Priority: Normal
Fix Version/s: None
Affects Version/s: rhel-9.4
Component/s: dnf
Labels: None
Regression: None
Severity: Important
Pool Team: rhel-sst-cs-software-management
Sub-System Group: ssg_core_services
Story Points: None
Blocked: False
Blocked Reason: None
Product Documentation Required: None
Products: Red Hat Enterprise Linux
Sprint: None
Preliminary Testing: None
Test Coverage: None

Experience:

PX Impact Score:
PX Priority Data:
SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Planning: None

This is a continuation of RHEL-35656 but on the dnf side.

What were you trying to do that didn't work?

A customer runs the command dnf -y update >redirect.out; echo $? to perform updates automatically and check the result.
When the update is "long" (i.e. the connection to their Satellite server stays idle for longer than the KeepAlive of 15 seconds), the Satellite server closes the connection on timeout. The next write on that connection then fails and raises a SIGPIPE internally, as seen in the strace excerpt below:

1229597 14:32:41.789160 write(7</var/log/rhsm/rhsm.log>, "2024-05-13 14:32:41,788 [DEBUG] dnf:1229597:MainThread @connection.py:676 - Closing HTTPS connection <ssl.SSLSocket fd=8, family"..., 222) = 222 <0.000007>
1229597 14:32:41.790210 write(8<TCP:[10.132.72.128:48858->172.29.73.11:443]>, "\27\3\3\0\23\206\220\273\216\1\344\336\320{A\1\259]\16u'\304\16", 24) = -1 EPIPE (Broken pipe) <0.000012>
1229597 14:32:41.790267 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=1229597, si_uid=0} ---
1229597 14:32:41.790284 rt_sigreturn({mask=[]}) = -1 EPIPE (Broken pipe) <0.000005>
 :
1229597 14:32:44.119261 close(5</var/log/dnf.librepo.log>) = 0 <0.000009>
1229597 14:32:44.267278 exit_group(141) = ?
1229597 14:32:44.297930 +++ exited with 141 +++

In the excerpt above, the connection to the Satellite server is being closed ("unwrap" is called). The write fails with EPIPE because the Satellite server has already closed the connection. This raises a SIGPIPE, and execution continues.
dnf then logs the update and exits with 141, which makes no sense for an update that succeeded; this is the issue filed here. 141 is 128 + 13, which means "exit due to signal SIGPIPE".
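
The 128 + 13 arithmetic can be checked with a minimal Python sketch. It only illustrates the shell convention for a process terminated by signal N; it does not attempt to reproduce how dnf arrives at exit_group(141). The helper shell_style_status and the child snippet are hypothetical names introduced for illustration, and SIGPIPE being signal 13 is assumed (true on Linux).

```python
import signal
import subprocess
import sys


def shell_style_status(returncode: int) -> int:
    """Map a subprocess return code to the value a POSIX shell shows in $?."""
    # subprocess reports "killed by signal N" as -N; shells report 128 + N.
    return 128 - returncode if returncode < 0 else returncode


# Hypothetical child: restore the default SIGPIPE action, then send
# SIGPIPE to itself so that it is terminated by the signal.
CHILD = (
    "import os, signal;"
    "signal.signal(signal.SIGPIPE, signal.SIG_DFL);"
    "os.kill(os.getpid(), signal.SIGPIPE)"
)

proc = subprocess.run([sys.executable, "-c", CHILD])
status = shell_style_status(proc.returncode)
print(int(signal.SIGPIPE), proc.returncode, status)
```

On a Linux system this is expected to show signal number 13 and a shell-style status of 141, matching the exit code in the strace excerpt.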

I have not been able to reproduce the connection to the Satellite server ending in SIGPIPE, despite spending several days on it. However, with the customer's help, it is clear that this is what leads to the issue.

Important details:

The issue doesn't happen when the command is run without redirection (e.g. dnf -y update; echo $?), even though the SIGPIPE signal is still received, as shown in the strace excerpt below:

1203775 09:40:09.394516 write(7</var/log/rhsm/rhsm.log>, "2024-05-07 09:40:09,394 [DEBUG] yum:1203775:MainThread @connection.py:672 - Closing HTTPS connection <ssl.SSLSocket fd=8, family"..., 222) = 222 <0.000008>
1203775 09:40:09.396743 futex(0x7f68e9740fec, FUTEX_WAKE_PRIVATE, 1) = 1 <0.000010>
1203775 09:40:09.396810 write(8<TCP:[10.132.72.128:51940->172.29.73.11:443]>, "\27\3\3\0\23z\237N\265\276kq\221\252\36\235\314\211v\fb\262\271\v", 24) = -1 EPIPE (Broken pipe) <0.000009>
1203775 09:40:09.396850 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=1203775, si_uid=0} ---
 :
1203775 09:40:14.282206 close(5</var/log/dnf.librepo.log>) = 0 <0.000010>
1203775 09:40:14.373737 exit_group(0)   = ?
1203775 09:40:14.388680 +++ exited with 0 +++

The reason for the difference was initially unclear.
However, working with the customer, we found that it comes from a change in signal handling that is applied when stdout is not a tty (which is the case when the command's output is redirected). See lines 102-103 of dnf/i18n.py:

    if not stdout.isatty():
        signal.signal(signal.SIGPIPE, signal.SIG_DFL)

Commenting out both lines makes the dnf -y update >redirected.out; echo $? command return the expected exit code 0.
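
The effect of that signal.signal() call can be sketched outside dnf. The example below is a minimal illustration under stated assumptions, not dnf code: it starts two child Python interpreters that each write to a pipe with no reader. One keeps the disposition Python sets at startup (SIGPIPE ignored, so the failed write surfaces as BrokenPipeError), the other restores SIG_DFL first, as the excerpt does when stdout is not a tty. The names CHILD, run_child and the mode strings are hypothetical.

```python
import signal
import subprocess
import sys

# Hypothetical child program, run once per mode.
CHILD = r'''
import os, signal, sys
if sys.argv[1] == "sig_dfl":
    # Same call as in the dnf/i18n.py excerpt, applied unconditionally here.
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)
read_end, write_end = os.pipe()
os.close(read_end)          # no reader left: the next write fails with EPIPE
try:
    os.write(write_end, b"x")
except BrokenPipeError:
    pass                    # SIGPIPE ignored: the failure is just an exception
sys.exit(0)
'''


def run_child(mode: str) -> int:
    """Run the child in the given mode and return its subprocess return code."""
    return subprocess.run([sys.executable, "-c", CHILD, mode]).returncode


results = {mode: run_child(mode) for mode in ("python_default", "sig_dfl")}
print(results)
```

With SIGPIPE ignored, the child handles the broken pipe and exits with 0. With SIG_DFL restored, the same write terminates the child by signal, which subprocess reports as a negative return code and which a shell would report as 128 plus the signal number.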

Please provide the package NVR for which bug is seen:

dnf-4.14.0-8.el9.noarch

How reproducible:

Always on the customer's site. I was not able to reproduce it internally at all, despite spending several days on it.

duplicates

RHEL-35656 "dnf update" fails with EPIPE at end of update when RHSM executes "Updating profile information"

In Progress

Assignee: packaging-team-maint
Reporter: Renaud Métrich
Developer: packaging-team-maint
QA Contact: Software Management QE
Votes: 0
Watchers: 2
Created: 2024/05/15 8:29 AM
Updated: 2024/09/23 3:56 PM
Resolved: 2024/06/06 8:14 AM
