Bug
Resolution: Unresolved
Major
rhel-8.4.0, rhel-8.8.0.z, rhel-8.10.z
Critical
rhel-sst-pt-libraries
ssg_platform_tools
Bug Fix
Proposed
+++ This bug was initially created as a clone of Bug #1889892 +++
Description of problem:
This bug was submitted by Qin Li to glibc bugzilla earlier this year, with a one-line patch, though it hasn't been merged into glibc yet:
https://sourceware.org/bugzilla/show_bug.cgi?id=25847
Version-Release number of selected component: glibc-2.27 onwards
How reproducible: reliably, try the repro from the sourceware url above
Actual results: deadlocks after 30-120 minutes on a 4-core Fedora 32 box
Expected results: should never deadlock
Additional info:
This bug in pthread conditions will deadlock the OCaml runtime, as well as Python and .NET applications.
The bug was introduced in glibc 2.27 and is still present in glibc 2.31.
I confirm that the repro from the above URL deadlocks on Fedora 32. It takes about 30-180 minutes on a 4-core server.
I further confirm that the one-line fix to glibc from the above URL applies cleanly to Fedora 32's glibc source rpm, and that the patched build does not deadlock after running the repro for more than 30 hours.
Please kindly consider merging the one-line fix into Fedora glibc.
More background about this bug, for the sake of future internet searchers:
- https://discuss.ocaml.org/t/is-there-a-known-recent-linux-locking-bug-that-affects-the-ocaml-runtime
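For readers who land here from a search, the usage pattern at risk can be sketched as follows. This is not the upstream reproducer, and it is not expected to hang in a single run; it only shows the ordinary wait/signal handoff that language runtimes build on. The names `mailbox`, `consumer`, and `handoff_once` are hypothetical. The predicate loop protects against spurious wakeups, but it cannot help if a signal is never delivered to the waiter, which is the kind of failure the sourceware report describes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical demo state; not taken from glibc or the upstream test case. */
struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    bool            has_item;
    int             item;
};

static void *consumer(void *arg)
{
    struct mailbox *mb = arg;

    pthread_mutex_lock(&mb->lock);
    /* Re-check the predicate in a loop. This handles spurious wakeups,
       but a waiter that is never woken stays blocked here forever. */
    while (!mb->has_item)
        pthread_cond_wait(&mb->ready, &mb->lock);
    assert(mb->has_item);
    int got = mb->item;
    mb->has_item = false;
    pthread_mutex_unlock(&mb->lock);

    return (void *)(intptr_t)got;
}

/* Hands one value to a consumer thread and returns what it received.
   Returns -1 if thread setup or join fails. */
int handoff_once(int value)
{
    struct mailbox mb;
    pthread_t tid;
    void *result = NULL;

    if (pthread_mutex_init(&mb.lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&mb.ready, NULL) != 0) {
        pthread_mutex_destroy(&mb.lock);
        return -1;
    }
    mb.has_item = false;
    mb.item = 0;

    int rc = pthread_create(&tid, NULL, consumer, &mb);
    if (rc == 0) {
        /* Publish the item under the lock, then signal the waiter. */
        pthread_mutex_lock(&mb.lock);
        mb.item = value;
        mb.has_item = true;
        pthread_cond_signal(&mb.ready);
        pthread_mutex_unlock(&mb.lock);

        rc = pthread_join(tid, &result);
    }

    pthread_cond_destroy(&mb.ready);
    pthread_mutex_destroy(&mb.lock);
    return rc == 0 ? (int)(intptr_t)result : -1;
}
```

Calling `handoff_once(42)` should return 42 when the wakeup is delivered. Build with `-pthread`.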
— Additional comment from Michael Bacarella on 2020-10-20 20:34:52 UTC —
will deadlock
— Additional comment from Michael Bacarella on 2020-10-20 20:35:47 UTC —
— Additional comment from Carlos O'Donell on 2020-10-27 13:21:56 UTC —
We are looking to fix this for Fedora and Red Hat Enterprise Linux 8 as this has impact to users on both platforms.
— Additional comment from Török Edwin on 2020-11-01 17:59:31 UTC —
Small modification to upstream testcase that abort()s when the loop is stuck for several iterations.
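The attached modification is not reproduced here. As a rough sketch of the idea only, a stall check of this kind might look like the following, assuming the test loop maintains a progress counter that a monitor samples periodically. The names `stall_detector`, `stall_detector_check`, and the `limit` parameter are hypothetical and do not come from the upstream test case.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: tracks whether a progress counter has stopped moving. */
struct stall_detector {
    unsigned long last_progress;    /* counter value seen at the previous check */
    unsigned int  unchanged_checks; /* consecutive checks with no change */
    unsigned int  limit;            /* checks without progress before reporting a stall */
};

void stall_detector_init(struct stall_detector *d, unsigned int limit)
{
    assert(limit > 0);
    d->last_progress = 0;
    d->unchanged_checks = 0;
    d->limit = limit;
}

/* Returns 1 once the counter has been unchanged for `limit` consecutive
   checks, and 0 otherwise. Any change in the counter resets the count. */
int stall_detector_check(struct stall_detector *d, unsigned long progress)
{
    if (progress != d->last_progress) {
        d->last_progress = progress;
        d->unchanged_checks = 0;
        return 0;
    }
    d->unchanged_checks++;
    return d->unchanged_checks >= d->limit;
}

/* Variant for a test harness: abort() so that a core dump captures the
   state of the stuck threads instead of letting the run hang silently. */
void stall_detector_check_or_abort(struct stall_detector *d, unsigned long progress)
{
    if (stall_detector_check(d, progress))
        abort();
}
```

A monitor thread could call `stall_detector_check_or_abort` once per sampling interval with the current iteration count. Aborting rather than hanging makes unattended runs lasting hours easier to triage.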
— Additional comment from Carlos O'Donell on 2020-11-10 14:25:18 UTC —
Delaying the review of this until the end of November when we have more time to review upstream patches.
— Additional comment from Fedora Program Management on 2021-04-29 17:06:51 UTC —
This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.
Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 32 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
— Additional comment from Carlos O'Donell on 2021-04-29 20:14:38 UTC —
Still a bug, and still in Rawhide.
- depends on: RHEL-30351 glibc: Improve testing coverage of POSIX Thread conditional variable [upstream] (In Progress)
- relates to: RHEL-2419 glibc: pthread_cond_wait missed wakeup (swbz#25847) [rhel-9] (Planning)