Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: rhel-8.10
Component/s: mod_http2
Labels: None
Regression: No
Severity: None
Pool Team: rhel-sst-cs-stacks
Sub-System Group: ssg_core_services
Story Points: None
Blocked: False
Blocked Reason: None
Product Documentation Required: None
Sprint: None
Preliminary Testing: None
Test Coverage: None

Architecture: x86_64
Planning: None

What were you trying to do that didn't work?

If a client aborts a single request (stream) on an HTTP/2 connection, this triggers some DoS protections in mod_http2. The stream reset is handled on the session here:
https://github.com/icing/mod_h2/blob/v1.15.19/mod_http2/h2_session.c#L402
https://github.com/icing/mod_h2/blob/v1.15.19/mod_http2/h2_mplx.c#L1143
This in turn calls m_be_annoyed:
https://github.com/icing/mod_h2/blob/v1.15.19/mod_http2/h2_mplx.c#L987
The first time this occurs, it lowers the maximum number of active tasks (the worker limit) allowed for the session to at most 16, and subsequent invocations lower it further. It also unschedules any currently running tasks above the lowered limit, which causes the unscheduled requests to fail (the browser reports ERR_HTTP2_PROTOCOL_ERROR). With http2 trace logging enabled, the tasks unscheduled after the reset appear as follows:

[Tue Sep 10 17:05:20.314728 2024] [http2:debug] [pid 1768919:tid 1769073] h2_session.c(347): [client 127.0.0.1:57224] AH03066: h2_session(170,BUSY,20): recv FRAME[RST_STREAM[length=4, flags=0, stream=351]], frames=248/2051 (r/s)
[Tue Sep 10 17:05:20.314735 2024] [http2:debug] [pid 1768919:tid 1769073] h2_session.c(396): [client 127.0.0.1:57224] AH03067: h2_stream(170-351): RST_STREAM by client, error=8
[Tue Sep 10 17:05:20.314742 2024] [http2:trace1] [pid 1768919:tid 1769073] h2_stream.c(302): [client 127.0.0.1:57224] h2_stream(170-351,HALF_CLOSED_REMOTE): transit to [CLOSED]
[Tue Sep 10 17:05:20.314748 2024] [http2:trace1] [pid 1768919:tid 1769073] h2_stream.c(211): [client 127.0.0.1:57224] h2_stream(170-351,CLOSED): closing input
[Tue Sep 10 17:05:20.314755 2024] [http2:trace2] [pid 1768919:tid 1769073] h2_session.c(1976): [client 127.0.0.1:57224] h2_stream(170-351,CLOSED): entered state
[Tue Sep 10 17:05:20.314764 2024] [http2:trace1] [pid 1768919:tid 1769073] h2_mplx.c(1002): [client 127.0.0.1:57224] h2_mplx(170): mood update, decreasing worker limit to 16
[Tue Sep 10 17:05:20.314800 2024] [http2:trace2] [pid 1768919:tid 1769073] h2_mplx.c(949): [client 127.0.0.1:57224] h2_mplx(184-379): unschedule, resetting task for redo later
[Tue Sep 10 17:05:20.314810 2024] [http2:trace2] [pid 1768919:tid 1769073] h2_mplx.c(949): [client 127.0.0.1:57224] h2_mplx(184-377): unschedule, resetting task for redo later
[Tue Sep 10 17:05:20.314820 2024] [http2:trace2] [pid 1768919:tid 1769073] h2_mplx.c(949): [client 127.0.0.1:57224] h2_mplx(184-375): unschedule, resetting task for redo later
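
The limit lowering and unscheduling described above can be modeled roughly as follows. This is a toy Python model written for illustration, not mod_http2's C code: the class and method names are hypothetical, the starting limit of 32 is an assumption, the halving on later invocations is an assumption (the report only says the limit goes "lower"), and the choice of which tasks get unscheduled is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class MplxSketch:
    """Toy model of a per-session task limit (hypothetical, not mod_http2 code)."""

    limit_active: int = 32                       # assumed starting worker limit
    active: list = field(default_factory=list)   # stream ids of running tasks

    def be_annoyed(self, unschedule: bool = True) -> list:
        """Lower the limit after a client reset; return the unscheduled stream ids."""
        if self.limit_active > 16:
            # First invocation: cap the limit at 16, as described in the report.
            self.limit_active = 16
        elif self.limit_active > 2:
            # Later invocations: lower it further (halving is an assumption).
            self.limit_active //= 2

        dropped = []
        if unschedule:
            # Pull running tasks back until the session fits the new limit.
            # Picking the most recently added tasks is an assumption.
            while len(self.active) > self.limit_active:
                dropped.append(self.active.pop())
        return dropped
```

With 20 running tasks and a limit of 32, one call to `be_annoyed()` lowers the limit to 16 and unschedules 4 tasks that had nothing to do with the reset stream. Calling `be_annoyed(unschedule=False)` still lowers the limit but leaves running tasks alone, which is how this report characterizes the effect of the requested fix.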

We need a backport of the following fix, which removes that task unscheduling:

https://github.com/apache/httpd/pull/317/commits/ae6dedd0acc246a4ce6be2fa8fc64b7d1851bdae#diff-3e0956e7d573ccb745c0377b22eb6b15d6599c14cd04643b99837d4b946adac2

What is the impact of this issue to you?

Seemingly random HTTP/2 failures

Please provide the package NVR for which the bug is seen:

1.15.7-10.module+el8.10.0+21653+eaff63f0

How reproducible is this bug?:

Consistently, under highly concurrent request traffic that includes a client reset

Expected results

A client reset of one request does not cause other requests to fail

Actual results

Other concurrent requests can fail when a client resets one request
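
To check whether a given failure matches this pattern, the three trace messages quoted in the description can be pulled out of an error log. The following is a minimal sketch: the function name is hypothetical, and the regular expressions are derived only from the log lines quoted in this report, so other mod_http2 versions may word these messages differently. The numbers in parentheses are returned as-is without interpretation.

```python
import re

# Patterns taken from the log excerpt in this report (http2 debug/trace levels).
RST = re.compile(r"h2_stream\((\d+)-(\d+)\): RST_STREAM by client, error=(\d+)")
MOOD = re.compile(r"h2_mplx\((\d+)\): mood update, decreasing worker limit to (\d+)")
UNSCHED = re.compile(r"h2_mplx\((\d+)-(\d+)\): unschedule, resetting task for redo later")


def summarize_resets(lines):
    """Collect client resets, worker-limit drops, and unscheduled tasks from log lines."""
    summary = {"client_resets": [], "limit_drops": [], "unscheduled": []}
    for line in lines:
        if (m := RST.search(line)):
            # (first id, second id, error code) exactly as logged
            summary["client_resets"].append((int(m[1]), int(m[2]), int(m[3])))
        elif (m := MOOD.search(line)):
            # (id, new worker limit)
            summary["limit_drops"].append((int(m[1]), int(m[2])))
        elif (m := UNSCHED.search(line)):
            # (first id, second id) of the task that was unscheduled
            summary["unscheduled"].append((int(m[1]), int(m[2])))
    return summary
```

A typical use would be `summarize_resets(open(path))` on the error log, where `path` is wherever the server writes it. A non-empty `unscheduled` list shortly after a `client_resets` entry is the signature described in this report.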

Assignee: Lubos Uhliarik

Reporter: Aaron Ogburn

Developer: Lubos Uhliarik

QA Contact: rhel-cs-infra-services-qe

Votes: 0

Watchers: 2

Created: 2024/09/16 2:25 PM

Updated: 2024/11/20 10:20 AM
