Type: Bug
Resolution: Done
Priority: Undefined
Fix Version/s: None
Affects Version/s: rhosdt-3.5
Component/s: Tempo
Labels:
None

Activity Type:
Quality / Stability / Reliability
Story Points:
1
Blocked:
False
Blocked Reason:
None
Ready:
False
Epic Link:
Use network policies for OpenShift Operators
Git Pull Request:
https://github.com/grafana/tempo-operator/blob/main/api/tempo/v1alpha1/tempostack_types.go#L679
Intelligence Requested:
Market:

Sprint:
Tracing Sprint # 272 - Release
Severity:
Important

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Description of problem:

When traces from multiple applications are captured in a single TempoStack instance, querying them over a longer time range (for example, 6 hours) with a trace limit of 20 or more results in a 504 Gateway Timeout.

Version-Release number of selected component (if applicable):

Tempo Operator 0.15.4-1

How reproducible:

100%

Steps to Reproduce:

1. Install and configure a multi-tenant TempoStack.
2. Set up an OpenTelemetryCollector to collect the traces.
3. Create multiple namespaces (I created 25).
4. Deploy Instrumentation and applications in the namespaces created above.
5. Wait for some time so that TempoStack captures traces (for example, 3-4 hours).
6. Query traces for one service over a time range of 6 hours or more. The query results in a 504 Gateway Timeout.
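The query in step 6 can be sketched as a URL built against the gateway. This is a minimal sketch, not the exact request used in the report: the path prefix `/api/traces/v1/dev/api/traces` is taken from the gateway log in this issue, while the host `gateway.example`, the service name `frontend`, and the Jaeger-style query parameters (`service`, `start`, `end` in microseconds, `limit`) are assumptions for illustration.

```python
from urllib.parse import urlencode

MICROS_PER_HOUR = 3600 * 1_000_000


def build_trace_query_url(base_url, tenant, service, end_us,
                          lookback_hours=6, limit=20):
    """Build a trace search URL for one service over a lookback window.

    Assumes Jaeger-style parameters with start/end in microseconds since
    the epoch; adjust if the gateway in use expects something else.
    """
    start_us = end_us - lookback_hours * MICROS_PER_HOUR
    params = {
        "service": service,
        "start": start_us,
        "end": end_us,
        "limit": limit,
    }
    # Path shape copied from the gateway log: /api/traces/v1/<tenant>/api/traces
    path = f"/api/traces/v1/{tenant}/api/traces"
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"


if __name__ == "__main__":
    import time

    now_us = int(time.time() * 1_000_000)
    # Hypothetical host and service name.
    print(build_trace_query_url("https://gateway.example", "dev",
                                "frontend", now_us))
```

Sending the request additionally needs a bearer token accepted by the gateway for the tenant, which is left out here.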

Actual results:

A 504 Gateway Timeout is returned when querying traces over a longer time range with the result limit set to 20.

Expected results:

Tempo should return the traces without timing out.

Additional info:

The test was carried out with AWS S3 as object storage.
The querier and query-frontend pods were configured with 2 vCPU and 5 Gi of memory.
Neither the querier nor the query-frontend pods were OOMKilled.

---
The gateway logs the following errors when the query fails:

level=warn name=observatorium ts=2025-05-29T19:57:37.387386306Z caller=stdlib.go:105 caller=reverseproxy.go:661 msg="http: proxy error: context deadline exceeded"
level=warn name=observatorium ts=2025-05-29T19:57:37.387439501Z caller=instrumentation.go:33 request=tempo-simplest-gateway-7d4c694d98-l4ft4/lKPOKWFDP3-000164 proto=HTTP/1.1 method=GET status=502 content= path=/api/traces/v1/dev/api/traces duration=30.000367897s bytes=0
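The two log lines above are in logfmt (`key=value` pairs, with quoted values where spaces occur). A small sketch of pulling out the status and duration is shown below; it uses only the Python standard library, the helper names are hypothetical, and it assumes durations are printed in plain seconds as in this log. The parsed request shows status 502 and a duration of almost exactly 30 seconds, which, together with "context deadline exceeded", suggests a 30-second deadline somewhere in the request path rather than a resource failure.

```python
import shlex

# Second gateway log line from this issue, copied verbatim.
SAMPLE_LINE = (
    'level=warn name=observatorium ts=2025-05-29T19:57:37.387439501Z '
    'caller=instrumentation.go:33 '
    'request=tempo-simplest-gateway-7d4c694d98-l4ft4/lKPOKWFDP3-000164 '
    'proto=HTTP/1.1 method=GET status=502 content= '
    'path=/api/traces/v1/dev/api/traces duration=30.000367897s bytes=0'
)


def parse_logfmt(line):
    """Split a logfmt line into a dict; a repeated key keeps its last value."""
    fields = {}
    for token in shlex.split(line):  # shlex keeps quoted values together
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def summarize_request(line):
    """Extract status, path, and duration from a gateway request log line.

    Assumes the duration has a plain seconds suffix ("30.0s"); other Go
    duration forms such as "500ms" or "1m30s" are not handled.
    """
    fields = parse_logfmt(line)
    return {
        "status": int(fields["status"]),
        "path": fields["path"],
        "duration_s": float(fields["duration"].rstrip("s")),
    }


if __name__ == "__main__":
    print(summarize_request(SAMPLE_LINE))
```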

---
The issue appears to be on either the query-frontend or the querier side.

relates to

TRACING-5247 Add network policies for RHOSDT Operators

Closed

Assignee: Pavol Loffay

Reporter: Dhruv Gautam

Votes: 0

Watchers: 2

Created: 2025/05/29 8:21 PM

Updated: 2025/09/12 11:40 PM

Resolved: 2025/06/16 1:59 PM
