- Bug
- Resolution: Done
- Normal
- Logging 5.5.0
- False
- None
- False
- NEW
- VERIFIED
- Log Storage - Sprint 223, Log Storage - Sprint 224, Log Storage - Sprint 225
Description:
The LokiRequestErrors alert does not fire after the querier is taken down while the LokiStack is in unmanaged mode. query/query_range requests must be run to trigger the 5xx errors.
Steps to Reproduce:
1) Deploy CLO and Loki Operator
2) Create LokiStack CR and forward logs to gateway
3) Edit the LokiStack CR and switch it to 'unmanaged'
4) Delete the querier deployment (the querier should not be running after this point)
$ oc get deployments lokistack-dev-querier -n openshift-logging
Error from server (NotFound): deployments.apps "lokistack-dev-querier" not found
5) Fire some queries
logcli -o raw --tls-skip-verify --bearer-token="$(oc whoami -t)" --addr "https://lokistack-dev-openshift-logging.apps.kbharti-410-100.qe.devcluster.openshift.com/api/logs/v1/application" query '{log_type="application"}'
logcli -o raw --tls-skip-verify --bearer-token="$(oc whoami -t)" --addr "https://lokistack-dev-openshift-logging.apps.kbharti-410-100.qe.devcluster.openshift.com/api/logs/v1/audit" query '{log_type="audit"}'
logcli -o raw --tls-skip-verify --bearer-token="$(oc whoami -t)" --addr "https://lokistack-dev-openshift-logging.apps.kbharti-410-100.qe.devcluster.openshift.com/api/logs/v1/infrastructure" query '{log_type="infrastructure"}'
The queries fail with a timeout error:
Error response from server: <html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>
(<nil>) attempts remaining: 0
Query failed: Run out of attempts while querying the server
Logs on Gateway:
level=warn name=lokistack-gateway ts=2022-08-04T17:38:53.784246849Z caller=stdlib.go:105 caller=reverseproxy.go:489 msg="http: proxy error: context canceled"
level=warn name=lokistack-gateway ts=2022-08-04T17:38:53.784304735Z caller=instrumentation.go:33 request=lokistack-dev-gateway-84996dbb9-6dxwl/2jSiMJWyj5-010955 proto=HTTP/1.1 method=GET status=502 content= path=/api/logs/v1/infrastructure/loki/api/v1/query_range duration=30.001070673s bytes=0
level=warn name=lokistack-gateway ts=2022-08-04T17:41:27.417207863Z caller=stdlib.go:105 caller=reverseproxy.go:489 msg="http: proxy error: context canceled"
level=warn name=lokistack-gateway ts=2022-08-04T17:41:27.41727024Z caller=instrumentation.go:33 request=lokistack-dev-gateway-84996dbb9-6dxwl/2jSiMJWyj5-012258 proto=HTTP/1.1 method=GET status=502 content= path=/api/logs/v1/infrastructure/loki/api/v1/query_range duration=30.001531582s bytes=0
level=warn name=lokistack-gateway ts=2022-08-04T22:28:43.749167657Z caller=stdlib.go:105 caller=reverseproxy.go:489 msg="http: proxy error: context canceled"
level=warn name=lokistack-gateway ts=2022-08-04T22:28:43.749228224Z caller=instrumentation.go:33 request=lokistack-dev-gateway-84996dbb9-6dxwl/2jSiMJWyj5-186110 proto=HTTP/1.1 method=GET status=502 content= path=/api/logs/v1/infrastructure/loki/api/v1/query_range duration=30.001512461s bytes=0
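To check how many requests the gateway actually answered with a 5xx status, the status= fields in log lines like the ones above can be tallied. The following is only a minimal sketch: the function name count_gateway_statuses and the file name gateway.log are hypothetical, and it assumes the gateway log has been saved to a file with one log entry per line.

```shell
# Hypothetical helper: tally HTTP status codes found in gateway log lines.
# Reads log lines on stdin and prints one "<status> <count>" pair per line.
# Lines without a status= field (e.g. the proxy error messages) are ignored.
count_gateway_statuses() {
  grep -o 'status=[0-9][0-9]*' \
    | sort \
    | uniq -c \
    | awk '{ sub(/^status=/, "", $2); print $2, $1 }'
}

# Example (assumes the gateway log was saved to gateway.log beforehand):
#   count_gateway_statuses < gateway.log
```

A large share of 5xx counts in this tally, combined with no alert, would match the behaviour reported here.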
OCP Version: 4.10
How reproducible: Always
Actual Result:
Alert is not firing
Expected Result:
The LokiRequestErrors alert should fire when most requests are answered with a 5xx error