- Bug
- Resolution: Unresolved
- Major
- None
- None
- None
- 5
- False
- False
- subs-swatch-thunder
- Critical
Context
Looking at the pod logs for swatch-api-nginx-proxy in the production environment, we do see entries like:
10.128.40.197 - - [15/Sep/2025:13:42:36 +0000] "GET /api/rhsm-subscriptions/v1/tally/products/OpenShift%20Container%20Platform/Sockets?ending=2025-09-15T23:59:59.999Z&granularity=Daily&beginning=2025-08-16T00:00:00.000Z HTTP/1.1" 200 1944 "https://console.redhat.com/subscriptions/usage/openshift" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/140.0.0.0 Safari/537.36"
However, if we go to Splunk and run the following query: index=rh_rhsm namespace=rhsm-prod "/api/rhsm-subscriptions/v1/tally/products", nothing is found. We should also be able to find the traces by searching on the source name, index=rh_rhsm namespace=rhsm-prod source=swatch-api-nginx-proxy, but still no results are returned.
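To make the failing search easy to reproduce (and to re-check once a fix lands), a minimal sketch against the Splunk REST API could look like the one below. The management host, authentication token, and time window are assumptions; adjust them to however Splunk access for rhsm-prod is normally handled.

# Reproduction sketch: run the same search via the Splunk REST API.
# SPLUNK_HOST and SPLUNK_TOKEN are hypothetical names; use real rhsm-prod credentials.
import os
import requests

splunk_host = os.environ["SPLUNK_HOST"]    # e.g. splunk.example.com (assumption)
splunk_token = os.environ["SPLUNK_TOKEN"]  # Splunk authentication token (assumption)

query = "search index=rh_rhsm namespace=rhsm-prod source=swatch-api-nginx-proxy earliest=-24h"

resp = requests.post(
    f"https://{splunk_host}:8089/services/search/jobs/export",
    headers={"Authorization": f"Bearer {splunk_token}"},
    data={"search": query, "output_mode": "json"},
    timeout=60,
)
resp.raise_for_status()
# output_mode=json returns one JSON object per line; an empty body reproduces the problem.
events = [line for line in resp.text.splitlines() if line.strip()]
print(f"{len(events)} events found for source=swatch-api-nginx-proxy")

An empty result reproduces the issue; once the nginx otel pipeline is fixed, the same script should return the proxied /api/rhsm-subscriptions requests.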
For additional context, I checked the splunk logs and saw these traces:
2025-09-15T14:06:10.853Z error exporterhelper/queue_sender.go:90 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "signalfx", "error": "not retryable error: Permanent error: \"HTTP/2.0 401 Unauthorized\\r\\nContent-Length: 0\\r\\nDate: Mon, 15 Sep 2025 14:06:10 GMT\\r\\nServer: istio-envoy\\r\\nWww-Authenticate: Basic realm=\\\"Splunk\\\"\\r\\nX-Envoy-Upstream-Service-Time: 7\\r\\n\\r\\n\"", "dropped_items": 27} go.opentelemetry.io/collector/exporter/exporterhelper.newQueueSender.func1 go.opentelemetry.io/collector/exporter@v0.104.0/exporterhelper/queue_sender.go:90 go.opentelemetry.io/collector/exporter/internal/queue.(*boundedMemoryQueue[...]).Consume go.opentelemetry.io/collector/exporter@v0.104.0/internal/queue/bounded_memory_queue.go:52 go.opentelemetry.io/collector/exporter/internal/queue.(*Consumers[...]).Start.func1 go.opentelemetry.io/collector/exporter@v0.104.0/internal/queue/consumers.go:43
It fails with a 401 Unauthorized error when uploading to signalfx, although I doubt the issue is related to the above error.
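To check whether that 401 is simply a rejected access token rather than something inside the collector pipeline, one option is to replay a single datapoint against the same ingest endpoint the signalfx exporter is configured with. This is only a sketch; the environment variable names below are assumptions, and the endpoint and token should be taken from the collector's actual configuration and secret.

# Token check sketch for the signalfx exporter 401.
# SIGNALFX_INGEST_URL and SIGNALFX_ACCESS_TOKEN are hypothetical names; use the
# values the collector's signalfx exporter is actually configured with.
import os
import requests

endpoint = os.environ["SIGNALFX_INGEST_URL"]   # e.g. https://ingest.<realm>.signalfx.com (assumption)
token = os.environ["SIGNALFX_ACCESS_TOKEN"]

resp = requests.post(
    f"{endpoint}/v2/datapoint",
    headers={"X-SF-Token": token, "Content-Type": "application/json"},
    json={"gauge": [{"metric": "swatch.debug.token_check", "value": 1}]},
    timeout=30,
)
# 401 reproduces the exporter failure outside the collector (token/auth problem);
# 200 suggests the problem sits between the collector and the istio-envoy gateway.
print(resp.status_code, resp.text[:200])

If this reproduces the 401, the follow-up investigation can focus on the token or gateway configuration rather than the collector itself.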
Acceptance Criteria
- Investigate/fix the splunk configuration for our nginx otel service. Document how to reproduce the error and prove that it is fixed.
- Investigate whether the signalfx error is related; if not, open another JIRA ticket for further investigation.
Links
- Nginx Otel github repository: https://github.com/RedHatInsights/nginx-otel
- is blocked by SWATCH-3977: Make traces for nginx proxy appear in splunk logs (Release Pending)
- is related to SWATCH-3961: Improve performance of the "/v1/instances/billing_account_ids" API (Closed)