-
Bug
-
Resolution: Done
-
Critical
-
Impediment
-
Tracing Sprint # 242, Tracing Sprint # 260
-
Customer Escalated
The Tempo ingester panics with the following error when the querier tries to read data from it (the "live data"):
level=info ts=2023-08-30T16:32:31.563913106Z caller=main.go:215 msg="initialising OpenTracing tracer"
level=info ts=2023-08-30T16:32:31.565248933Z caller=main.go:102 msg="Starting Tempo" version="(version=2.1.1, branch=HEAD, revision=4157d7620)"
level=info ts=2023-08-30T16:32:31.574473679Z caller=server.go:334 http=[::]:3200 grpc=[::]:9095 msg="server listening on addresses"
level=info ts=2023-08-30T16:32:31.575350273Z caller=memberlist_client.go:437 msg="Using memberlist cluster label and node name" cluster_label= node=tempo-simplest36-ingester-0-e7855eb6
level=info ts=2023-08-30T16:32:31.575454147Z caller=module_service.go:82 msg=initialising module=internal-server
level=info ts=2023-08-30T16:32:31.575501301Z caller=module_service.go:82 msg=initialising module=store
level=info ts=2023-08-30T16:32:31.575529005Z caller=module_service.go:82 msg=initialising module=server
level=info ts=2023-08-30T16:32:31.575616997Z caller=module_service.go:82 msg=initialising module=memberlist-kv
level=info ts=2023-08-30T16:32:31.575637887Z caller=module_service.go:82 msg=initialising module=overrides
level=info ts=2023-08-30T16:32:31.575692647Z caller=module_service.go:82 msg=initialising module=ingester
level=info ts=2023-08-30T16:32:31.575713835Z caller=ingester.go:327 msg="beginning wal replay"
level=warn ts=2023-08-30T16:32:31.575991304Z caller=wal.go:112 msg="unowned file entry ignored during wal replay" file=blocks err=null
level=info ts=2023-08-30T16:32:31.576022441Z caller=ingester.go:365 msg="wal replay complete"
level=info ts=2023-08-30T16:32:31.576395076Z caller=ingester.go:379 msg="reloading local blocks" tenants=0
level=info ts=2023-08-30T16:32:31.576473854Z caller=app.go:188 msg="Tempo started"
level=info ts=2023-08-30T16:32:31.576693103Z caller=memberlist_client.go:543 msg="memberlist fast-join starting" nodes_found=1 to_join=2
level=info ts=2023-08-30T16:32:31.580934342Z caller=memberlist_client.go:563 msg="memberlist fast-join finished" joined_nodes=2 elapsed_time=4.245506ms
level=info ts=2023-08-30T16:32:31.580970147Z caller=memberlist_client.go:576 msg="joining memberlist cluster" join_members=tempo-simplest36-gossip-ring
level=info ts=2023-08-30T16:32:31.580996341Z caller=lifecycler.go:576 msg="instance not found in ring, adding with no tokens" ring=ingester
level=info ts=2023-08-30T16:32:31.581205744Z caller=lifecycler.go:416 msg="auto-joining cluster after timeout" ring=ingester
level=info ts=2023-08-30T16:32:31.583029353Z caller=memberlist_client.go:595 msg="joining memberlist cluster succeeded" reached_nodes=2 elapsed_time=2.05928ms
panic: runtime error: index out of range [7] with length 7

goroutine 1272 [running]:
github.com/segmentio/parquet-go.(*byteArrayPage).index(...)
    /home/ploffay/projects/grafana/tempo/vendor/github.com/segmentio/parquet-go/page.go:982
github.com/segmentio/parquet-go.(*byteArrayDictionary).lookupString(0xc0001f15e0, {0xc000ffe000, 0x33, 0x400}, {{0xc0011e0000, 0x33, 0x18}})
    /home/ploffay/projects/grafana/tempo/vendor/github.com/segmentio/parquet-go/dictionary_purego.go:42 +0x162
github.com/segmentio/parquet-go.(*byteArrayDictionary).Lookup(0xc0001f15e0, {0xc000ffe000, 0x33, 0x400}, {0xc0011e0000, 0x33, 0x3e8})
    /home/ploffay/projects/grafana/tempo/vendor/github.com/segmentio/parquet-go/dictionary.go:748 +0x142
github.com/segmentio/parquet-go.(*indexedPageValues).ReadValues(0xc0011321c0, {0xc0011e0000, 0x33, 0x3e8})
    /home/ploffay/projects/grafana/tempo/vendor/github.com/segmentio/parquet-go/dictionary.go:1332 +0xd6
github.com/segmentio/parquet-go.(*repeatedPageValues).ReadValues(0xc000c8a060, {0xc0011e0000, 0x3e8, 0x3e8})
    /home/ploffay/projects/grafana/tempo/vendor/github.com/segmentio/parquet-go/page_values.go:98 +0x192
github.com/grafana/tempo/pkg/parquetquery.(*ColumnIterator).iterate.func3.2(0xc0011dff48, 0xc0011dfec8, 0xc00021c9c0, {0xc0011e0000, 0x3e8, 0x3e8}, 0x3e8, {0x283dc58, 0xc0006261b0}, {0x2851920, ...})
    /home/ploffay/projects/grafana/tempo/pkg/parquetquery/iters.go:406 +0x26c
github.com/grafana/tempo/pkg/parquetquery.(*ColumnIterator).iterate.func3(0xc00021c9c0, 0xc0011dff48, 0xc0011dfec8, {0xc0011e0000, 0x3e8, 0x3e8}, 0x3e8, {0x283dc58, 0xc0006261b0}, {0x2849b48, ...})
    /home/ploffay/projects/grafana/tempo/pkg/parquetquery/iters.go:459 +0x1e4
github.com/grafana/tempo/pkg/parquetquery.(*ColumnIterator).iterate(0xc00021c9c0, {0x283dc58, 0xc0006261b0}, 0x3e8)
    /home/ploffay/projects/grafana/tempo/pkg/parquetquery/iters.go:464 +0x676
github.com/grafana/tempo/pkg/parquetquery.NewColumnIterator.func1()
    /home/ploffay/projects/grafana/tempo/pkg/parquetquery/iters.go:299 +0x42
created by github.com/grafana/tempo/pkg/parquetquery.(*ColumnIterator).next
    /home/ploffay/projects/grafana/tempo/pkg/parquetquery/iters.go:485 +0x60
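For context, the message `index out of range [7] with length 7` is what the Go runtime reports when a slice holding 7 elements is indexed at position 7. The sketch below is a toy, offsets-based dictionary with a bounds-checked lookup that returns an error instead of panicking. It is only an illustration of that failure class: `toyDictionary`, `newToyDictionary` and `Lookup` are hypothetical names, this is not the parquet-go implementation, and it says nothing about the root cause on this architecture.

```go
package main

import "fmt"

// toyDictionary is a hypothetical stand-in for a byte-array dictionary.
// Values are stored back to back in data; offsets[i] and offsets[i+1]
// delimit value i. This is NOT the parquet-go type.
type toyDictionary struct {
	offsets []uint32
	data    []byte
}

// newToyDictionary packs the given values into one buffer plus offsets.
func newToyDictionary(values ...string) *toyDictionary {
	d := &toyDictionary{offsets: []uint32{0}}
	for _, v := range values {
		d.data = append(d.data, v...)
		d.offsets = append(d.offsets, uint32(len(d.data)))
	}
	return d
}

// Len returns the number of values held by the dictionary.
func (d *toyDictionary) Len() int {
	if len(d.offsets) == 0 {
		return 0
	}
	return len(d.offsets) - 1
}

// Lookup returns value i, or an error (rather than a panic) when i is
// outside the dictionary.
func (d *toyDictionary) Lookup(i int) (string, error) {
	if i < 0 || i >= d.Len() {
		return "", fmt.Errorf("dictionary index %d out of range (dictionary has %d values)", i, d.Len())
	}
	return string(d.data[d.offsets[i]:d.offsets[i+1]]), nil
}

func main() {
	d := newToyDictionary("mysql", "redis", "frontend")
	for _, i := range []int{0, 2, 3} {
		v, err := d.Lookup(i)
		fmt.Println(i, v, err)
	}
}
```

With a check like this, a corrupted or misread index surfaces as a query error on one request instead of taking the whole ingester process down.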
The following error then appears in the querier:
level=warn ts=2023-08-30T16:35:40.234230788Z caller=logging.go:111 traceID=0d4b5479201902d0 msg="GET /querier/api/search?end=1693413323&limit=20&maxDuration=0s&minDuration=0s&start=1693411540&tags=service.name%3Dmysql (500) 676.589µs Response: \"error querying ingesters in Querier.Search: failed to get response from ingesters: failed to execute f() for 10.120.3.23:9095: rpc error: code = Unavailable desc = connection error: desc = \\\"transport: Error while dialing dial tcp 10.120.3.23:9095: connect: connection refused\\\"\\n\" ws: false; Accept-Encoding: gzip; User-Agent: Go-http-client/1.1; X-Scope-Orgid: single-tenant; "
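To reproduce the failing search outside the UI, the request from the querier log can be rebuilt by hand. The following is a minimal sketch, not part of the original report: the base URL `http://localhost:3200` and the public `/api/search` path are assumptions (the log shows the internal `/querier/api/search` path), and `buildSearchRequest` is a hypothetical helper. The query parameters and the tenant header value are copied from the log line above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// buildSearchRequest assembles a Tempo search request carrying the same
// query parameters as the failing request in the querier log.
// baseURL is an assumption; point it at whatever exposes the search API.
func buildSearchRequest(baseURL string) (*http.Request, error) {
	q := url.Values{}
	q.Set("tags", "service.name=mysql")
	q.Set("limit", "20")
	q.Set("start", "1693411540")
	q.Set("end", "1693413323")
	q.Set("minDuration", "0s")
	q.Set("maxDuration", "0s")

	req, err := http.NewRequest(http.MethodGet, baseURL+"/api/search?"+q.Encode(), nil)
	if err != nil {
		return nil, err
	}
	// Tenant header as it appears in the log.
	req.Header.Set("X-Scope-OrgID", "single-tenant")
	return req, nil
}

func main() {
	req, err := buildSearchRequest("http://localhost:3200")
	if err != nil {
		panic(err)
	}
	// Only print the request; sending it is left to the caller, e.g.
	// http.DefaultClient.Do(req) against a port-forwarded query frontend.
	fmt.Println(req.Method, req.URL.String())
}
```

Once the ingester has crashed, a request like this is expected to come back with the 500 and `connection refused` text shown in the log, since the querier can no longer dial the ingester's gRPC port.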
The issue occurs on s390x (IBM Z) and does not appear on other architectures (e.g. IBM P).
Reported upstream: https://github.com/parquet-go/parquet-go/issues/56
I tried rebuilding Tempo with -tags purego (a build tag used by the parquet library) and got the same result:
# add -tags purego in the Makefile first
GOOS=linux GOARCH=s390x make tempo
docker buildx build --push --platform linux/s390x --tag pavolloffay/tempo-ibmz-go120:2.1.1 --build-arg=TARGETARCH=s390x -f ./cmd/tempo/Dockerfile .
Tempo CR:
kubectl apply -f - <<EOF
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: simplest36
spec:
  images:
    tempo: pavolloffay/tempo-ibmz-go120:2.1.1
  resources:
    total:
      limits:
        cpu: '2'
        memory: 2Gi
  search:
    defaultResultLimit: 20
    maxDuration: 0s
  managementState: Managed
  limits:
    global:
      ingestion: {}
      query:
        maxSearchDuration: 0s
  template:
    compactor: {}
    distributor:
      replicas: 1
    gateway:
      component: {}
      enabled: false
      ingress:
        route: {}
    ingester:
      replicas: 1
    querier: {}
    queryFrontend:
      component: {}
      jaegerQuery:
        enabled: true
        ingress:
          route:
            termination: edge
          type: route
  replicationFactor: 1
  storage:
    secret:
      name: minio-test
      type: s3
  storageSize: 200M
  retention:
    global:
      traces: 48h0m0s
EOF
- is incorporated by: TRACING-3360 Release notes (Closed)