Bug
Resolution: Done
Tracing Sprint # 241, Tracing Sprint # 243, Tracing Sprint # 244, Tracing Sprint # 245, Tracing Sprint # 246
Description of the problem:
When a TempoStack is created on an OCP cluster with a single-stack IPv6 network, the compactor and ingester pods go into CrashLoopBackOff with the following errors.
$ oc logs tempo-tempostack-compactor-7c9469b6b9-gfkqq
level=info ts=2023-06-01T05:55:29.618673142Z caller=main.go:197 msg="initialising OpenTracing tracer"
level=info ts=2023-06-01T05:55:29.675237393Z caller=main.go:114 msg="Starting Tempo" version="(version=HEAD-4f4282f, branch=HEAD, revision=4f4282f8)"
level=info ts=2023-06-01T05:55:29.681174588Z caller=server.go:323 http=[::]:3200 grpc=[::]:9095 msg="server listening on addresses"
level=warn ts=2023-06-01T05:55:29.681581106Z caller=util.go:181 msg="error getting interface" inf=en0 err="route ip+net: no such network interface"
level=info ts=2023-06-01T05:55:29.681606675Z caller=memberlist_client.go:437 msg="Using memberlist cluster label and node name" cluster_label= node=tempo-tempostack-compactor-7c9469b6b9-gfkqq-322b11f5
level=error ts=2023-06-01T05:55:29.681618581Z caller=main.go:117 msg="error running Tempo" err="failed to init module services error initialising module: compactor: failed to create compactor No address found for [eth0 en0]"

$ oc logs tempo-tempostack-ingester-0
level=info ts=2023-06-01T05:55:37.986122948Z caller=main.go:197 msg="initialising OpenTracing tracer"
level=info ts=2023-06-01T05:55:37.986710149Z caller=main.go:114 msg="Starting Tempo" version="(version=HEAD-4f4282f, branch=HEAD, revision=4f4282f8)"
level=info ts=2023-06-01T05:55:37.993210743Z caller=server.go:323 http=[::]:3200 grpc=[::]:9095 msg="server listening on addresses"
level=error ts=2023-06-01T05:55:37.993450954Z caller=main.go:117 msg="error running Tempo" err="failed to init module services error initialising module: ingester: failed to create ingester: NewLifecycler failed: No address found for [eth0]"
Version of components:
opentelemetry-operator.v0.74.0-5
tempo-operator.v0.1.0-6
OCP Server Version: 4.14.0-0.nightly-2023-05-31-080250
How Reproducible:
Always
Steps to reproduce the issue:
*Deploy an OCP cluster with a single-stack IPv6 network.
*Install the Distributed Tracing data collection and Tempo operators.
*Create a minio instance with the following config.
$ cat minio.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: minio
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app.kubernetes.io/name: minio
  name: minio
  namespace: minio
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
  namespace: minio
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: minio
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app.kubernetes.io/name: minio
    spec:
      containers:
        - command:
            - /bin/sh
            - -c
            - |
              mkdir -p /storage/tempo && \
              minio server /storage
          env:
            - name: MINIO_ACCESS_KEY
              value: tempo
            - name: MINIO_SECRET_KEY
              value: supersecret
          image: minio/minio
          name: minio
          ports:
            - containerPort: 9000
          volumeMounts:
            - mountPath: /storage
              name: storage
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: minio
---
apiVersion: v1
kind: Service
metadata:
  name: minio
  namespace: minio
spec:
  ports:
    - port: 9000
      protocol: TCP
      targetPort: 9000
  selector:
    app.kubernetes.io/name: minio
  type: ClusterIP
---
apiVersion: v1
kind: Secret
metadata:
  name: minio-test
  namespace: minio
stringData:
  endpoint: http://minio.minio.svc:9000
  bucket: tempo
  access_key_id: tempo
  access_key_secret: supersecret
type: Opaque
*Deploy the TempoStack with the following config.
$ oc project openshift-operators
$ cat minio-tempostack.yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: minio-secret
  namespace: openshift-operators
stringData:
  endpoint: http://minio.minio.svc:9000
  bucket: tempo
  access_key_id: tempo
  access_key_secret: supersecret
type: Opaque
---
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: tempostack
  namespace: openshift-operators
spec:
  observability:
    tracing:
      jaeger_agent_endpoint: '127.0.0.1:6831'
  resources:
    total:
      limits:
        cpu: 2000m
        memory: 2Gi
  template:
    queryFrontend:
      jaegerQuery:
        enabled: true
        ingress:
          route:
            termination: edge
          type: route
  storage:
    secret:
      type: s3
      name: minio-secret
  storageSize: 20Gi
  storageClassName: nfs
*Check that the TempoStack compactor and ingester pods are in CrashLoopBackOff.
$ oc get pods
NAME                                                         READY   STATUS             RESTARTS        AGE
opentelemetry-operator-controller-manager-78c57648f6-mh4s6   2/2     Running            0               132m
tempo-operator-controller-manager-76c49b54b7-zrxvz           2/2     Running            0               132m
tempo-tempostack-compactor-7c9469b6b9-gfkqq                  0/1     CrashLoopBackOff   6 (3m58s ago)   9m39s
tempo-tempostack-distributor-5cb86947b6-84g6b                1/1     Running            0               9m39s
tempo-tempostack-ingester-0                                  0/1     CrashLoopBackOff   6 (3m50s ago)   9m39s
tempo-tempostack-querier-6d5f45d987-n7wp8                    1/1     Running            0               9m39s
tempo-tempostack-query-frontend-6cbc9c484c-bdt6f             2/2     Running            0               9m39s
*Check the pod logs.
$ oc logs tempo-tempostack-compactor-7c9469b6b9-gfkqq
level=info ts=2023-06-01T05:55:29.618673142Z caller=main.go:197 msg="initialising OpenTracing tracer"
level=info ts=2023-06-01T05:55:29.675237393Z caller=main.go:114 msg="Starting Tempo" version="(version=HEAD-4f4282f, branch=HEAD, revision=4f4282f8)"
level=info ts=2023-06-01T05:55:29.681174588Z caller=server.go:323 http=[::]:3200 grpc=[::]:9095 msg="server listening on addresses"
level=warn ts=2023-06-01T05:55:29.681581106Z caller=util.go:181 msg="error getting interface" inf=en0 err="route ip+net: no such network interface"
level=info ts=2023-06-01T05:55:29.681606675Z caller=memberlist_client.go:437 msg="Using memberlist cluster label and node name" cluster_label= node=tempo-tempostack-compactor-7c9469b6b9-gfkqq-322b11f5
level=error ts=2023-06-01T05:55:29.681618581Z caller=main.go:117 msg="error running Tempo" err="failed to init module services error initialising module: compactor: failed to create compactor No address found for [eth0 en0]"

$ oc logs tempo-tempostack-ingester-0
level=info ts=2023-06-01T05:55:37.986122948Z caller=main.go:197 msg="initialising OpenTracing tracer"
level=info ts=2023-06-01T05:55:37.986710149Z caller=main.go:114 msg="Starting Tempo" version="(version=HEAD-4f4282f, branch=HEAD, revision=4f4282f8)"
level=info ts=2023-06-01T05:55:37.993210743Z caller=server.go:323 http=[::]:3200 grpc=[::]:9095 msg="server listening on addresses"
level=error ts=2023-06-01T05:55:37.993450954Z caller=main.go:117 msg="error running Tempo" err="failed to init module services error initialising module: ingester: failed to create ingester: NewLifecycler failed: No address found for [eth0]"
Additional Details:
$ oc describe pod tempo-tempostack-compactor-7c9469b6b9-gfkqq
Name:         tempo-tempostack-compactor-7c9469b6b9-gfkqq
Namespace:    openshift-operators
Priority:     0
Node:         worker-01.ikanse-108.qe.devcluster.openshift.com/2604:1380:4642:7e00::19
Start Time:   Thu, 01 Jun 2023 11:19:48 +0530
Labels:       app.kubernetes.io/component=compactor
              app.kubernetes.io/instance=tempostack
              app.kubernetes.io/managed-by=tempo-operator
              app.kubernetes.io/name=tempo
              pod-template-hash=7c9469b6b9
              tempo-gossip-member=true
Annotations:  k8s.ovn.org/pod-networks: {"default":{"ip_addresses":["fd01:0:0:6::2d/64"],"mac_address":"0a:58:c7:cf:f1:1f","gateway_ips":["fd01:0:0:6::1"],"ip_address":"fd01:0:0:...
              k8s.v1.cni.cncf.io/network-status: [{ "name": "ovn-kubernetes", "interface": "eth0", "ips": [ "fd01:0:0:6::2d" ], "mac": "0a:58:c7:cf:f1:1f", "default": true, "dns": {} }]
              openshift.io/scc: restricted-v2
              seccomp.security.alpha.kubernetes.io/pod: runtime/default
              tempo.grafana.com/config.hash: c5230ea44f30b111e8693b59fe53da367377ec8d967c126137101049ed8f7978
Status:       Running
IP:           fd01:0:0:6::2d
IPs:
  IP:           fd01:0:0:6::2d
Controlled By:  ReplicaSet/tempo-tempostack-compactor-7c9469b6b9
Containers:
  tempo:
    Container ID:  cri-o://f7314fa061d1d0b858e990a7f8b4043b9306244cad5cd3f804c42d61a7dec659
    Image:         registry.redhat.io/rhosdt/tempo-rhel8@sha256:2ec3a4feac7b282b9a489112662e2e9d080b085a624a72dfeece2cc2389680d4
    Image ID:      registry.redhat.io/rhosdt/tempo-rhel8@sha256:2ec3a4feac7b282b9a489112662e2e9d080b085a624a72dfeece2cc2389680d4
    Ports:         3200/TCP, 7946/TCP
    Host Ports:    0/TCP, 0/TCP
    Args:
      -target=compactor
      -config.file=/conf/tempo.yaml
      --storage.trace.s3.secret_key=$(S3_SECRET_KEY)
      --storage.trace.s3.access_key=$(S3_ACCESS_KEY)
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 01 Jun 2023 11:25:29 +0530
      Finished:     Thu, 01 Jun 2023 11:25:29 +0530
    Ready:          False
    Restart Count:  6
    Limits:
      cpu:     320m
      memory:  386547072
    Requests:
      cpu:     96m
      memory:  115964128
    Readiness:  http-get http://:http/ready delay=15s timeout=1s period=10s #success=1 #failure=3
    Environment:
      S3_SECRET_KEY:  <set to the key 'access_key_secret' in secret 'minio-secret'>  Optional: false
      S3_ACCESS_KEY:  <set to the key 'access_key_id' in secret 'minio-secret'>  Optional: false
    Mounts:
      /conf from tempo-conf (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-54f7p (ro)
      /var/tempo from tempo-tmp-storage (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  tempo-conf:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      tempo-tempostack
    Optional:  false
  tempo-tmp-storage:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-54f7p:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age                  From               Message
  ----     ------          ----                 ----               -------
  Normal   Scheduled       10m                  default-scheduler  Successfully assigned openshift-operators/tempo-tempostack-compactor-7c9469b6b9-gfkqq to worker-01.ikanse-108.qe.devcluster.openshift.com
  Normal   AddedInterface  10m                  multus             Add eth0 [fd01:0:0:6::2d/64] from ovn-kubernetes
  Normal   Pulled          8m58s (x5 over 10m)  kubelet            Container image "registry.redhat.io/rhosdt/tempo-rhel8@sha256:2ec3a4feac7b282b9a489112662e2e9d080b085a624a72dfeece2cc2389680d4" already present on machine
  Normal   Created         8m58s (x5 over 10m)  kubelet            Created container tempo
  Normal   Started         8m58s (x5 over 10m)  kubelet            Started container tempo
  Warning  BackOff         21s (x51 over 10m)   kubelet            Back-off restarting failed container tempo in pod tempo-tempostack-compactor-7c9469b6b9-gfkqq_openshift-operators(499c2eed-7254-449e-9ff6-3dd3f56f1859)
$ oc describe pod tempo-tempostack-ingester-0
Name:         tempo-tempostack-ingester-0
Namespace:    openshift-operators
Priority:     0
Node:         worker-02.ikanse-108.qe.devcluster.openshift.com/2604:1380:4642:7e00::15
Start Time:   Thu, 01 Jun 2023 11:19:48 +0530
Labels:       app.kubernetes.io/component=ingester
              app.kubernetes.io/instance=tempostack
              app.kubernetes.io/managed-by=tempo-operator
              app.kubernetes.io/name=tempo
              controller-revision-hash=tempo-tempostack-ingester-dc67cb949
              statefulset.kubernetes.io/pod-name=tempo-tempostack-ingester-0
              tempo-gossip-member=true
Annotations:  k8s.ovn.org/pod-networks: {"default":{"ip_addresses":["fd01:0:0:5::2c/64"],"mac_address":"0a:58:06:a4:53:1b","gateway_ips":["fd01:0:0:5::1"],"ip_address":"fd01:0:0:...
              k8s.v1.cni.cncf.io/network-status: [{ "name": "ovn-kubernetes", "interface": "eth0", "ips": [ "fd01:0:0:5::2c" ], "mac": "0a:58:06:a4:53:1b", "default": true, "dns": {} }]
              openshift.io/scc: restricted-v2
              seccomp.security.alpha.kubernetes.io/pod: runtime/default
              tempo.grafana.com/config.hash: c5230ea44f30b111e8693b59fe53da367377ec8d967c126137101049ed8f7978
Status:       Running
IP:           fd01:0:0:5::2c
IPs:
  IP:           fd01:0:0:5::2c
Controlled By:  StatefulSet/tempo-tempostack-ingester
Containers:
  tempo:
    Container ID:  cri-o://703dd2c42053601584e024dc3229e8f43d31402f2095a61daa40261eff9c4d7e
    Image:         registry.redhat.io/rhosdt/tempo-rhel8@sha256:2ec3a4feac7b282b9a489112662e2e9d080b085a624a72dfeece2cc2389680d4
    Image ID:      registry.redhat.io/rhosdt/tempo-rhel8@sha256:2ec3a4feac7b282b9a489112662e2e9d080b085a624a72dfeece2cc2389680d4
    Ports:         7946/TCP, 3200/TCP, 9095/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Args:
      -target=ingester
      -config.file=/conf/tempo.yaml
      --storage.trace.s3.secret_key=$(S3_SECRET_KEY)
      --storage.trace.s3.access_key=$(S3_ACCESS_KEY)
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 01 Jun 2023 11:25:37 +0530
      Finished:     Thu, 01 Jun 2023 11:25:37 +0530
    Ready:          False
    Restart Count:  6
    Limits:
      cpu:     760m
      memory:  1Gi
    Requests:
      cpu:     228m
      memory:  322122560
    Readiness:  http-get http://:http/ready delay=15s timeout=1s period=10s #success=1 #failure=3
    Environment:
      S3_SECRET_KEY:  <set to the key 'access_key_secret' in secret 'minio-secret'>  Optional: false
      S3_ACCESS_KEY:  <set to the key 'access_key_id' in secret 'minio-secret'>  Optional: false
    Mounts:
      /conf from tempo-conf (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xd68p (ro)
      /var/tempo from data (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-tempo-tempostack-ingester-0
    ReadOnly:   false
  tempo-conf:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      tempo-tempostack
    Optional:  false
  kube-api-access-xd68p:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age                  From               Message
  ----     ------          ----                 ----               -------
  Normal   Scheduled       10m                  default-scheduler  Successfully assigned openshift-operators/tempo-tempostack-ingester-0 to worker-02.ikanse-108.qe.devcluster.openshift.com
  Normal   AddedInterface  10m                  multus             Add eth0 [fd01:0:0:5::2c/64] from ovn-kubernetes
  Normal   Pulled          9m13s (x5 over 10m)  kubelet            Container image "registry.redhat.io/rhosdt/tempo-rhel8@sha256:2ec3a4feac7b282b9a489112662e2e9d080b085a624a72dfeece2cc2389680d4" already present on machine
  Normal   Created         9m13s (x5 over 10m)  kubelet            Created container tempo
  Normal   Started         9m12s (x5 over 10m)  kubelet            Started container tempo
  Warning  BackOff         34s (x53 over 10m)   kubelet            Back-off restarting failed container tempo in pod tempo-tempostack-ingester-0_openshift-operators(d004bc0e-4239-48e3-9468-3862ab292e66)
$ oc debug tempo-operator-controller-manager-76c49b54b7-zrxvz
Defaulting container name to kube-rbac-proxy.
Use 'oc describe pod/tempo-operator-controller-manager-76c49b54b7-zrxvz-debug -n openshift-operators' to see all of the containers in this pod.
Starting pod/tempo-operator-controller-manager-76c49b54b7-zrxvz-debug ...
Pod IP: fd01:0:0:6::33
If you don't see a command prompt, try pressing enter.
sh-4.4$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0@if62: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
    link/ether 0a:58:ee:9f:59:56 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fd01:0:0:6::33/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::858:eeff:fe9f:5956/64 scope link
       valid_lft forever preferred_lft forever
sh-4.4$