
TRACING-3226: TempoStack fails to deploy on a single-stack IPv6 OCP cluster.


Details

    • Project: Distributed Tracing
    • Type: Bug
    • Resolution: Done
    • Priority: Undefined
    • Fix Version/s: rhosdt-3.1
    • Affects Version/s: None
    • Component/s: Tempo
    • Labels: None
    • Sprint: Tracing Sprint # 241, Tracing Sprint # 243, Tracing Sprint # 244, Tracing Sprint # 245, Tracing Sprint # 246

    Description

      Description of the problem:

      When a TempoStack is created on an OCP cluster with an IPv6-only networking stack, the compactor and ingester pods go into CrashLoopBackOff with the following errors:

      $ oc logs tempo-tempostack-compactor-7c9469b6b9-gfkqq
      level=info ts=2023-06-01T05:55:29.618673142Z caller=main.go:197 msg="initialising OpenTracing tracer"
      level=info ts=2023-06-01T05:55:29.675237393Z caller=main.go:114 msg="Starting Tempo" version="(version=HEAD-4f4282f, branch=HEAD, revision=4f4282f8)"
      level=info ts=2023-06-01T05:55:29.681174588Z caller=server.go:323 http=[::]:3200 grpc=[::]:9095 msg="server listening on addresses"
      level=warn ts=2023-06-01T05:55:29.681581106Z caller=util.go:181 msg="error getting interface" inf=en0 err="route ip+net: no such network interface"
      level=info ts=2023-06-01T05:55:29.681606675Z caller=memberlist_client.go:437 msg="Using memberlist cluster label and node name" cluster_label= node=tempo-tempostack-compactor-7c9469b6b9-gfkqq-322b11f5
      level=error ts=2023-06-01T05:55:29.681618581Z caller=main.go:117 msg="error running Tempo" err="failed to init module services error initialising module: compactor: failed to create compactor No address found for [eth0 en0]"
      $ oc logs tempo-tempostack-ingester-0
      level=info ts=2023-06-01T05:55:37.986122948Z caller=main.go:197 msg="initialising OpenTracing tracer"
      level=info ts=2023-06-01T05:55:37.986710149Z caller=main.go:114 msg="Starting Tempo" version="(version=HEAD-4f4282f, branch=HEAD, revision=4f4282f8)"
      level=info ts=2023-06-01T05:55:37.993210743Z caller=server.go:323 http=[::]:3200 grpc=[::]:9095 msg="server listening on addresses"
      level=error ts=2023-06-01T05:55:37.993450954Z caller=main.go:117 msg="error running Tempo" err="failed to init module services error initialising module: ingester: failed to create ingester: NewLifecycler failed: No address found for [eth0]"
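
      Both failures come from the hash-ring address lookup: the compactor ring and the ingester lifecycler resolve the address they advertise from a default interface list ([eth0 en0] and [eth0] respectively), and on this build the lookup evidently accepts only IPv4 addresses, so nothing usable exists on an IPv6-only pod network. A minimal sketch of a workaround, assuming Tempo honors the usual dskit/Cortex ring keys (the operator owns the generated tempo.yaml, so these key names and the POD_IP downward-API variable are assumptions, not a verified fix):

      # Illustrative tempo.yaml fragment only, NOT the operator-generated
      # config: advertise the pod IP directly instead of relying on the
      # (IPv4-only) interface lookup. Assumes -config.expand-env is enabled
      # and a downward-API POD_IP env var is injected into the pods.
      ingester:
        lifecycler:
          address: ${POD_IP}        # e.g. fd01:0:0:5::2c on this cluster
          interface_names: [eth0]   # the default list that fails today
      compactor:
        ring:
          instance_addr: ${POD_IP}  # assumed key name, mirroring other Grafana components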

      Version of components:

      opentelemetry-operator.v0.74.0-5

      tempo-operator.v0.1.0-6

      OCP Server Version: 4.14.0-0.nightly-2023-05-31-080250

      Cluster profile: https://gitlab.cee.redhat.com/aosqe/flexy-templates/-/blob/master/functionality-testing/aos-4_14/upi-on-baremetal/versioned-installer-packet-single_stack-http_proxy

      How Reproducible:

      Always

      Steps to reproduce the issue:

      *Deploy an OCP cluster with a single-stack IPv6 network.

      *Install the Distributed Tracing data collection (OpenTelemetry) and Tempo Operators.

      *Create a MinIO instance with the following config.

      $ cat minio.yaml 
      apiVersion: v1
      kind: Namespace
      metadata:
        name: minio
      ---
      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        labels:
          app.kubernetes.io/name: minio
        name: minio
        namespace: minio
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 2Gi
      ---
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: minio
        namespace: minio
      spec:
        selector:
          matchLabels:
            app.kubernetes.io/name: minio
        strategy:
          type: Recreate
        template:
          metadata:
            labels:
              app.kubernetes.io/name: minio
          spec:
            containers:
              - command:
                  - /bin/sh
                  - -c
                  - |
                    mkdir -p /storage/tempo && \
                    minio server /storage
                env:
                  - name: MINIO_ACCESS_KEY
                    value: tempo
                  - name: MINIO_SECRET_KEY
                    value: supersecret
                image: minio/minio
                name: minio
                ports:
                  - containerPort: 9000
                volumeMounts:
                  - mountPath: /storage
                    name: storage
            volumes:
              - name: storage
                persistentVolumeClaim:
                  claimName: minio
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: minio
        namespace: minio
      spec:
        ports:
          - port: 9000
            protocol: TCP
            targetPort: 9000
        selector:
          app.kubernetes.io/name: minio
        type: ClusterIP
      ---
      apiVersion: v1
      kind: Secret
      metadata:
        name: minio-test
        namespace: minio
      stringData:
        endpoint: http://minio.minio.svc:9000
        bucket: tempo
        access_key_id: tempo
        access_key_secret: supersecret
      type: Opaque
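
      Apply the manifest and wait for the rollout, e.g.:

      $ oc apply -f minio.yaml
      $ oc -n minio rollout status deployment/minio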

      *Deploy the TempoStack with the following config.

      $ oc project openshift-operators
      $ cat minio-tempostack.yaml 
      ---
      apiVersion: v1
      kind: Secret
      metadata:
        name: minio-secret
        namespace: openshift-operators
      stringData:
        endpoint: http://minio.minio.svc:9000
        bucket: tempo
        access_key_id: tempo
        access_key_secret: supersecret
      type: Opaque
      ---
      apiVersion: tempo.grafana.com/v1alpha1
      kind: TempoStack
      metadata:
        name: tempostack
        namespace: openshift-operators
      spec:
        observability:
          tracing:
            jaeger_agent_endpoint: '127.0.0.1:6831'
        resources:
          total:
            limits:
              cpu: 2000m
              memory: 2Gi
        template:
          queryFrontend:
            jaegerQuery:
              enabled: true
              ingress:
                route:
                  termination: edge
                type: route
        storage:
          secret:
            type: s3
            name: minio-secret
        storageSize: 20Gi
        storageClassName: nfs
      
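      Apply it in the same way, e.g.:

      $ oc apply -f minio-tempostack.yaml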

      *Check that the TempoStack compactor and ingester pods are in CrashLoopBackOff.

      $ oc get pods
      NAME                                                         READY   STATUS             RESTARTS        AGE
      opentelemetry-operator-controller-manager-78c57648f6-mh4s6   2/2     Running            0               132m
      tempo-operator-controller-manager-76c49b54b7-zrxvz           2/2     Running            0               132m
      tempo-tempostack-compactor-7c9469b6b9-gfkqq                  0/1     CrashLoopBackOff   6 (3m58s ago)   9m39s
      tempo-tempostack-distributor-5cb86947b6-84g6b                1/1     Running            0               9m39s
      tempo-tempostack-ingester-0                                  0/1     CrashLoopBackOff   6 (3m50s ago)   9m39s
      tempo-tempostack-querier-6d5f45d987-n7wp8                    1/1     Running            0               9m39s
      tempo-tempostack-query-frontend-6cbc9c484c-bdt6f             2/2     Running            0               9m39s

      *Check the pod logs.

      $ oc logs tempo-tempostack-compactor-7c9469b6b9-gfkqq
      level=info ts=2023-06-01T05:55:29.618673142Z caller=main.go:197 msg="initialising OpenTracing tracer"
      level=info ts=2023-06-01T05:55:29.675237393Z caller=main.go:114 msg="Starting Tempo" version="(version=HEAD-4f4282f, branch=HEAD, revision=4f4282f8)"
      level=info ts=2023-06-01T05:55:29.681174588Z caller=server.go:323 http=[::]:3200 grpc=[::]:9095 msg="server listening on addresses"
      level=warn ts=2023-06-01T05:55:29.681581106Z caller=util.go:181 msg="error getting interface" inf=en0 err="route ip+net: no such network interface"
      level=info ts=2023-06-01T05:55:29.681606675Z caller=memberlist_client.go:437 msg="Using memberlist cluster label and node name" cluster_label= node=tempo-tempostack-compactor-7c9469b6b9-gfkqq-322b11f5
      level=error ts=2023-06-01T05:55:29.681618581Z caller=main.go:117 msg="error running Tempo" err="failed to init module services error initialising module: compactor: failed to create compactor No address found for [eth0 en0]"
      $ oc logs tempo-tempostack-ingester-0
      level=info ts=2023-06-01T05:55:37.986122948Z caller=main.go:197 msg="initialising OpenTracing tracer"
      level=info ts=2023-06-01T05:55:37.986710149Z caller=main.go:114 msg="Starting Tempo" version="(version=HEAD-4f4282f, branch=HEAD, revision=4f4282f8)"
      level=info ts=2023-06-01T05:55:37.993210743Z caller=server.go:323 http=[::]:3200 grpc=[::]:9095 msg="server listening on addresses"
      level=error ts=2023-06-01T05:55:37.993450954Z caller=main.go:117 msg="error running Tempo" err="failed to init module services error initialising module: ingester: failed to create ingester: NewLifecycler failed: No address found for [eth0]"
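
      To see what the operator actually rendered for the ring addresses, the generated config (ConfigMap tempo-tempostack, mounted at /conf/tempo.yaml per the describe output below) can be inspected, for example:

      $ oc -n openshift-operators get configmap tempo-tempostack -o yaml | grep -E 'address|interface|inet6'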

      Additional Details:

      $ oc describe pod tempo-tempostack-compactor-7c9469b6b9-gfkqq
      Name:         tempo-tempostack-compactor-7c9469b6b9-gfkqq
      Namespace:    openshift-operators
      Priority:     0
      Node:         worker-01.ikanse-108.qe.devcluster.openshift.com/2604:1380:4642:7e00::19
      Start Time:   Thu, 01 Jun 2023 11:19:48 +0530
      Labels:       app.kubernetes.io/component=compactor
                    app.kubernetes.io/instance=tempostack
                    app.kubernetes.io/managed-by=tempo-operator
                    app.kubernetes.io/name=tempo
                    pod-template-hash=7c9469b6b9
                    tempo-gossip-member=true
      Annotations:  k8s.ovn.org/pod-networks:
                      {"default":{"ip_addresses":["fd01:0:0:6::2d/64"],"mac_address":"0a:58:c7:cf:f1:1f","gateway_ips":["fd01:0:0:6::1"],"ip_address":"fd01:0:0:...
                    k8s.v1.cni.cncf.io/network-status:
                      [{
                          "name": "ovn-kubernetes",
                          "interface": "eth0",
                          "ips": [
                              "fd01:0:0:6::2d"
                          ],
                          "mac": "0a:58:c7:cf:f1:1f",
                          "default": true,
                          "dns": {}
                      }]
                    openshift.io/scc: restricted-v2
                    seccomp.security.alpha.kubernetes.io/pod: runtime/default
                    tempo.grafana.com/config.hash: c5230ea44f30b111e8693b59fe53da367377ec8d967c126137101049ed8f7978
      Status:       Running
      IP:           fd01:0:0:6::2d
      IPs:
        IP:           fd01:0:0:6::2d
      Controlled By:  ReplicaSet/tempo-tempostack-compactor-7c9469b6b9
      Containers:
        tempo:
          Container ID:  cri-o://f7314fa061d1d0b858e990a7f8b4043b9306244cad5cd3f804c42d61a7dec659
          Image:         registry.redhat.io/rhosdt/tempo-rhel8@sha256:2ec3a4feac7b282b9a489112662e2e9d080b085a624a72dfeece2cc2389680d4
          Image ID:      registry.redhat.io/rhosdt/tempo-rhel8@sha256:2ec3a4feac7b282b9a489112662e2e9d080b085a624a72dfeece2cc2389680d4
          Ports:         3200/TCP, 7946/TCP
          Host Ports:    0/TCP, 0/TCP
          Args:
            -target=compactor
            -config.file=/conf/tempo.yaml
            --storage.trace.s3.secret_key=$(S3_SECRET_KEY)
            --storage.trace.s3.access_key=$(S3_ACCESS_KEY)
          State:          Waiting
            Reason:       CrashLoopBackOff
          Last State:     Terminated
            Reason:       Error
            Exit Code:    1
            Started:      Thu, 01 Jun 2023 11:25:29 +0530
            Finished:     Thu, 01 Jun 2023 11:25:29 +0530
          Ready:          False
          Restart Count:  6
          Limits:
            cpu:     320m
            memory:  386547072
          Requests:
            cpu:      96m
            memory:   115964128
          Readiness:  http-get http://:http/ready delay=15s timeout=1s period=10s #success=1 #failure=3
          Environment:
            S3_SECRET_KEY:  <set to the key 'access_key_secret' in secret 'minio-secret'>  Optional: false
            S3_ACCESS_KEY:  <set to the key 'access_key_id' in secret 'minio-secret'>      Optional: false
          Mounts:
            /conf from tempo-conf (ro)
            /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-54f7p (ro)
            /var/tempo from tempo-tmp-storage (rw)
      Conditions:
        Type              Status
        Initialized       True 
        Ready             False 
        ContainersReady   False 
        PodScheduled      True 
      Volumes:
        tempo-conf:
          Type:      ConfigMap (a volume populated by a ConfigMap)
          Name:      tempo-tempostack
          Optional:  false
        tempo-tmp-storage:
          Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
          Medium:     
          SizeLimit:  <unset>
        kube-api-access-54f7p:
          Type:                    Projected (a volume that contains injected data from multiple sources)
          TokenExpirationSeconds:  3607
          ConfigMapName:           kube-root-ca.crt
          ConfigMapOptional:       <nil>
          DownwardAPI:             true
          ConfigMapName:           openshift-service-ca.crt
          ConfigMapOptional:       <nil>
      QoS Class:                   Burstable
      Node-Selectors:              <none>
      Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                                   node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                   node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
      Events:
        Type     Reason          Age                  From               Message
        ----     ------          ----                 ----               -------
        Normal   Scheduled       10m                  default-scheduler  Successfully assigned openshift-operators/tempo-tempostack-compactor-7c9469b6b9-gfkqq to worker-01.ikanse-108.qe.devcluster.openshift.com
        Normal   AddedInterface  10m                  multus             Add eth0 [fd01:0:0:6::2d/64] from ovn-kubernetes
        Normal   Pulled          8m58s (x5 over 10m)  kubelet            Container image "registry.redhat.io/rhosdt/tempo-rhel8@sha256:2ec3a4feac7b282b9a489112662e2e9d080b085a624a72dfeece2cc2389680d4" already present on machine
        Normal   Created         8m58s (x5 over 10m)  kubelet            Created container tempo
        Normal   Started         8m58s (x5 over 10m)  kubelet            Started container tempo
        Warning  BackOff         21s (x51 over 10m)   kubelet            Back-off restarting failed container tempo in pod tempo-tempostack-compactor-7c9469b6b9-gfkqq_openshift-operators(499c2eed-7254-449e-9ff6-3dd3f56f1859)
       
      
      $ oc describe pod tempo-tempostack-ingester-0
      Name:         tempo-tempostack-ingester-0
      Namespace:    openshift-operators
      Priority:     0
      Node:         worker-02.ikanse-108.qe.devcluster.openshift.com/2604:1380:4642:7e00::15
      Start Time:   Thu, 01 Jun 2023 11:19:48 +0530
      Labels:       app.kubernetes.io/component=ingester
                    app.kubernetes.io/instance=tempostack
                    app.kubernetes.io/managed-by=tempo-operator
                    app.kubernetes.io/name=tempo
                    controller-revision-hash=tempo-tempostack-ingester-dc67cb949
                    statefulset.kubernetes.io/pod-name=tempo-tempostack-ingester-0
                    tempo-gossip-member=true
      Annotations:  k8s.ovn.org/pod-networks:
                      {"default":{"ip_addresses":["fd01:0:0:5::2c/64"],"mac_address":"0a:58:06:a4:53:1b","gateway_ips":["fd01:0:0:5::1"],"ip_address":"fd01:0:0:...
                    k8s.v1.cni.cncf.io/network-status:
                      [{
                          "name": "ovn-kubernetes",
                          "interface": "eth0",
                          "ips": [
                              "fd01:0:0:5::2c"
                          ],
                          "mac": "0a:58:06:a4:53:1b",
                          "default": true,
                          "dns": {}
                      }]
                    openshift.io/scc: restricted-v2
                    seccomp.security.alpha.kubernetes.io/pod: runtime/default
                    tempo.grafana.com/config.hash: c5230ea44f30b111e8693b59fe53da367377ec8d967c126137101049ed8f7978
      Status:       Running
      IP:           fd01:0:0:5::2c
      IPs:
        IP:           fd01:0:0:5::2c
      Controlled By:  StatefulSet/tempo-tempostack-ingester
      Containers:
        tempo:
          Container ID:  cri-o://703dd2c42053601584e024dc3229e8f43d31402f2095a61daa40261eff9c4d7e
          Image:         registry.redhat.io/rhosdt/tempo-rhel8@sha256:2ec3a4feac7b282b9a489112662e2e9d080b085a624a72dfeece2cc2389680d4
          Image ID:      registry.redhat.io/rhosdt/tempo-rhel8@sha256:2ec3a4feac7b282b9a489112662e2e9d080b085a624a72dfeece2cc2389680d4
          Ports:         7946/TCP, 3200/TCP, 9095/TCP
          Host Ports:    0/TCP, 0/TCP, 0/TCP
          Args:
            -target=ingester
            -config.file=/conf/tempo.yaml
            --storage.trace.s3.secret_key=$(S3_SECRET_KEY)
            --storage.trace.s3.access_key=$(S3_ACCESS_KEY)
          State:          Waiting
            Reason:       CrashLoopBackOff
          Last State:     Terminated
            Reason:       Error
            Exit Code:    1
            Started:      Thu, 01 Jun 2023 11:25:37 +0530
            Finished:     Thu, 01 Jun 2023 11:25:37 +0530
          Ready:          False
          Restart Count:  6
          Limits:
            cpu:     760m
            memory:  1Gi
          Requests:
            cpu:      228m
            memory:   322122560
          Readiness:  http-get http://:http/ready delay=15s timeout=1s period=10s #success=1 #failure=3
          Environment:
            S3_SECRET_KEY:  <set to the key 'access_key_secret' in secret 'minio-secret'>  Optional: false
            S3_ACCESS_KEY:  <set to the key 'access_key_id' in secret 'minio-secret'>      Optional: false
          Mounts:
            /conf from tempo-conf (ro)
            /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xd68p (ro)
            /var/tempo from data (rw)
      Conditions:
        Type              Status
        Initialized       True 
        Ready             False 
        ContainersReady   False 
        PodScheduled      True 
      Volumes:
        data:
          Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
          ClaimName:  data-tempo-tempostack-ingester-0
          ReadOnly:   false
        tempo-conf:
          Type:      ConfigMap (a volume populated by a ConfigMap)
          Name:      tempo-tempostack
          Optional:  false
        kube-api-access-xd68p:
          Type:                    Projected (a volume that contains injected data from multiple sources)
          TokenExpirationSeconds:  3607
          ConfigMapName:           kube-root-ca.crt
          ConfigMapOptional:       <nil>
          DownwardAPI:             true
          ConfigMapName:           openshift-service-ca.crt
          ConfigMapOptional:       <nil>
      QoS Class:                   Burstable
      Node-Selectors:              <none>
      Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                                   node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                   node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
      Events:
        Type     Reason          Age                  From               Message
        ----     ------          ----                 ----               -------
        Normal   Scheduled       10m                  default-scheduler  Successfully assigned openshift-operators/tempo-tempostack-ingester-0 to worker-02.ikanse-108.qe.devcluster.openshift.com
        Normal   AddedInterface  10m                  multus             Add eth0 [fd01:0:0:5::2c/64] from ovn-kubernetes
        Normal   Pulled          9m13s (x5 over 10m)  kubelet            Container image "registry.redhat.io/rhosdt/tempo-rhel8@sha256:2ec3a4feac7b282b9a489112662e2e9d080b085a624a72dfeece2cc2389680d4" already present on machine
        Normal   Created         9m13s (x5 over 10m)  kubelet            Created container tempo
        Normal   Started         9m12s (x5 over 10m)  kubelet            Started container tempo
        Warning  BackOff         34s (x53 over 10m)   kubelet            Back-off restarting failed container tempo in pod tempo-tempostack-ingester-0_openshift-operators(d004bc0e-4239-48e3-9468-3862ab292e66)
      $ oc debug tempo-operator-controller-manager-76c49b54b7-zrxvz
      Defaulting container name to kube-rbac-proxy.
      Use 'oc describe pod/tempo-operator-controller-manager-76c49b54b7-zrxvz-debug -n openshift-operators' to see all of the containers in this pod.
      Starting pod/tempo-operator-controller-manager-76c49b54b7-zrxvz-debug ...
      Pod IP: fd01:0:0:6::33
      If you don't see a command prompt, try pressing enter.
      sh-4.4$ ip a
      1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
          link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
          inet 127.0.0.1/8 scope host lo
             valid_lft forever preferred_lft forever
          inet6 ::1/128 scope host 
             valid_lft forever preferred_lft forever
      2: eth0@if62: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default 
          link/ether 0a:58:ee:9f:59:56 brd ff:ff:ff:ff:ff:ff link-netnsid 0
          inet6 fd01:0:0:6::33/64 scope global 
             valid_lft forever preferred_lft forever
          inet6 fe80::858:eeff:fe9f:5956/64 scope link 
             valid_lft forever preferred_lft forever
      sh-4.4$
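
      The debug shell confirms the pod network is IPv6-only: eth0 carries a global fd01:: address and a link-local fe80:: address but no IPv4 address, which is consistent with the "No address found for [eth0 en0]" and "No address found for [eth0]" failures if the ring's interface lookup considers only IPv4 addresses.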

People

    • Assignee: Andreas Gerstmayr (agerstma@redhat.com)
    • Reporter: Ishwar Kanse (rhn-support-ikanse)