Distributed Tracing / TRACING-3630

[Upstream] The Jaeger query pod fails due to an oauth-proxy with an invalid configuration.


    • Type: Bug
    • Resolution: Done
    • Jaeger
    • Tracing Sprint # 243

      Version of components:

      jaeger-operator.v1.49.0

      Description of the problem:

      Almost all of our Jaeger e2e test cases fail because the oauth-proxy container in the query pod exits with an "Invalid configuration" error.

      Steps to reproduce the issue:
      *Build the operator and bundle images from the main branch of the jaeger-operator repo.

      *Install the bundle.

      oc new-project kuttl-test-huge-porpoise

      *Create the jaeger instance.
      $ cat bug.yaml 

      apiVersion: jaegertracing.io/v1
      kind: Jaeger
      metadata:
        labels:
          jaegertracing.io/operated-by: jaeger-operator.jaeger-operator
        name: my-jaeger
        namespace: kuttl-test-huge-porpoise
      spec:
        agent:
          config: {}
          options: {}
          resources: {}
        allInOne:
          config: {}
          metricsStorage: {}
          options: {}
          resources: {}
        collector:
          config: {}
          kafkaSecretName: ""
          options: {}
          resources: {}
        ingester:
          config: {}
          kafkaSecretName: ""
          options: {}
          resources: {}
        ingress:
          openshift:
            delegateUrls: '{"/":{"namespace": "kuttl-test-huge-porpoise", "resource": "pods",
              "verb": "get"}}'
            sar: '{"namespace": "kuttl-test-huge-porpoise", "resource": "pods", "verb": "get"}'
          options:
            pass-access-token: true
            pass-basic-auth: false
            pass-user-bearer-token: true
            scope: user:info user:check-access
          resources: {}
          security: oauth-proxy
        query:
          metricsStorage: {}
          options:
            query.bearer-token-propagation: true
          resources: {}
        resources: {}
        sampling:
          options: {}
        storage:
          cassandraCreateSchema: {}
          dependencies:
            resources: {}
            schedule: 55 23 * * *
          elasticsearch:
            name: elasticsearch
            nodeCount: 1
            redundancyPolicy: ZeroRedundancy
            resources:
              limits:
                memory: 2Gi
              requests:
                cpu: 200m
                memory: 2Gi
            storage: {}
          esIndexCleaner:
            enabled: true
            numberOfDays: 7
            resources: {}
            schedule: 55 23 * * *
          esRollover:
            resources: {}
            schedule: 0 0 * * *
          grpcPlugin: {}
          options: {}
          type: elasticsearch
        strategy: production
        ui:
          options:
            dependencies:
              menuEnabled: false
            menu:
            - anchorTarget: _self
              label: Log Out
              url: /oauth/sign_in
      

      *Check that the query pods are failing.

      $ oc get pods
      NAME                                                              READY   STATUS             RESTARTS       AGE
      elasticsearch-cdm-kuttltesthugeporpoisemyjaeger-1-7c7f9456cnhv4   2/2     Running            0              5m20s
      my-jaeger-collector-7cff5c49-b4rf7                                1/1     Running            0              4m48s
      my-jaeger-query-8645bb4458-8cgwt                                  2/3     CrashLoopBackOff   5 (115s ago)   4m48s
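
The check above can also be scripted. The following is a minimal sketch, assuming the output of `oc get pods -o json` has already been parsed into a dictionary. The function name `crashlooping_containers` is hypothetical, and the sample data is hand-written to mirror the listing above in shape only; it is not captured cluster output.

```python
import json


def crashlooping_containers(pod_list: dict) -> list:
    """Return (pod name, container name) pairs whose container is waiting in CrashLoopBackOff."""
    found = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        for status in pod.get("status", {}).get("containerStatuses", []):
            # A container has exactly one of "running", "waiting", or "terminated" set.
            waiting = status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                found.append((pod_name, status["name"]))
    return found


# Illustrative sample only (assumption): one query pod with three containers.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "my-jaeger-query-8645bb4458-8cgwt"},
      "status": {
        "containerStatuses": [
          {"name": "jaeger-query", "state": {"running": {}}},
          {"name": "oauth-proxy", "state": {"waiting": {"reason": "CrashLoopBackOff"}}},
          {"name": "jaeger-agent", "state": {"running": {}}}
        ]
      }
    }
  ]
}
""")

print(crashlooping_containers(sample))
```

In a real run, the dictionary could come from `json.loads` applied to the output of `oc get pods -n kuttl-test-huge-porpoise -o json`.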

      *Describe the query pod and inspect the oauth-proxy container.

      $ oc describe pod my-jaeger-query-bd4fb67f9-r4svg

      Name:             my-jaeger-query-bd4fb67f9-r4svg
      Namespace:        kuttl-test-huge-porpoise
      Priority:         0
      Service Account:  my-jaeger-ui-proxy
      Node:             ip-10-0-147-24.ec2.internal/10.0.147.24
      Start Time:       Thu, 12 Oct 2023 13:45:28 +0530
      Labels:           app=jaeger
                        app.kubernetes.io/component=query
                        app.kubernetes.io/instance=my-jaeger
                        app.kubernetes.io/managed-by=jaeger-operator
                        app.kubernetes.io/name=my-jaeger-query
                        app.kubernetes.io/part-of=jaeger
                        pod-template-hash=bd4fb67f9
      Annotations:      k8s.v1.cni.cncf.io/network-status:
                          [{
                              "name": "openshift-sdn",
                              "interface": "eth0",
                              "ips": [
                                  "10.129.2.33"
                              ],
                              "default": true,
                              "dns": {}
                          }]
                        linkerd.io/inject: disabled
                        openshift.io/scc: restricted-v2
                        prometheus.io/port: 16687
                        prometheus.io/scrape: true
                        seccomp.security.alpha.kubernetes.io/pod: runtime/default
                        sidecar.istio.io/inject: false
                        sidecar.jaegertracing.io/inject: my-jaeger
      Status:           Running
      IP:               10.129.2.33
      IPs:
        IP:           10.129.2.33
      Controlled By:  ReplicaSet/my-jaeger-query-bd4fb67f9
      Containers:
        jaeger-query:
          Container ID:  cri-o://bea7edf790fbb9ebe2ac2386d306223ad91aacc55dd244ab29f71b4ad08c63f1
          Image:         jaegertracing/jaeger-query:1.49.0
          Image ID:      docker.io/jaegertracing/jaeger-query@sha256:566a13a909de4189a9f71128a11e85e4d904b8f131ed52f039beaa58e5a6ab00
          Ports:         16685/TCP, 16686/TCP, 16687/TCP
          Host Ports:    0/TCP, 0/TCP, 0/TCP
          Args:
            --query.bearer-token-propagation=true
            --query.ui-config=/etc/config/ui.json
            --es.server-urls=https://elasticsearch.kuttl-test-huge-porpoise.svc.cluster.local:9200
            --es.tls.enabled=true
            --es.tls.ca=/certs/ca
            --es.tls.cert=/certs/cert
            --es.tls.key=/certs/key
            --es.timeout=15s
            --es.num-shards=1
            --es.num-replicas=0
          State:          Running
            Started:      Thu, 12 Oct 2023 13:45:30 +0530
          Ready:          True
          Restart Count:  0
          Liveness:       http-get http://:16687/ delay=5s timeout=1s period=15s #success=1 #failure=5
          Readiness:      http-get http://:16687/ delay=1s timeout=1s period=10s #success=1 #failure=3
          Environment:
            SPAN_STORAGE_TYPE:     elasticsearch
            METRICS_STORAGE_TYPE:  
            JAEGER_DISABLED:       false
            JAEGER_SERVICE_NAME:   my-jaeger.kuttl-test-huge-porpoise
            JAEGER_PROPAGATION:    jaeger,b3,w3c
          Mounts:
            /certs from certs (ro)
            /etc/config from my-jaeger-ui-configuration-volume (ro)
            /etc/pki/ca-trust/extracted/pem from my-jaeger-trusted-ca (ro)
            /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dxnqm (ro)
        oauth-proxy:
          Container ID:  cri-o://26f75699e28df96208db79f977680a523aa7a1149c278ca1a2e0c084a0aeaf6e
          Image:         quay.io/openshift/origin-oauth-proxy:4.12
          Image ID:      quay.io/openshift/origin-oauth-proxy@sha256:b6536bfcfaf30a6425d589facd672bae3245f933b2a7399bda3f12e393bd671b
          Port:          8443/TCP
          Host Port:     0/TCP
          Args:
            --cookie-secret=OHlg3zWHYfP1xUsz
            --https-address=:8443
            --openshift-delegate-urls={"/":{"namespace": "kuttl-test-huge-porpoise", "resource": "pods", "verb": "get"}}
            --openshift-sar={"namespace": "kuttl-test-huge-porpoise", "resource": "pods", "verb": "get"}
            --openshift-service-account=my-jaeger-ui-proxy
            --pass-access-token=true
            --pass-basic-auth=false
            --pass-user-bearer-token=true
            --provider=openshift
            --scope=user:info user:check-access
            --tls-cert=/etc/tls/private/tls.crt
            --tls-key=/etc/tls/private/tls.key
            --upstream=http://localhost:16686
          State:          Waiting
            Reason:       CrashLoopBackOff
          Last State:     Terminated
            Reason:       Error
            Exit Code:    1
            Started:      Thu, 12 Oct 2023 13:46:11 +0530
            Finished:     Thu, 12 Oct 2023 13:46:11 +0530
          Ready:          False
          Restart Count:  3
          Environment:
            JAEGER_SERVICE_NAME:  my-jaeger.kuttl-test-huge-porpoise
            JAEGER_PROPAGATION:   jaeger,b3,w3c
          Mounts:
            /etc/pki/ca-trust/extracted/pem from my-jaeger-trusted-ca (ro)
            /etc/tls/private from my-jaeger-ui-oauth-proxy-tls (rw)
            /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dxnqm (ro)
        jaeger-agent:
          Container ID:  cri-o://150a291b8894c2609c0aecb82c2c6faa26c9d71d324f4f89ea12c92c15079e9d
          Image:         jaegertracing/jaeger-agent:1.49.0
          Image ID:      docker.io/jaegertracing/jaeger-agent@sha256:8b7ef48117ceb0c76210dbb19238000015e4017e53c8bd2bb3811a1dde0a777d
          Ports:         5775/UDP, 5778/TCP, 6831/UDP, 6832/UDP, 14271/TCP
          Host Ports:    0/UDP, 0/TCP, 0/UDP, 0/UDP, 0/TCP
          Args:
            --agent.tags=cluster=undefined,deployment.name=my-jaeger-query,host.ip=${HOST_IP:},pod.name=${POD_NAME:},pod.namespace=kuttl-test-huge-porpoise
            --reporter.grpc.host-port=dns:///my-jaeger-collector-headless.kuttl-test-huge-porpoise.svc:14250
            --reporter.grpc.tls.ca=/etc/pki/ca-trust/source/service-ca/service-ca.crt
            --reporter.grpc.tls.enabled=true
          State:          Running
            Started:      Thu, 12 Oct 2023 13:45:30 +0530
          Ready:          True
          Restart Count:  0
          Liveness:       http-get http://:14271/ delay=5s timeout=1s period=15s #success=1 #failure=5
          Readiness:      http-get http://:14271/ delay=1s timeout=1s period=10s #success=1 #failure=3
          Environment:
            POD_NAME:  my-jaeger-query-bd4fb67f9-r4svg (v1:metadata.name)
            HOST_IP:    (v1:status.hostIP)
          Mounts:
            /etc/pki/ca-trust/extracted/pem from my-jaeger-trusted-ca (ro)
            /etc/pki/ca-trust/source/service-ca from my-jaeger-service-ca (ro)
            /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dxnqm (ro)
      Conditions:
        Type              Status
        Initialized       True 
        Ready             False 
        ContainersReady   False 
        PodScheduled      True 
      Volumes:
        my-jaeger-ui-configuration-volume:
          Type:      ConfigMap (a volume populated by a ConfigMap)
          Name:      my-jaeger-ui-configuration
          Optional:  false
        my-jaeger-trusted-ca:
          Type:      ConfigMap (a volume populated by a ConfigMap)
          Name:      my-jaeger-trusted-ca
          Optional:  false
        my-jaeger-ui-oauth-proxy-tls:
          Type:        Secret (a volume populated by a Secret)
          SecretName:  my-jaeger-ui-oauth-proxy-tls
          Optional:    false
        certs:
          Type:        Secret (a volume populated by a Secret)
          SecretName:  my-jaeger-jaeger-elasticsearch
          Optional:    false
        my-jaeger-service-ca:
          Type:      ConfigMap (a volume populated by a ConfigMap)
          Name:      my-jaeger-service-ca
          Optional:  false
        kube-api-access-dxnqm:
          Type:                    Projected (a volume that contains injected data from multiple sources)
          TokenExpirationSeconds:  3607
          ConfigMapName:           kube-root-ca.crt
          ConfigMapOptional:       <nil>
          DownwardAPI:             true
          ConfigMapName:           openshift-service-ca.crt
          ConfigMapOptional:       <nil>
      QoS Class:                   BestEffort
      Node-Selectors:              <none>
      Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                   node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
      Events:
        Type     Reason          Age                From               Message
        ----     ------          ----               ----               -------
        Normal   Scheduled       65s                default-scheduler  Successfully assigned kuttl-test-huge-porpoise/my-jaeger-query-bd4fb67f9-r4svg to ip-10-0-147-24.ec2.internal
        Normal   AddedInterface  64s                multus             Add eth0 [10.129.2.33/23] from openshift-sdn
        Normal   Pulled          64s                kubelet            Container image "jaegertracing/jaeger-query:1.49.0" already present on machine
        Normal   Created         63s                kubelet            Created container jaeger-query
        Normal   Started         63s                kubelet            Started container jaeger-query
        Normal   Pulled          63s                kubelet            Container image "jaegertracing/jaeger-agent:1.49.0" already present on machine
        Normal   Created         63s                kubelet            Created container jaeger-agent
        Normal   Started         63s                kubelet            Started container jaeger-agent
        Normal   Pulled          22s (x4 over 63s)  kubelet            Container image "quay.io/openshift/origin-oauth-proxy:4.12" already present on machine
        Normal   Created         22s (x4 over 63s)  kubelet            Created container oauth-proxy
        Normal   Started         22s (x4 over 63s)  kubelet            Started container oauth-proxy
        Warning  BackOff         10s (x6 over 61s)  kubelet            Back-off restarting failed container oauth-proxy in pod my-jaeger-query-bd4fb67f9-r4svg_kuttl-test-huge-porpoise(479472b4-6d07-4028-8706-d37f88269642)
      
      $ oc logs my-jaeger-query-bd4fb67f9-r4svg -c oauth-proxy
      2023/10/12 08:18:29 provider.go:129: Defaulting client-id to system:serviceaccount:kuttl-test-huge-porpoise:my-jaeger-ui-proxy
      2023/10/12 08:18:29 provider.go:134: Defaulting client-secret to service account token /var/run/secrets/kubernetes.io/serviceaccount/token
      2023/10/12 08:18:29 provider.go:358: Delegation of authentication and authorization to OpenShift is enabled for bearer tokens and client certificates.
      2023/10/12 08:18:29 main.go:140: Invalid configuration:
        cookie_secret must be 16, 24, or 32 bytes to create an AES cipher when pass_access_token == true or cookie_refresh != 0, but is 12 bytes. note: cookie secret was base64 decoded from "OHlg3zWHYfP1xUsz"
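
The error message explains the failure: the 16-character secret is valid base64, so it is decoded to 12 bytes, which is not a valid AES key size. The sketch below approximates that check in Python. It is an assumption based only on the log message above, not the oauth-proxy source; the helper names (`effective_secret_bytes`, `is_valid_cookie_secret`, `generate_cookie_secret`) are hypothetical.

```python
import base64
import binascii
import secrets

VALID_AES_KEY_SIZES = (16, 24, 32)


def effective_secret_bytes(secret: str) -> bytes:
    """Bytes the proxy is assumed to use: the base64-decoded value if decoding succeeds, else the raw string."""
    try:
        # Accept both standard and URL-safe alphabets; reject anything else.
        return base64.b64decode(secret.encode("ascii"), altchars=b"-_", validate=True)
    except (binascii.Error, ValueError):
        return secret.encode("utf-8")


def is_valid_cookie_secret(secret: str) -> bool:
    """True if the effective secret length is usable as an AES key (16, 24, or 32 bytes)."""
    return len(effective_secret_bytes(secret)) in VALID_AES_KEY_SIZES


def generate_cookie_secret() -> str:
    """One possible safe choice: 24 random bytes encode to exactly 32 base64 characters without padding.

    Raw length (32) and decoded length (24) are both valid key sizes,
    so the secret passes whether or not it is base64 decoded.
    """
    return base64.urlsafe_b64encode(secrets.token_bytes(24)).decode("ascii")


failing = "OHlg3zWHYfP1xUsz"  # secret from the failing pod above
print(len(failing), len(effective_secret_bytes(failing)))  # 16 12
print(is_valid_cookie_secret(failing))                     # False
print(is_valid_cookie_secret(generate_cookie_secret()))    # True
```

Under these assumptions, any generated secret that consists only of base64 characters needs a decoded length of 16, 24, or 32 bytes, not merely a string length of 16.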

      The issue was observed in our upstream e2e test job for OCP:
      https://github.com/jaegertracing/jaeger-operator/pull/2342 
      https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/jaegertracing_jaeger-operator/2342/pull-ci-jaegertracing-jaeger-operator-main-jaeger-e2e-tests/1712058416983183360 

              rhn-support-iblancas Israel Blancas Alvarez
              rhn-support-ikanse Ishwar Kanse