Distributed Tracing / TRACING-833

Reproduce and capture debug logs from OSSM-104


    • Type: Task
    • Resolution: Obsolete
    • Priority: Major
    • Sprint: Tracing Sprint #33, Tracing Sprint #34

      OSSM-104 describes a scenario that I have so far been unable to reproduce, and I need some help from our QE.

      When the "Red Hat Service Mesh" operator is installed via OLM (Operator Hub) in an OpenShift cluster, the Jaeger Operator is installed along with it.

      Once a Service Mesh CR is applied, a Jaeger instance is provisioned as well, with the following CR:

      {
          "apiVersion":"jaegertracing.io/v1",
          "kind":"Jaeger",
          "metadata":{
              "annotations":{
                  "maistra.io/mesh-generation":"1.0.2-7.el8-2"
              },
              "labels":{
                  "app.kubernetes.io/component":"tracing",
                  "app.kubernetes.io/instance":"istio-system",
                  "app.kubernetes.io/managed-by":"maistra-istio-operator",
                  "app.kubernetes.io/name":"tracing",
                  "app.kubernetes.io/part-of":"istio",
                  "app.kubernetes.io/version":"1.0.2-7.el8-2",
                  "chart":"tracing",
                  "heritage":"Tiller",
                  "maistra.io/owner":"istio-system",
                  "release":"istio"
              },
              "name":"jaeger",
              "namespace":"istio-system",
              "ownerReferences":[
                  {
                      "apiVersion":"maistra.io/v1",
                      "blockOwnerDeletion":true,
                      "controller":true,
                      "kind":"ServiceMeshControlPlane",
                      "name":"full-install",
                      "uid":"9046e1ad-0576-11ea-9317-fa163e155851"
                  }
              ]
          },
          "spec":{
              "affinity":{
                  "nodeAffinity":{
                      "preferredDuringSchedulingIgnoredDuringExecution":[
                          {
                              "preference":{
                                  "matchExpressions":[
                                      {
                                          "key":"beta.kubernetes.io/arch",
                                          "operator":"In",
                                          "values":[
                                              "amd64"
                                          ]
                                      }
                                  ]
                              },
                              "weight":2
                          },
                          {
                              "preference":{
                                  "matchExpressions":[
                                      {
                                          "key":"beta.kubernetes.io/arch",
                                          "operator":"In",
                                          "values":[
                                              "ppc64le"
                                          ]
                                      }
                                  ]
                              },
                              "weight":2
                          },
                          {
                              "preference":{
                                  "matchExpressions":[
                                      {
                                          "key":"beta.kubernetes.io/arch",
                                          "operator":"In",
                                          "values":[
                                              "s390x"
                                          ]
                                      }
                                  ]
                              },
                              "weight":2
                          }
                      ],
                      "requiredDuringSchedulingIgnoredDuringExecution":{
                          "nodeSelectorTerms":[
                              {
                                  "matchExpressions":[
                                      {
                                          "key":"beta.kubernetes.io/arch",
                                          "operator":"In",
                                          "values":[
                                              "amd64",
                                              "ppc64le",
                                              "s390x"
                                          ]
                                      }
                                  ]
                              }
                          ]
                      }
                  }
              },
              "agent":null,
              "allInOne":{
                  "annotations":null,
                  "options":{
                      "log-level":"debug",
                      "query":{
                          "base-path":"/"
                      }
                  }
              },
              "ingress":{
                  "annotations":null,
                  "enabled":true,
                  "openshift":{
                      "htpasswdFile":"/etc/proxy/htpasswd/auth",
                      "sar":"{\"namespace\": \"istio-system\", \"resource\": \"pods\", \"verb\": \"get\"}"
                  },
                  "security":"oauth-proxy"
              },
              "resources":{
                  "limits":null,
                  "requests":{
                      "cpu":"10m",
                      "memory":"128Mi"
                  }
              },
              "storage":{
                  "options":{
                      "memory":{
                          "max-traces":50000
                      }
                  }
              },
              "strategy":"allInOne",
              "ui":{
                  "options":{
                      "dependencies":{
                          "menuEnabled":false
                      },
                      "menu":[
                          {
                              "items":[
                                  {
                                      "label":"Documentation",
                                      "url":"https://www.jaegertracing.io/docs/latest"
                                  },
                                  {
                                      "anchorTarget":"_self",
                                      "label":"Log Out",
                                      "url":"/oauth/sign_in"
                                  }
                              ],
                              "label":"About Jaeger"
                          }
                      ]
                  }
              },
              "volumeMounts":[
                  {
                      "mountPath":"/etc/proxy/htpasswd",
                      "name":"secret-htpasswd"
                  }
              ],
              "volumes":[
                  {
                      "name":"secret-htpasswd",
                      "secret":{
                          "secretName":"htpasswd"
                      }
                  }
              ]
          }
      }
      

      Once everything has stabilized, this is the final state of the pods. Note that the Jaeger pod has two containers (READY 2/2), one of them being the OAuth Proxy:

      $ oc get pods -n istio-system
      NAME                                      READY   STATUS    RESTARTS   AGE
      grafana-b67df64b6-9mqmn                   2/2     Running   0          12m
      istio-citadel-79979464d-5btml             1/1     Running   0          17m
      istio-egressgateway-7d897695c4-2f6cf      1/1     Running   0          13m
      istio-galley-6bb46858c5-htlpc             1/1     Running   0          16m
      istio-ingressgateway-8465bbf788-c4tv5     1/1     Running   0          13m
      istio-pilot-5d5bdb9556-qbzvt              2/2     Running   0          14m
      istio-policy-588f4565bc-79hxf             2/2     Running   0          15m
      istio-sidecar-injector-65cd4c8c6f-6mshr   1/1     Running   0          13m
      istio-telemetry-8484556489-hpc2s          2/2     Running   0          15m
      jaeger-57776787bc-t6m7t                   2/2     Running   0          16m
      kiali-584b4c7f7c-9zxp2                    1/1     Running   0          11m
      prometheus-b8bdc6b77-8xzcd                2/2     Running   0          16m
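
The READY column only shows a container count. As a rough sketch, the container names can be read directly from the deployment's pod template to check whether the OAuth Proxy is still there. The deployment name `jaeger` is inferred from the pod names above, and the function names are hypothetical helpers, not part of any tool.

```shell
# Sketch only: assumes a deployment named "jaeger" (inferred from the
# pod name prefix) and an "oc" client that is already logged in.

# Print the container names from the Jaeger deployment's pod template.
jaeger_containers() {
  ns="${1:-istio-system}"
  oc get deployment jaeger -n "$ns" \
    -o jsonpath='{.spec.template.spec.containers[*].name}'
}

# Succeed when the pod template holds more than one container, which in
# this scenario is taken to mean the OAuth Proxy sidecar is present.
has_oauth_proxy() {
  # Word-split the space-separated names into positional parameters.
  set -- $(jaeger_containers "$1")
  [ "$#" -gt 1 ]
}

# Usage:
#   has_oauth_proxy istio-system && echo "proxy present" || echo "proxy missing"
```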
      

      The problem is that after some time, without any intervention, the .Spec.Ingress.Security field of the Jaeger CR changes to "none", which causes the operator to reconcile the Jaeger deployment and remove the OAuth Proxy:

      $ kubectl get pods -n istio-system
      NAME                                      READY   STATUS    RESTARTS   AGE
      grafana-b67df64b6-5b7cm                   2/2     Running   0          5h14m
      istio-citadel-79979464d-g2895             1/1     Running   0          5h23m
      istio-egressgateway-7d897695c4-8gwdh      1/1     Running   0          5h16m
      istio-galley-6bb46858c5-7z5xt             1/1     Running   0          5h20m
      istio-ingressgateway-8465bbf788-9mgtn     1/1     Running   0          5h16m
      istio-pilot-5d5bdb9556-m6dg7              2/2     Running   0          5h17m
      istio-policy-588f4565bc-hktf4             2/2     Running   0          5h19m
      istio-sidecar-injector-65cd4c8c6f-gqmw8   1/1     Running   0          5h16m
      istio-telemetry-8484556489-btkwv          2/2     Running   0          5h19m
      jaeger-6966d9545b-xxkfz                   1/1     Running   0          4h43m
      kiali-6d6f9cf658-kz4h2                    1/1     Running   0          5h11m
      prometheus-b8bdc6b77-98r4m                2/2     Running   0          5h22m
      

      Note how both the replica set ID and the pod ID changed for the Jaeger deployment, indicating that the deployment resource itself was changed. For some reason, the route isn't updated to match, so the route and the deployment end up with different assumptions about whether the OAuth Proxy is present. The problem with the route will be tracked in a separate Jira issue.

      This task is about reproducing the problem, so that we understand where the CR change is coming from. So far, I haven't been able to reproduce it: "Security" never changed to "none" for me.
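
Since the change reportedly happens "after some time", one way to narrow down when it occurs is to poll the CR and record a timestamp as soon as the value differs from the expected one. The following is a minimal sketch, not something that was run for this issue. The function names and the 30-second default interval are illustrative assumptions; the CR name, namespace, and field path are taken from the CR shown above.

```shell
# Sketch only: polls the Jaeger CR named "jaeger" in istio-system.

# Print the current value of spec.ingress.security.
jaeger_security() {
  oc get jaeger jaeger -n istio-system \
    -o jsonpath='{.spec.ingress.security}'
}

# Poll until the value differs from the expected one, then print a UTC
# timestamp that can be matched against the operator logs.
watch_security() {
  expected="${1:-oauth-proxy}"
  interval="${2:-30}"   # seconds between polls (illustrative default)
  while :; do
    # Skip this round if the API call fails, to avoid a false positive.
    current="$(jaeger_security)" || { sleep "$interval"; continue; }
    if [ "$current" != "$expected" ]; then
      echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) security changed: $expected -> $current"
      return 0
    fi
    sleep "$interval"
  done
}

# Usage:
#   watch_security oauth-proxy 30 | tee security-change.log
```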

      This is how I tried to reproduce the problem with CRC, although it would be ideal to attempt the reproduction on the same infrastructure that rcernich1 is using.

      $ crc config set memory 16384
      $ crc config set cpus 6
      $ crc start
      
      $ crc console # install the Red Hat Service Mesh operator and wait for the operators to stabilize
      
      $ oc login ...
      $ oc new-project istio-system
      $ oc create -n istio-system -f https://raw.githubusercontent.com/Maistra/istio-operator/maistra-1.0/deploy/examples/maistra_v1_servicemeshcontrolplane_cr_full.yaml
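
After the control plane CR is created, it may help to wait for the Jaeger deployment to finish rolling out before observing it. A small sketch, assuming the deployment is named `jaeger` (inferred from the pod names); the helper name and the 10-minute default are hypothetical. Note that `oc rollout status` fails if the deployment does not exist yet, so it may need to be retried shortly after the CR is applied.

```shell
# Sketch only: blocks until the Jaeger deployment reports a completed
# rollout, or until the timeout expires.
wait_for_jaeger() {
  timeout="${1:-10m}"   # illustrative default
  oc rollout status deployment/jaeger -n istio-system --timeout="$timeout"
}

# Usage:
#   wait_for_jaeger 15m
```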
      

      Versions:

      $ oc version
      Client Version: openshift-clients-4.2.1-201910220950
      Server Version: 4.2.4
      Kubernetes Version: v1.14.6+dc8862c
      $ oc logs -n openshift-operators jaeger-operator-98dd965f5-hhnjb | head -n 1
      time="2019-11-14T16:09:22Z" level=info msg=Versions arch=amd64 jaeger-operator=v1.13.1.redhat8 operator-sdk=v0.8.1 os=linux version=go1.11.5
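
Because the operator pod name changes between installs, the logs can also be captured through the deployment rather than a specific pod. The sketch below saves timestamped operator logs to a file. The deployment name `jaeger-operator` is inferred from the pod name above, and the helper name, output path, and `1h` default are assumptions for illustration.

```shell
# Sketch only: writes recent Jaeger Operator logs to a timestamped file
# and prints the file name on success.
capture_operator_logs() {
  out_dir="${1:-.}"
  since="${2:-1h}"      # how far back to fetch (illustrative default)
  file="$out_dir/jaeger-operator-$(date -u +%Y%m%dT%H%M%SZ).log"
  oc logs -n openshift-operators deployment/jaeger-operator \
    --timestamps --since="$since" > "$file" && echo "$file"
}

# Usage:
#   capture_operator_logs /tmp 6h
```

Running this soon after the "Security" value changes would keep the log window close to the reconciliation that removed the OAuth Proxy.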
      

              kearls@redhat.com Kevin Earls (Inactive)
              jpkroehling@redhat.com Juraci Paixão Kröhling (Inactive)