Cluster Observability Operator · COO-1250

Korrel8r cannot access custom alerts


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: korrel8r
    • Quality / Stability / Reliability

      I deployed Cluster Observability Operator (COO) v1.2.2 on an OpenShift 4.18 cluster, and deployed the UIPlugin for the troubleshooting panel.

      I can see the nodes for a platform alert:

      korrel8rcli -o json neighbours -u https://korrel8r-openshift-cluster-observability-operator.apps.tsisodia-dev.51ty.p1.openshiftapps.com --query 'alert:alert:{"alertname":"PodDisruptionBudgetLimit"}'

      WARNING: partial result, search timed out

      {"edges":[{"goal":"k8s:PodDisruptionBudget.v1.policy","start":"alert:alert"},{"goal":"metric:metric","start":"alert:alert"},{"goal":"k8s:LokiStack.v1.loki.grafana.com","start":"k8s:Deployment.v1.apps"},{"goal":"k8s:LokiStack.v1.loki.grafana.com","start":"k8s:Pod.v1"},{"goal":"k8s:LokiStack.v1.loki.grafana.com","start":"k8s:PodDisruptionBudget.v1.policy"},{"goal":"k8s:LokiStack.v1.loki.grafana.com","start":"k8s:StatefulSet.v1.apps"},{"goal":"k8s:LokiStack.v1.loki.grafana.com","start":"k8s:DaemonSet.v1.apps"}],"nodes":[{"class":"k8s:LokiStack.v1.loki.grafana.com","count":2,"queries":[{"count":1,"query":"k8s:LokiStack.v1.loki.grafana.com:{\"namespace\":\"observability-hub\",\"name\":\"logging-loki\"}"},{"count":1,"query":"k8s:LokiStack.v1.loki.grafana.com:{\"namespace\":\"openshift-logging\",\"name\":\"logging-loki\"}"}]},{"class":"k8s:PodDisruptionBudget.v1.policy","count":2,"queries":[{"count":1,"query":"k8s:PodDisruptionBudget.v1.policy:{\"namespace\":\"observability-hub\",\"name\":\"logging-loki-ingester\"}"},{"count":1,"query":"k8s:PodDisruptionBudget.v1.policy:{\"namespace\":\"openshift-logging\",\"name\":\"logging-loki-ingester\"}"}]},{"class":"alert:alert","count":2,"queries":[{"count":2,"query":"alert:alert:{\"alertname\":\"PodDisruptionBudgetLimit\"}"}]},{"class":"metric:metric","count":80,"queries":[{"count":80,"query":"metric:metric:max by (namespace, poddisruptionbudget) (kube_poddisruptionbudget_status_current_healthy \u003c kube_poddisruptionbudget_status_desired_healthy)"}]}]}
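      The neighbours response above can be summarized locally; a minimal Python sketch (plain JSON post-processing, not a korrel8rcli or Korrel8r API, with the response abbreviated to two edges and two nodes from the output above):

```python
import json

# Abbreviated korrel8rcli neighbours response, taken from the output above.
response = json.loads('''
{"edges":[{"goal":"k8s:PodDisruptionBudget.v1.policy","start":"alert:alert"},
          {"goal":"metric:metric","start":"alert:alert"}],
 "nodes":[{"class":"alert:alert","count":2,
           "queries":[{"count":2,"query":"alert:alert:{\\"alertname\\":\\"PodDisruptionBudgetLimit\\"}"}]},
          {"class":"metric:metric","count":80,"queries":[]}]}
''')

# Summarize node classes and hit counts. A graph of {"edges":null,"nodes":null},
# as returned for the custom alert below, yields an empty summary: korrel8r found
# no correlations for the starting query.
nodes = response.get("nodes") or []
summary = {n["class"]: n["count"] for n in nodes}
print(summary)  # {'alert:alert': 2, 'metric:metric': 80}
```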

      However, the same query for a custom alert fails:

      korrel8rcli -o json neighbours -u https://korrel8r-openshift-cluster-observability-operator.apps.tsisodia-dev.51ty.p1.openshiftapps.com --query 'alert:alert:{"alertname":"AlertExampleContainerCrashing"}'

      {“edges”:null,“nodes”:null}

      The ConfigMap created by the troubleshooting panel contains the following data:
      oc get cm korrel8r -o yaml
      apiVersion: v1
      data:
        korrel8r.yaml: |
          # Default configuration for deploying Korrel8r as a service in an OpenShift cluster.
          # Store service URLs assume that stores are installed in their default locations.
          stores:
            - domain: k8s
            - domain: alert
              metrics: https://thanos-querier.openshift-monitoring.svc:9091
              alertmanager: https://alertmanager-main.openshift-monitoring.svc:9094
              certificateAuthority: ./run/secrets/kubernetes.io/serviceaccount/service-ca.crt
            - domain: log
              lokiStack: https://logging-loki-gateway-http.openshift-logging.svc:8080
              certificateAuthority: ./run/secrets/kubernetes.io/serviceaccount/service-ca.crt
            - domain: metric
              metric: https://thanos-querier.openshift-monitoring.svc:9091
              certificateAuthority: ./run/secrets/kubernetes.io/serviceaccount/service-ca.crt
            - domain: netflow
              lokiStack: https://loki-gateway-http.netobserv.svc:8080
              certificateAuthority: ./run/secrets/kubernetes.io/serviceaccount/service-ca.crt
            - domain: trace
              tempoStack: https://tempo-platform-gateway.openshift-tracing.svc.cluster.local:8080/api/traces/v1/platform/tempo/api/search
              certificateAuthority: ./run/secrets/kubernetes.io/serviceaccount/service-ca.crt
          include:
            - /etc/korrel8r/rules/all.yaml
      kind: ConfigMap

      It looks like Korrel8r does not have access to alerts from openshift-user-workload-monitoring.
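      If that is the cause, one possible direction (a sketch only, not a verified fix: the service name and port for the user-workload Alertmanager below are assumptions based on a default user-workload-monitoring setup, and the Korrel8r ServiceAccount would also need RBAC access to that namespace) would be an additional alert store in korrel8r.yaml pointing at the user-workload Alertmanager:

```yaml
# Hypothetical addition to korrel8r.yaml: a second alert store for
# user-workload monitoring (service name and port assumed, not verified).
stores:
  - domain: alert
    metrics: https://thanos-querier.openshift-monitoring.svc:9091
    alertmanager: https://alertmanager-user-workload.openshift-user-workload-monitoring.svc:9095
    certificateAuthority: ./run/secrets/kubernetes.io/serviceaccount/service-ca.crt
```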

              rhn-engineering-aconway Alan Conway
              jianrzha@redhat.com Jianrong Zhang