  1. Network Observability
  2. NETOBSERV-1895

Must gather metrics - OCP 4.18

    • Type: Story
    • Resolution: Unresolved

      Add metrics to our own NetObserv must-gather.

      Also investigate documenting how to use the metrics generated by NetObserv to help debug the cluster in customer cases.

      Collect netobserv_ metrics

      $ oc adm must-gather -- gather_metrics --min-time=$(date --date='2 hours ago' +%s%3N) --match="{__name__=~\'netobserv_.*\'}" 
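The `--min-time` value in the command above is built with GNU `date`: `%s` gives seconds since the epoch and `%3N` appends the first three digits of the nanoseconds, so the result is a millisecond timestamp. A minimal sketch to check that value before starting the must-gather, assuming GNU coreutils (the variable names are only illustrative):

```shell
# GNU date only: %3N is not supported by BSD/macOS date.
min_time_ms=$(date --date='2 hours ago' +%s%3N)
now_ms=$(date +%s%3N)

# Sanity check: the collection window should be about 120 minutes wide.
window_min=$(( (now_ms - min_time_ms) / 60000 ))
echo "min-time=${min_time_ms} (window: ${window_min} minutes)"
```

Changing `'2 hours ago'` to another relative date accepted by GNU `date` widens or narrows the window.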

       

      Download Prometheus from https://prometheus.io/download/ to get both promtool and the Prometheus server

       

      Convert the metrics into TSDB blocks under prom_data/ using promtool

      $ promtool tsdb create-blocks-from openmetrics ./must-gather.local.4581356154815929587/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-083682ad8b215e737049e9561c9d61e670cb3b8979a64fbc1b5298b588a5596f/monitoring/metrics/metrics.openmetrics prom_data/ 
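The must-gather directory name contains a random suffix and an image digest, so the path above differs on every run. A possible way to avoid typing it is to search for the `metrics.openmetrics` file. This is only a sketch: `find_metrics_file` is a hypothetical helper, not part of must-gather or promtool, and it assumes a `find` that supports `-quit` (GNU findutils does).

```shell
# Hypothetical helper: print the first metrics.openmetrics file found under a directory.
find_metrics_file() {
  find "${1:-.}" -type f -name 'metrics.openmetrics' -print -quit
}

metrics_file=$(find_metrics_file .)
if [ -n "$metrics_file" ] && command -v promtool >/dev/null 2>&1; then
  # Same conversion as above, with the path discovered instead of typed.
  promtool tsdb create-blocks-from openmetrics "$metrics_file" prom_data/
else
  echo "metrics.openmetrics or promtool not found; nothing converted" >&2
fi
```

If several must-gather directories sit side by side, pass the one of interest as the argument instead of `.`.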

      Run a local Prometheus instance on the generated data to query the metrics

      $ ./prometheus --storage.tsdb.path="prom_data/" 
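Besides the web UI, the local instance can be queried from the command line through the Prometheus HTTP API. The sketch below assumes the server listens on the default `localhost:9090` and uses one of the node-level expressions from this issue. Note that an instant query is evaluated at the current time unless a `time` parameter is passed, so for an older must-gather the result may be empty.

```shell
prom_url='http://localhost:9090'   # assumed default listen address
query='topk(7, sum(rate(netobserv_node_egress_bytes_total{}[2m])) by (SrcK8S_HostName,DstK8S_HostName))'

# Only send the query if the server reports ready.
if curl -fsS -o /dev/null "${prom_url}/-/ready" 2>/dev/null; then
  curl -fsS -G "${prom_url}/api/v1/query" --data-urlencode "query=${query}"
else
  echo "Prometheus is not reachable at ${prom_url}" >&2
fi
```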

       

      Open http://localhost:9090/graph?g0.expr=topk(7%2C%20sum(rate(netobserv_node_egress_bytes_total%7B%7D%5B2m%5D))%20by%20(SrcK8S_HostName%2CDstK8S_HostName))&g0.tab=0&g0.display_mode=lines&g0.show_exemplars=0&g0.range_input=1h&g1.expr=topk(7%2C%20sum(rate(netobserv_node_ingress_bytes_total%7B%7D%5B2m%5D))%20by%20(SrcK8S_HostName%2CDstK8S_HostName))&g1.tab=0&g1.display_mode=lines&g1.show_exemplars=0&g1.range_input=1h&g2.expr=topk(7%2C%20(sum(rate(netobserv_workload_egress_bytes_total%7BK8S_FlowLayer%3D%22infra%22%2CSrcK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CDstK8S_Namespace))%20or%20(sum(rate(netobserv_workload_egress_bytes_total%7BK8S_FlowLayer%3D%22infra%22%2CDstK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CDstK8S_Namespace)))&g2.tab=0&g2.display_mode=lines&g2.show_exemplars=0&g2.range_input=1h&g3.expr=topk(7%2C%20(sum(rate(netobserv_workload_egress_bytes_total%7BK8S_FlowLayer%3D%22app%22%2CSrcK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CDstK8S_Namespace))%20or%20(sum(rate(netobserv_workload_egress_bytes_total%7BK8S_FlowLayer%3D%22app%22%2CDstK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CDstK8S_Namespace)))&g3.tab=0&g3.display_mode=lines&g3.show_exemplars=0&g3.range_input=1h&g4.expr=topk(7%2C%20(sum(rate(netobserv_workload_egress_bytes_total%7BK8S_FlowLayer%3D%22infra%22%2CSrcK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CSrcK8S_OwnerName%2CDstK8S_Namespace%2CDstK8S_OwnerName))%20or%20(sum(rate(netobserv_workload_egress_bytes_total%7BK8S_FlowLayer%3D%22infra%22%2CDstK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CSrcK8S_OwnerName%2CDstK8S_Namespace%2CDstK8S_OwnerName)))&g4.tab=0&g4.display_mode=lines&g4.show_exemplars=0&g4.range_input=1h&g5.expr=topk(7%2C%20(sum(rate(netobserv_workload_egress_bytes_total%7BK8S_FlowLayer%3D%22app%22%2CSrcK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CSrcK8S_OwnerName%2CDstK8S_Namespace%2CDstK8S_OwnerName))%20or%20(sum(rate(netobserv_workload_egress_bytes_total%7BK8S_FlowLayer%3D%22app%22%2CDstK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CSrcK8S_OwnerName%2CDstK8S_Namespace%2CDstK8S_OwnerName)))&g5.tab=0&g5.display_mode=lines&g5.show_exemplars=0&g5.range_input=1h&g6.expr=topk(7%2C%20(sum(rate(netobserv_workload_ingress_bytes_total%7BK8S_FlowLayer%3D%22infra%22%2CSrcK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CDstK8S_Namespace))%20or%20(sum(rate(netobserv_workload_ingress_bytes_total%7BK8S_FlowLayer%3D%22infra%22%2CDstK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CDstK8S_Namespace)))&g6.tab=0&g6.display_mode=lines&g6.show_exemplars=0&g6.range_input=1h&g7.expr=topk(7%2C%20(sum(rate(netobserv_workload_ingress_bytes_total%7BK8S_FlowLayer%3D%22app%22%2CSrcK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CDstK8S_Namespace))%20or%20(sum(rate(netobserv_workload_ingress_bytes_total%7BK8S_FlowLayer%3D%22app%22%2CDstK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CDstK8S_Namespace)))&g7.tab=0&g7.display_mode=lines&g7.show_exemplars=0&g7.range_input=1h&g8.expr=topk(7%2C%20(sum(rate(netobserv_workload_ingress_bytes_total%7BK8S_FlowLayer%3D%22infra%22%2CSrcK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CSrcK8S_OwnerName%2CDstK8S_Namespace%2CDstK8S_OwnerName))%20or%20(sum(rate(netobserv_workload_ingress_bytes_total%7BK8S_FlowLayer%3D%22infra%22%2CDstK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CSrcK8S_OwnerName%2CDstK8S_Namespace%2CDstK8S_OwnerName)))&g8.tab=0&g8.display_mode=lines&g8.show_exemplars=0&g8.range_input=1h&g9.expr=topk(7%2C%20(sum(rate(netobserv_workload_ingress_bytes_total%7BK8S_FlowLayer%3D%22app%22%2CSrcK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CSrcK8S_OwnerName%2CDstK8S_Namespace%2CDstK8S_OwnerName))%20or%20(sum(rate(netobserv_workload_ingress_bytes_total%7BK8S_FlowLayer%3D%22app%22%2CDstK8S_Namespace!%3D%22%22%7D%5B2m%5D))%20by%20(SrcK8S_Namespace%2CSrcK8S_OwnerName%2CDstK8S_Namespace%2CDstK8S_OwnerName)))&g9.tab=0&g9.display_mode=lines&g9.show_exemplars=0&g9.range_input=1h
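The link packs ten graph panels (`g0` to `g9`) into one URL, each with a percent-encoded PromQL expression. To read or reuse a single expression, it can be decoded by hand. The sketch below uses a hypothetical `decode_expr` helper that handles only the escapes occurring in this link (`%20`, `%2C`, `%7B`, `%7D`, `%5B`, `%5D`, `%3D`, `%22`); it is not a general URL decoder.

```shell
# Hypothetical helper: decode only the percent-escapes that appear in the link above.
decode_expr() {
  printf '%s\n' "$1" | sed \
    -e 's/%20/ /g' -e 's/%2C/,/g' \
    -e 's/%7B/{/g' -e 's/%7D/}/g' \
    -e 's/%5B/[/g' -e 's/%5D/]/g' \
    -e 's/%3D/=/g' -e 's/%22/"/g'
}

# Expression of the first panel (g0.expr), copied from the link.
decode_expr 'topk(7%2C%20sum(rate(netobserv_node_egress_bytes_total%7B%7D%5B2m%5D))%20by%20(SrcK8S_HostName%2CDstK8S_HostName))'
# prints: topk(7, sum(rate(netobserv_node_egress_bytes_total{}[2m])) by (SrcK8S_HostName,DstK8S_HostName))
```

The decoded expression can then be pasted into the Prometheus expression box on its own.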

       

              Assignee: Unassigned
              Reporter: Julien Pinsonneau (jpinsonn@redhat.com)