  1. Network Observability
  2. NETOBSERV-1611

Unable to get the top N pods generating network traffic when querying a long time range

    • Bug
    • Resolution: Done
    • Normal
    • None
    • netobserv-1.5
    • Loki
    • False
    • None
    • False
    • NetObserv - Sprint 253, NetObserv - Sprint 254, NetObserv - Sprint 255, NetObserv - Sprint 256, NetObserv - Sprint 257
    • Moderate

      Description of problem:

      Getting the top N pods generating the most traffic from OpenShift Console > Observe > Network Traffic > Overview is not feasible: the query times out when Loki is used as the backend storage.

      Steps to Reproduce:

      1. Generate network traffic for several hours or days using https://github.com/jotak/hey-ho
      2. Change flowcollector.spec.loki.readTimeout to 3 minutes
      3. Remove all filters except those that exclude the traffic from the netobserv namespace (Not destination Namespace netobserv or Not source Namespace netobserv)
      4. Try to get the top N pods generating the most network traffic
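
Step 2 can be done from the command line. The following is a minimal sketch, assuming the FlowCollector resource is named `cluster` and that the installed API version exposes `spec.loki.readTimeout` as referenced above. The variable names `READ_TIMEOUT` and `PATCH` are illustrative only.

```shell
# Assumption: the FlowCollector resource is named "cluster" and the API
# version in use exposes spec.loki.readTimeout, as referenced in step 2.
READ_TIMEOUT="3m"

# Build a JSON merge patch that only touches the Loki read timeout.
PATCH=$(printf '{"spec":{"loki":{"readTimeout":"%s"}}}' "$READ_TIMEOUT")
echo "$PATCH"

# Apply the patch only when the oc client is available.
if command -v oc >/dev/null 2>&1; then
  oc patch flowcollector cluster --type=merge -p "$PATCH"
  # Read the value back to confirm the change.
  oc get flowcollector cluster -o jsonpath='{.spec.loki.readTimeout}{"\n"}'
else
  echo "oc not found; patch not applied" >&2
fi
```

A merge patch is used here so that the rest of the FlowCollector spec is left untouched.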
      

      Actual results:

      The query fails to return data for the top N pods generating the most network traffic, and the OCP Console returns a certificate error, which is visible in the Loki query frontend logs:
      
      ```
      $ oc logs loki-query-frontend-598df67449-4fn4k |grep -i cert
      2024-04-08T20:14:48.772572081Z 2024/04/08 20:14:48 http: TLS handshake error from 10.251.141.45:41716: remote error: tls: bad certificate
      2024-04-08T20:22:05.082471437Z 2024/04/08 20:22:05 http: TLS handshake error from 10.251.141.45:54802: remote error: tls: bad certificate
      2024-04-08T20:25:33.457956083Z 2024/04/08 20:25:33 http: TLS handshake error from 10.251.141.45:52494: remote error: tls: bad certificate
      ```
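
To see which client is presenting the rejected certificate, the handshake errors can be grouped by remote address. This is a rough triage sketch, not part of the original report: it works on the three log lines quoted above, and the variable names `logs` and `summary` are illustrative.

```shell
# Sample lines copied from the log excerpt above.
logs='2024-04-08T20:14:48.772572081Z 2024/04/08 20:14:48 http: TLS handshake error from 10.251.141.45:41716: remote error: tls: bad certificate
2024-04-08T20:22:05.082471437Z 2024/04/08 20:22:05 http: TLS handshake error from 10.251.141.45:54802: remote error: tls: bad certificate
2024-04-08T20:25:33.457956083Z 2024/04/08 20:25:33 http: TLS handshake error from 10.251.141.45:52494: remote error: tls: bad certificate'

# Extract the client IP from each handshake error, then count errors per IP.
summary=$(printf '%s\n' "$logs" \
  | sed -n -E 's/.*TLS handshake error from ([0-9.]+):[0-9]+:.*/\1/p' \
  | sort | uniq -c | awk '{print $2, $1}')

# Each output line has the form "<client IP> <number of errors>".
echo "$summary"
```

For the excerpt above this prints `10.251.141.45 3`, meaning all three errors come from the same client. On a live cluster the same pipeline could be fed from `oc logs` on the query frontend pod, and the address matched against the output of `oc get pods -A -o wide` to identify the calling pod.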

      Expected results:

      It succeeds.

      NOTES:

      The Network Observability Operator needs to support two use cases:

      1. Analyzing the traffic between pods, between pods and the external world, between namespaces, etc., where filters are set and the query is very specific.
      2. Finding out what is generating the most traffic over 1, 2, 3 or 4 weeks. This second use case is important in a cloud environment: no filters can be set, because the goal is to know what generates the most traffic over a long period of time, and in a cloud environment that traffic has a direct impact on cost.

              rhn-support-sarthoma Sara Thomas
              rhn-support-ocasalsa Oscar Casal Sanchez
              Mehul Modi Mehul Modi