• Icon: Spike Spike
    • Resolution: Unresolved
    • Console Plugin

      To scale and retain data for longer periods while still being able to provide Top N data, the NetFlow data needs to be aggregated into various tables.  Aggregation involves identifying the key fields that make up a unique record and then collapsing all rows that share the same key values into one.  All of the fields are preserved.  The typical 5-tuple key fields are:

      • Src Address
      • Dest Address
      • Src Port
      • Dest Port
      • Protocol
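
The collapse step described above can be sketched as follows. This is only an illustration under stated assumptions, not the NetObserv implementation: the `Flow` record, its `byte_count` and `packet_count` counters, and the `aggregate` helper are hypothetical names, and summing the counters is an assumed way of merging rows.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    # Hypothetical flow record; the field names are illustrative only.
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: str
    byte_count: int
    packet_count: int


def aggregate(flows):
    """Collapse flows that share the same 5-tuple, summing their counters."""
    totals = defaultdict(lambda: [0, 0])
    for f in flows:
        key = (f.src_addr, f.dst_addr, f.src_port, f.dst_port, f.protocol)
        totals[key][0] += f.byte_count
        totals[key][1] += f.packet_count
    # One output row per distinct key, in first-seen order.
    return [Flow(*key, b, p) for key, (b, p) in totals.items()]
```

Two rows with an identical 5-tuple become a single row whose counters are the sums of the originals; rows that differ in any key field stay separate.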

      Ideally, instead of Src Port and Dest Port, the key should contain just the Server Port (see NETOBSERV-214).  The reason is that including ephemeral ports significantly reduces the ability to aggregate: each access to the same web site uses a different client port and is therefore treated as a different record.
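
The effect of ephemeral ports on the number of records can be illustrated with a small sketch. The `server_port` heuristic here (take the lower-numbered port) is an assumption made purely for the example; how the server port is actually determined is the subject of NETOBSERV-214. The addresses and the `count_records` helper are likewise hypothetical.

```python
def server_port(src_port, dst_port):
    # Assumed heuristic for illustration: the lower-numbered port is the server port.
    return min(src_port, dst_port)


def count_records(connections, use_server_port):
    """Count distinct aggregation keys for a list of 5-tuples."""
    keys = set()
    for src_addr, dst_addr, src_port, dst_port, protocol in connections:
        if use_server_port:
            keys.add((src_addr, dst_addr, server_port(src_port, dst_port), protocol))
        else:
            keys.add((src_addr, dst_addr, src_port, dst_port, protocol))
    return len(keys)


# Three visits to the same site, each from a different ephemeral client port.
connections = [
    ("10.0.0.1", "192.0.2.10", 49152, 443, "TCP"),
    ("10.0.0.1", "192.0.2.10", 49153, 443, "TCP"),
    ("10.0.0.1", "192.0.2.10", 49154, 443, "TCP"),
]

print(count_records(connections, use_server_port=False))  # 3
print(count_records(connections, use_server_port=True))   # 1
```

With the full 5-tuple as the key, nothing collapses; keyed on the server port, the three visits become one record.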

      With aggregation, there will be a loss of data detail but not of data accuracy.  That is, you might not know exactly what time a flow occurred, only that it happened within some time period (e.g. a 15-minute window).
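
One way to picture the loss of time detail is to round each flow timestamp down to the start of its window. The sketch below assumes windows are aligned to the Unix epoch in UTC; the `window_start` helper is a hypothetical name, not part of any existing component.

```python
from datetime import datetime, timezone


def window_start(ts: datetime, window_seconds: int) -> datetime:
    """Round a timestamp down to the start of its aggregation window (UTC)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % window_seconds, tz=timezone.utc)


flow_time = datetime(2024, 1, 1, 10, 7, 30, tzinfo=timezone.utc)
print(window_start(flow_time, 15 * 60))  # 2024-01-01 10:00:00+00:00
```

After bucketing, a flow seen at 10:07:30 is only known to have occurred in the window starting at 10:00; its counters remain exact.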

      The proposed granularity is as follows:

      • 1-minute granularity for up to 4 hours
      • 15-minute granularity between 4 hours and 1 day
      • 1-hour granularity between 1 day and 4 days
      • 1-day granularity after 4 days

      These age thresholds and granularities can be made configurable options.
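
As a rough sketch of how the proposed tiers might be expressed as configuration, the table below maps a maximum data age to a granularity. The names `DEFAULT_TIERS` and `granularity_for` are hypothetical, and treating each upper bound as inclusive is an assumption, since the proposal does not say which tier a boundary value belongs to.

```python
from datetime import timedelta

# (maximum age, granularity); None means "no upper bound".
# Values mirror the proposed defaults and are meant to be overridable.
DEFAULT_TIERS = [
    (timedelta(hours=4), timedelta(minutes=1)),
    (timedelta(days=1), timedelta(minutes=15)),
    (timedelta(days=4), timedelta(hours=1)),
    (None, timedelta(days=1)),
]


def granularity_for(age, tiers=DEFAULT_TIERS):
    """Return the aggregation granularity that applies to data of the given age."""
    for max_age, granularity in tiers:
        if max_age is None or age <= max_age:
            return granularity
    raise ValueError("tiers must end with an unbounded (None) entry")


print(granularity_for(timedelta(hours=12)))  # 0:15:00
```

Passing a different `tiers` list changes the retention policy without touching the lookup logic.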

              Unassigned Unassigned
              stlee@redhat.com Steven Lee