Subscription Watch / SWATCH-2533

Event ingestion cannot rely on event_type as part of an Event's unique identifier while processing events


    • Bug
    • Resolution: Done
    • Major
    • 2024-08-26 - API
    • Critical

      Currently, event ingestion uses an EventKey object (not DB related) to map incoming events as they are processed, which prevents duplicate incoming events and helps with event conflict resolution. The key contains some fields that are not directly related to the usage represented by the event and, as such, can cause invalid amendments (negative measurement values) if multiple event sources send events for the same host instance.

      For example, multiple Prometheus event sources set a unique event_type per metric_id (snapshot_$PRODUCT_TAG_$METRIC_ID) when producing events. However, cost-management hard-codes the event_type to "snapshot" when sending event messages. If cost-management were to start sending events with different metric_ids, each with the same event_type, timestamp, instance_id, and org_id, some incoming events would be ignored, since ingestion takes a 'last in' approach to determine which events to process. This is based on a map keyed by EventKey, which currently contains event_type. Furthermore, event conflict resolution could also result in invalid amendments (negative measurement values).

      The EventKey should only contain fields that can determine uniqueness during processing. The key should represent the (org, instance_id, timestamp) tuple that the event's measurements are targeting.
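
The collision described above can be sketched in a few lines. This is an illustration only, written in Python for brevity: the Event fields, old_key, new_key, and the grouping step are hypothetical stand-ins, not the service's actual classes or ingestion logic.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    # Hypothetical, simplified event shape for illustration.
    org_id: str
    instance_id: str
    timestamp: datetime
    event_type: str
    metric_id: str
    value: float


def old_key(e: Event) -> tuple:
    # Key that includes event_type, as described in the problem statement.
    return (e.org_id, e.instance_id, e.timestamp, e.event_type)


def new_key(e: Event) -> tuple:
    # Proposed key: only the (org, instance_id, timestamp) tuple.
    return (e.org_id, e.instance_id, e.timestamp)


ts = datetime(2024, 8, 26, 12, 0, tzinfo=timezone.utc)
cost_events = [
    Event("org1", "i-1", ts, "snapshot", "Cores", 2.0),
    Event("org1", "i-1", ts, "snapshot", "Instance-hours", 4.0),
]

# 'Last in' map keyed by the old key: both events share one key,
# so the Cores event is silently overwritten.
last_in = {}
for e in cost_events:
    last_in[old_key(e)] = e

# Assumed alternative: group by the new key so every event targeting the
# same host and timestamp is kept for per-measurement conflict resolution.
grouped = defaultdict(list)
for e in cost_events:
    grouped[new_key(e)].append(e)
```

Here last_in ends up holding only the Instance-hours event, while grouped keeps both events under a single (org, instance_id, timestamp) key.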

      Because an event can have multiple tags and measurements, event conflict resolution needs to be done on a per-tag/measurement basis: an event source could send many possible combinations, each of which may directly affect snapshots and host totals during tally. Because of this, events should be flattened (single tag and measurement) before attempting conflict resolution, and persisted as such.

      For example, an event:
          Event([tag1, tag2], {cores: 2, instance-hours: 4})
      will be flattened to the following events:
          Event([tag1], {cores: 2})
          Event([tag1], {instance-hours: 4})
          Event([tag2], {cores: 2})
          Event([tag2], {instance-hours: 4})
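
The flattening step above amounts to taking the cross product of tags and measurements. A minimal sketch follows, assuming a hypothetical FlatEvent type and flatten helper; the real event model carries many more fields.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class FlatEvent:
    # Hypothetical flattened event: exactly one tag and one measurement.
    tag: str
    metric_id: str
    value: float


def flatten(tags: list, measurements: dict) -> list:
    """Return one FlatEvent per (tag, measurement) combination."""
    return [
        FlatEvent(tag, metric_id, value)
        for tag, (metric_id, value) in product(tags, measurements.items())
    ]


flat = flatten(["tag1", "tag2"], {"cores": 2, "instance-hours": 4})
for event in flat:
    print(event)
```

For two tags and two measurements this yields four flattened events, each keeping the measurement's original value.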

      Done:
          * EventKey uses only org, instance_id and timestamp to identify events that apply to the same host at a given point in time.
          * EventKey usage in the metering service should be removed, as it is no longer needed since duplicate events will be handled on ingestion.
          * Ingested events are flattened so that persisted EventRecords only contain a single tag and measurement.
          * Amendments are only required when the measurement value changes, OR one of hardware_type, sla, usage, billingProvider, billingAccountId changes.
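
The amendment rule in the last item can be read as a field-by-field comparison between the stored flattened event and the incoming one. The sketch below assumes a hypothetical FlatEvent shape and amendment_required helper; the snake_case field names stand in for the fields listed above and are not the service's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlatEvent:
    # Hypothetical flattened event carrying the fields named in the rule.
    metric_id: str
    value: float
    hardware_type: Optional[str] = None
    sla: Optional[str] = None
    usage: Optional[str] = None
    billing_provider: Optional[str] = None
    billing_account_id: Optional[str] = None


# A change in any of these fields means an amendment is needed.
AMENDMENT_FIELDS = (
    "value",
    "hardware_type",
    "sla",
    "usage",
    "billing_provider",
    "billing_account_id",
)


def amendment_required(existing: FlatEvent, incoming: FlatEvent) -> bool:
    """True if the measurement value or any listed descriptor field differs."""
    return any(
        getattr(existing, name) != getattr(incoming, name)
        for name in AMENDMENT_FIELDS
    )


stored = FlatEvent("cores", 2.0, sla="Premium")
same = FlatEvent("cores", 2.0, sla="Premium")
new_value = FlatEvent("cores", 4.0, sla="Premium")
new_sla = FlatEvent("cores", 2.0, sla="Standard")
```

Under these assumptions, an identical incoming event needs no amendment, while a changed value or a changed sla does.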

      QE:

      If we can send Event messages to the event topic during an IQE test, we should include a suite of tests that test ingestion/conflict resolution using this approach, along with the current tests that use the Event API. This is important because the event consumer contains logic to track events by message index before persisting them, which the Event API does not.

       

              mstead@redhat.com Michael Stead