Epic
Resolution: Unresolved
Normal
Configure observability for every integration among swatch services
To Do
33% To Do, 0% In Progress, 67% Done
Observability lets us see what happened to a specific request as it travels across all the swatch services.
A request starts when:
- a user calls our API
- a cron job runs
An integration is any point where one service communicates with another service, for example:
- an API calls another API owned by another swatch service
- an API sends a message to a Kafka topic, which is then consumed by another swatch service
In every integration, the traced messages should carry the same trace ID, so that when I query a trace ID in Splunk, I see the messages logged by all the swatch services involved in that request.
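To make the "same trace ID on every hop" requirement concrete, here is a minimal sketch of how a service could reuse an incoming trace ID when it calls another service. It assumes the W3C Trace Context `traceparent` header format (`version-traceid-parentid-flags`); this ticket does not say which header format the swatch services actually use, and the function names are hypothetical.

```python
import re
import secrets

# W3C Trace Context header: version-traceid-parentid-flags, lowercase hex.
# Assumption: the services exchange this header; the ticket does not confirm it.
TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_traceparent() -> str:
    """Start a new trace, e.g. for a user call or cron job with no incoming header."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def trace_id_of(traceparent: str) -> str:
    """Return the 32-character trace ID, or raise ValueError if malformed."""
    match = TRACEPARENT.match(traceparent)
    if match is None:
        raise ValueError(f"malformed traceparent: {traceparent!r}")
    return match.group(2)


def outgoing_headers(incoming: dict) -> dict:
    """Build headers for a downstream call (hypothetical helper).

    The trace ID is reused from the incoming request; only the span
    (parent) ID changes between hops. A new trace is started only when
    the incoming header is missing or malformed.
    """
    match = TRACEPARENT.match(incoming.get("traceparent", ""))
    if match is None:
        return {"traceparent": new_traceparent()}
    version, trace_id, _parent_id, flags = match.groups()
    return {"traceparent": f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"}
```

With this shape, a lost trace ID shows up as a hop whose `trace_id_of(...)` differs from the one on the request that triggered it.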
Note that most of the above integrations should already propagate the observability trace ID out of the box, but I spotted some integrations where we missed it:
- when processing tally summaries, the Kafka stream does not propagate the trace ID headers: SWATCH-2909
- when the swatch producer aws service calls the contracts API, a new trace ID is created instead of reusing the one coming from swatch producer aws.
- ...
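As an illustration of the Kafka case, the sketch below copies trace headers from a consumed record onto the record a processing step produces. It models Kafka headers as a list of `(key, bytes)` pairs; the header keys and the helper name are assumptions for illustration, not the actual fix tracked in SWATCH-2909.

```python
from typing import List, Optional, Tuple

Header = Tuple[str, bytes]

# Hypothetical set of header keys that carry trace context.
TRACE_HEADER_KEYS = {"traceparent", "tracestate"}


def carry_trace_headers(
    consumed: List[Header], produced: Optional[List[Header]] = None
) -> List[Header]:
    """Return the headers for a produced record, inheriting the trace context.

    If the consumed record carries trace headers, they replace any trace
    headers already set on the produced record, so the trace continues
    instead of restarting. Other headers on the produced record are kept.
    """
    produced = list(produced or [])
    inherited = [h for h in consumed if h[0] in TRACE_HEADER_KEYS]
    if not inherited:
        # Nothing to inherit: leave the produced headers untouched.
        return produced
    kept = [h for h in produced if h[0] not in TRACE_HEADER_KEYS]
    return kept + inherited
```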
See the following diagrams, which show the workflows / integrations we should revisit: https://miro.com/app/board/uXjVLZZFmEc=/?share_link_id=537929536143
Each of these integrations should:
- Use the same trace ID
- Be covered by iqe tests that confirm it
- Open question: do we also want to support heartbeats? We could write a status endpoint that shows a semaphore and a throughput rate for each of these integrations
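If we do go for heartbeats, the data behind such a status endpoint could look roughly like the sketch below. The class name, the integration label, the colour thresholds and the sliding-window approach are all assumptions made for illustration; nothing here is an agreed design.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class IntegrationHeartbeats:
    """Track a semaphore colour and throughput per integration (hypothetical)."""

    def __init__(self, window_seconds: float = 60.0, stale_after_seconds: float = 300.0):
        self.window_seconds = window_seconds
        self.stale_after_seconds = stale_after_seconds
        self._events: Dict[str, Deque[float]] = defaultdict(deque)
        self._last_seen: Dict[str, float] = {}

    def record(self, integration: str, now: Optional[float] = None) -> None:
        """Register one message or call observed on the given integration."""
        now = time.monotonic() if now is None else now
        self._events[integration].append(now)
        self._last_seen[integration] = now

    def status(self, integration: str, now: Optional[float] = None) -> dict:
        """Return the semaphore colour and events per second over the window."""
        now = time.monotonic() if now is None else now
        events = self._events[integration]
        # Drop events that fell out of the sliding window.
        while events and now - events[0] > self.window_seconds:
            events.popleft()
        last = self._last_seen.get(integration)
        if last is None or now - last > self.stale_after_seconds:
            light = "red"      # never seen, or silent for too long
        elif now - last > self.window_seconds:
            light = "yellow"   # quiet, but not yet considered stale
        else:
            light = "green"    # traffic seen within the window
        return {"semaphore": light, "per_second": len(events) / self.window_seconds}
```

A status endpoint could then serialize `status(...)` for each known integration. Passing `now` explicitly keeps the class easy to test without sleeping.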
Acceptance Criteria
- Find all the integrations where the trace ID is lost, and create a ticket to fix each one.
- Each integration should have a test verifying that we send/capture the trace ID, so we don't break this functionality in the future (when bumping dependencies, for example)
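One possible shape for such a check, sketched under assumptions: given log records collected for a workflow (for example, exported from a log query), report which of the expected services never logged the trace ID. The record fields `service` and `trace_id` and the service names in the usage example are hypothetical; the real iqe tests may gather and structure this data differently.

```python
from typing import Dict, Iterable, Set


def services_missing_from_trace(
    records: Iterable[Dict[str, str]],
    trace_id: str,
    expected_services: Iterable[str],
) -> Set[str]:
    """Return the expected services that logged nothing under the given trace ID.

    An empty result means every service in the workflow saw the same trace ID.
    """
    seen = {r["service"] for r in records if r.get("trace_id") == trace_id}
    return set(expected_services) - seen
```

For example:

```python
records = [
    {"service": "service-a", "trace_id": "abc"},
    {"service": "service-b", "trace_id": "xyz"},  # trace ID was lost on this hop
]
missing = services_missing_from_trace(records, "abc", ["service-a", "service-b"])
# missing contains only "service-b"
```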