-
Task
-
Resolution: Done
-
Major
We need to put monitoring in place for the inventory so that, when issues arise, we can quickly determine the cause and address them.
As part of this, I think we need:
- a /health endpoint that reports the current health of the inventory
- metrics: resource metrics such as CPU, memory, and remaining disk space, as well as performance metrics such as average request time, number of agents being monitored, etc.
- useful debug logging (and an easy way to enable it)
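
To make the three items above more concrete, here is a rough sketch using only the Python standard library. It is not the inventory's actual implementation. Everything named here is an assumption made for illustration: the `INVENTORY_DEBUG` environment variable, the `Metrics` class, the JSON field names, the 1 GiB free-disk threshold, and the port. CPU and memory figures are left out because they would need either a third-party library or platform-specific calls.

```python
import json
import logging
import os
import shutil
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical switch: setting INVENTORY_DEBUG=1 enables debug logging.
DEBUG_ENV_VAR = "INVENTORY_DEBUG"


def configure_logging(environ=os.environ):
    """Return the 'inventory' logger, at DEBUG level if the switch is set."""
    level = logging.DEBUG if environ.get(DEBUG_ENV_VAR) == "1" else logging.INFO
    logging.basicConfig(level=level)
    logger = logging.getLogger("inventory")
    logger.setLevel(level)
    return logger


class Metrics:
    """In-memory counters; a real service would likely export these elsewhere."""

    def __init__(self):
        self.request_count = 0
        self.total_request_seconds = 0.0
        self.agents_monitored = 0

    def record_request(self, seconds):
        self.request_count += 1
        self.total_request_seconds += seconds

    def average_request_seconds(self):
        if self.request_count == 0:
            return 0.0
        return self.total_request_seconds / self.request_count

    def snapshot(self, disk_path="/"):
        usage = shutil.disk_usage(disk_path)
        return {
            "agents_monitored": self.agents_monitored,
            "request_count": self.request_count,
            "average_request_seconds": self.average_request_seconds(),
            "disk_free_bytes": usage.free,
        }


def health_report(metrics, min_free_bytes=1024 ** 3):
    """Build the /health payload; the threshold is an assumed example value."""
    snap = metrics.snapshot()
    status = "ok" if snap["disk_free_bytes"] >= min_free_bytes else "degraded"
    return {"status": status, "metrics": snap}


METRICS = Metrics()


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        start = time.monotonic()
        if self.path == "/health":
            body = json.dumps(health_report(METRICS)).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)
        # Time every request so the average can be reported later.
        METRICS.record_request(time.monotonic() - start)


def serve(host="127.0.0.1", port=8080):
    """Start the sketch server; host and port are placeholder values."""
    log = configure_logging()
    log.debug("debug logging enabled via %s", DEBUG_ENV_VAR)
    HTTPServer((host, port), HealthHandler).serve_forever()
```

Calling `serve()` and then requesting `http://127.0.0.1:8080/health` would return a small JSON document with a `status` field and the current metrics. Using an environment variable for the debug switch means logging can be turned on with a restart and no code change. A runtime toggle would be friendlier but needs more plumbing.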