Details
Type: Task
Resolution: Done
Priority: Major
Description
We need to put in place monitoring for the inventory so that when there are issues we can quickly determine the cause and address them.
As part of this I think we need:
- a /health endpoint to give information about the current health of the inventory
- metrics: resource usage such as CPU, memory, and remaining disk space, as well as performance metrics such as average request time, number of agents being monitored, etc.
- useful debug logging (and an easy way to enable it)
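
To make the three items above concrete, here is a rough sketch of how they could fit together, using only the Python standard library. Nothing in it comes from the inventory codebase: the `Metrics` class, the `health_report` and `configure_logging` helpers, the `INVENTORY_DEBUG` environment variable, the 100 MiB free-disk threshold, and port 8080 are all hypothetical names and values chosen for illustration. CPU and memory figures are left out to stay within the standard library.

```python
import json
import logging
import os
import shutil
from http.server import BaseHTTPRequestHandler, HTTPServer

logger = logging.getLogger("inventory")  # hypothetical logger name


class Metrics:
    """In-memory counters for the performance metrics listed above."""

    def __init__(self):
        self.request_count = 0
        self.total_request_seconds = 0.0
        self.agents_monitored = 0

    def record_request(self, seconds):
        self.request_count += 1
        self.total_request_seconds += seconds

    def average_request_seconds(self):
        if self.request_count == 0:
            return 0.0
        return self.total_request_seconds / self.request_count


def configure_logging(env=os.environ):
    """Enable debug logging when INVENTORY_DEBUG=1 (assumed variable name)."""
    level = logging.DEBUG if env.get("INVENTORY_DEBUG") == "1" else logging.INFO
    logger.setLevel(level)
    return level


def health_report(metrics, path=".", min_free_bytes=100 * 1024 * 1024):
    """Build the /health payload; the threshold is an arbitrary example."""
    free = shutil.disk_usage(path).free
    return {
        "status": "ok" if free >= min_free_bytes else "degraded",
        "disk_free_bytes": free,
        "request_count": metrics.request_count,
        "average_request_seconds": metrics.average_request_seconds(),
        "agents_monitored": metrics.agents_monitored,
    }


class HealthHandler(BaseHTTPRequestHandler):
    metrics = Metrics()  # shared by all requests in this sketch

    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        body = json.dumps(health_report(self.metrics)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    logging.basicConfig()
    configure_logging()
    HTTPServer(("127.0.0.1", 8080), HealthHandler).serve_forever()
```

Run as a script, this would serve the report at http://127.0.0.1:8080/health, and starting it with INVENTORY_DEBUG=1 would switch the logger to debug level. A real implementation would more likely hook into whatever web framework and metrics system the inventory already uses.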