-
Story
-
Resolution: Won't Do
-
Critical
-
None
Prometheus query for UI:
sum by (instance)(((node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) + (node_memory_SwapTotal_bytes - node_memory_SwapFree_bytes)) / node_memory_MemTotal_bytes) *100
In human words: This is approximating how much over-committment of memory is taking place. A value of 100 means RAM+SWAP usage are 100% of system RAM capacity. 105% means RAM+SWAP are factor 105% of system RAM capacity.
Threshold: Yellow 95%, Red 105%
Based on: https://docs.google.com/document/d/1AbR1LACNMRU2QMqFpe-Se2mCEFLMqW_M9OPKh2v3yYw,
https://docs.google.com/document/d/1E1joajwxQChQiDVTsr9Qk_iIhpQkSI-VQP-o_BMx8Aw