Feature Request
Resolution: Unresolved
Normal
openshift-4.14.z, openshift-4.15.z
1. Proposed title of this feature request
Add script to get inotify max user watches to "textfile" exporter plugin for node_exporter
2. What is the nature and description of the request?
Currently there is no way to see how many inotify watches are in use relative to the fs.inotify.max_user_watches limit.
Normally all node statistics are exposed via node-exporter, but node-exporter cannot collect anything that requires root privileges:
https://github.com/prometheus/node_exporter/issues/866
There was an attempt to add a gauge that checks whether the fs.inotify.max_user_watches limit has been reached, but it cannot track anything that does not run under the nobody user, so the attempt was eventually closed without being merged:
https://github.com/prometheus/node_exporter/pull/988
It looks like the Prometheus team created this pull request as a sort of workaround for the "can't run as root" issue:
https://github.com/prometheus/node_exporter/pull/1186
The solution was to create a "textfile" exporter plugin for node_exporter: https://github.com/prometheus/node_exporter#textfile-collector
This is in fact enabled by default in the OpenShift distribution of Prometheus. Check the node exporter pod YAML and you can see:
- '--collector.textfile.directory=/var/node_exporter/textfile'
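For context, the textfile collector exposes metrics from any *.prom file found in that directory, so a script only has to drop a file there. A minimal sketch of doing that safely is shown here: it writes to a temporary file and renames it, so the collector never reads a half-written file. The function name write_prom_file and the file name inotify.prom are hypothetical, chosen for illustration only.

```python
import os
import tempfile

# Assumption: the directory passed to --collector.textfile.directory.
TEXTFILE_DIR = "/var/node_exporter/textfile"


def write_prom_file(directory: str, name: str, body: str) -> str:
    """Write `body` to <directory>/<name> using a temp file plus rename."""
    final_path = os.path.join(directory, name)
    # The temp file ends in ".tmp", so it does not match *.prom while it
    # is still being written.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix="." + name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(body)
        # mkstemp creates the file with mode 0600; relax it so that an
        # unprivileged node_exporter process can read the result.
        os.chmod(tmp_path, 0o644)
        # os.replace is an atomic rename when both paths are on the same
        # filesystem, which holds here because they share a directory.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

A privileged job on the node could then call write_prom_file(TEXTFILE_DIR, "inotify.prom", metrics_text) on a timer, while node_exporter itself keeps running unprivileged.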
However, we do not have the script for inotify max user watches. Inside my node exporter pod:
sh-4.4$ cd /var/node_exporter/textfile/
sh-4.4$ ls
boots.prom virt.prom
sh-4.4$ cat boots.prom
# HELP node_boots_total reports a single series which is the number of times this system has been booted excluding the current boot. If the value is zero, this is the first time the system has booted. The value is always non-negative.
# TYPE node_boots_total counter
node_boots_total{} 1
sh-4.4$ cat virt.prom
# HELP virt_platform reports one series per detected virtualization type. If no type is detected, the type is "none".
# TYPE virt_platform gauge
virt_platform{type="kvm"} 1
virt_platform{type="openstack"} 1
It seems this could be exposed by adding the inotify-instances script, so that its output lands as a .prom file in /var/node_exporter/textfile.
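As a rough illustration of what such a script has to do (this is a sketch, not the upstream inotify-instances script), the code below counts "inotify wd:" lines in /proc/<pid>/fdinfo/*, groups them by the UID that owns each process, and renders the result in the same text format as boots.prom. The metric names node_inotify_user_watches and node_inotify_max_user_watches and all function names are hypothetical. It has to run as root to see other users' processes, and it may overcount when several processes share one inotify file descriptor.

```python
import os
from collections import Counter


def count_watches_by_uid(proc_root="/proc"):
    """Count inotify watches per owning UID by scanning fdinfo files."""
    watches = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        pid_dir = os.path.join(proc_root, entry)
        fdinfo_dir = os.path.join(pid_dir, "fdinfo")
        try:
            uid = os.stat(pid_dir).st_uid
            fd_names = os.listdir(fdinfo_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd_name in fd_names:
            try:
                with open(os.path.join(fdinfo_dir, fd_name)) as f:
                    # Each watch on an inotify fd is one "inotify wd:" line.
                    n = sum(1 for line in f if line.startswith("inotify wd:"))
            except OSError:
                continue
            if n:
                watches[uid] += n
    return watches


def read_max_user_watches(proc_root="/proc"):
    """Read the fs.inotify.max_user_watches sysctl from procfs."""
    path = os.path.join(proc_root, "sys", "fs", "inotify", "max_user_watches")
    with open(path) as f:
        return int(f.read().strip())


def render_metrics(watches, limit):
    """Render the counts in Prometheus text exposition format."""
    lines = [
        "# HELP node_inotify_user_watches Inotify watches in use per user ID.",
        "# TYPE node_inotify_user_watches gauge",
    ]
    for uid in sorted(watches):
        lines.append('node_inotify_user_watches{uid="%d"} %d' % (uid, watches[uid]))
    lines += [
        "# HELP node_inotify_max_user_watches Value of fs.inotify.max_user_watches.",
        "# TYPE node_inotify_max_user_watches gauge",
        "node_inotify_max_user_watches %d" % limit,
    ]
    return "\n".join(lines) + "\n"
```

Run as root on a node, render_metrics(count_watches_by_uid(), read_max_user_watches()) yields text that could be saved as a .prom file in the textfile directory. Grouping by UID reflects that the limit applies per user rather than per process.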
3. Why does the customer need this? (List the business requirements here)
This would allow users and customers to create custom alerts for their cluster that fire when they are in danger of hitting the fs.inotify.max_user_watches limit and need to increase it.
4. List any affected packages or components.
Prometheus
Node-exporter