OpenShift Request For Enhancement / RFE-5563

Add script to get inotify max user watches to "textfile" exporter plugin for node_exporter


    • Feature Request
    • Resolution: Unresolved
    • Normal
    • openshift-4.14.z, openshift-4.15.z
    • Monitoring

      1. Proposed title of this feature request

      Add script to get inotify max user watches to "textfile" exporter plugin for node_exporter

      2. What is the nature and description of the request?

      Currently there is no way to see how much of the fs.inotify.max_user_watches limit is in use.

      Normally all node statistics are exposed via node-exporter, but node-exporter cannot collect anything that requires root privileges.

      https://github.com/prometheus/node_exporter/issues/866

      There was an attempt to add a gauge that checks whether the fs.inotify.max_user_watches limit has been reached, but it cannot track anything that does not run under the nobody user, so the pull request was eventually closed without being merged.

      https://github.com/prometheus/node_exporter/pull/988
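To illustrate why privileges matter here, the following is a minimal sketch of how inotify watches can be counted from /proc. It assumes the Linux fdinfo layout, where each watch appears as a line starting with "inotify wd:". The function names are hypothetical, the process owner is approximated by the owner of the /proc/<pid> directory, and inotify instances shared between processes would be counted more than once. Processes whose fdinfo cannot be read are silently skipped, which is exactly what limits an unprivileged collector.

```python
import os


def count_watches(fdinfo_text):
    """Count inotify watches listed in the text of one /proc/<pid>/fdinfo/<fd> file."""
    return sum(1 for line in fdinfo_text.splitlines()
               if line.startswith("inotify wd:"))


def watches_by_uid(proc_root="/proc"):
    """Sum inotify watches per process owner, skipping unreadable processes."""
    totals = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        pid_dir = os.path.join(proc_root, entry)
        fd_dir = os.path.join(pid_dir, "fdinfo")
        try:
            uid = os.stat(pid_dir).st_uid
            for fd in os.listdir(fd_dir):
                with open(os.path.join(fd_dir, fd)) as f:
                    n = count_watches(f.read())
                if n:
                    totals[uid] = totals.get(uid, 0) + n
        except OSError:
            # Permission denied (not root) or the process exited meanwhile.
            continue
    return totals
```

Run as an unprivileged user, `watches_by_uid()` would typically only see that user's own processes, so the totals for other users would be missing rather than zero.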

      It looks like the Prometheus team created this pull request as a sort of workaround for the "can't run as root" issue:
      https://github.com/prometheus/node_exporter/pull/1186

      The solution was to create a "textfile" exporter plugin for NodeExporter: https://github.com/prometheus/node_exporter#textfile-collector

      This is actually enabled by default in the OpenShift distribution of Prometheus. Check the node exporter pod YAML and you can see:

              - '--collector.textfile.directory=/var/node_exporter/textfile'
      However, we do not have the script for inotify max user watches. Inside my node exporter pod:

      sh-4.4$ cd /var/node_exporter/textfile/
      sh-4.4$ ls
      boots.prom  virt.prom
      sh-4.4$ cat boots.prom
      # HELP node_boots_total reports a single series which is the number of times this system has been booted excluding the current boot. If the value is zero, this is the first time the system has booted. The value is always non-negative.
      # TYPE node_boots_total counter
      node_boots_total{} 1
      sh-4.4$ cat virt.prom
      # HELP virt_platform reports one series per detected virtualization type. If no type is detected, the type is "none".
      # TYPE virt_platform gauge
      virt_platform{type="kvm"} 1
      virt_platform{type="openstack"} 1
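The .prom files above use the Prometheus text exposition format: a HELP line, a TYPE line, then one sample per line. As a rough sketch of what a new script would have to emit, the helper below renders a gauge in that format. The function names are hypothetical and not part of node_exporter.

```python
def _escape_label(value):
    """Escape backslash, double quote and newline in a label value."""
    return (str(value).replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n"))


def render_gauge(name, help_text, samples):
    """Render one gauge family as Prometheus text exposition format.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = ["# HELP {} {}".format(name, help_text),
             "# TYPE {} gauge".format(name)]
    for labels, value in samples:
        label_str = ",".join('{}="{}"'.format(k, _escape_label(v))
                             for k, v in sorted(labels.items()))
        lines.append("{}{{{}}} {}".format(name, label_str, value))
    return "\n".join(lines) + "\n"
```

For example, `render_gauge("virt_platform", "demo help", [({"type": "kvm"}, 1)])` yields the same three-line shape as virt.prom above, and an empty label dict produces the `name{} value` form seen in boots.prom.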

      It seems possible that this metric could be exposed by having the inotify-instances script write its output to /var/node_exporter/textfile.
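Whatever script produces the metric, its output has to land in the textfile directory as a complete *.prom file, since the collector may read the directory at any time. A common approach is to write to a temporary file and rename it into place. The sketch below assumes that pattern; the function name and the example file name are hypothetical.

```python
import os
import tempfile


def write_prom_atomically(directory, filename, content):
    """Write content to directory/filename via a temp file and an atomic rename."""
    if not filename.endswith(".prom"):
        raise ValueError("the textfile collector only reads *.prom files")
    # The temp file does not end in .prom, so a partial write is never scraped.
    fd, tmp_path = tempfile.mkstemp(dir=directory,
                                    prefix=filename + ".",
                                    suffix=".tmp")
    final_path = os.path.join(directory, filename)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(content)
        # mkstemp creates the file with mode 0600; make it readable by the
        # user that node_exporter runs as.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, final_path)  # atomic on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

A privileged job could call something like `write_prom_atomically("/var/node_exporter/textfile", "inotify.prom", text)` on a schedule, while node_exporter itself stays unprivileged and only reads the result.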

      3. Why does the customer need this? (List the business requirements here)

      This would allow users and customers to create custom alerts for their cluster if they are in danger of hitting the max for fs.inotify.max_user_watches and need to increase it.

      4. List any affected packages or components.

      Prometheus

      Node-exporter 

       

              rh-ee-rfloren Roger Florén
              rhn-support-cruhm Courtney Ruhm