RHEL-7915

file descriptor leak making autofs hit max open files soft limit

    • Bug
    • Resolution: Done
    • autofs
    • Important
    • rhel-sst-filesystems
    • ssg_filesystems_storage_and_HA

      Description of problem:

      One of my CentOS Stream 8 systems started failing to automount with:

      automount[2693]: open_fopen_r:212: failed to open file: Too many open files

      Version-Release number of selected component (if applicable):

      autofs-5.1.4-83.el8.x86_64

      How reproducible:

      Reproducible, but it takes months to reach the limit in normal usage

      Steps to Reproduce:

      Just running autofs on department clients with decently sized maps

      Actual results:

      autofs eventually stops working with the above error; at that point I find
      it has used up its soft limit of 20480 open files

      Expected results:

      autofs working normally with < 100 open file descriptors

      Additional info:

      After I upgraded my 100+ CentOS Stream 8 systems in September 2022, one system today reported:

      Dec 19 13:16:38 r01 automount[2693]: attempting to mount entry /autofs/space/sis_001
      Dec 19 13:16:38 r01 automount[2693]: open_fopen_r:212: failed to open file: Too many open files
      Dec 19 13:16:38 r01 automount[2693]: nsswitch_parse:172: couldn't open /etc/nsswitch.conf
      Dec 19 13:16:38 r01 automount[2693]: lookup_nss_mount: can't to read name service switch config.
      Dec 19 13:16:38 r01 automount[2693]: failed to mount /autofs/space/sis_001

      On this system I find:

      # cat /proc/sys/fs/file-max
      158269986
      # cat /proc/sys/fs/file-nr
      26304 0 158269986
      # sysctl fs.file-max
      fs.file-max = 158269986
      # systemctl show --property MainPID --value autofs
      2693
      # grep open.files /proc/2693/limits
      Max open files 20480 262144 files
      # \ls -l /proc/2693/fd | wc -l
      20481
      # wc -l /etc/mtab
      44 /etc/mtab
      # grep nfs /etc/mtab | wc -l
      9
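The `ls -l ... | wc -l` figure of 20481 includes the `total` header line that `ls -l` prints, so the process holds 20480 descriptors, exactly its soft limit. A minimal sketch of the same check as one reusable function follows. The name `fd_usage` and its optional second argument (an alternative proc root, there only so the function can be exercised against a fake tree) are assumptions for illustration, not part of autofs.

```shell
# Sketch: report a process's open fd count next to its soft "Max open files" limit.
# fd_usage is a hypothetical helper; $2 (proc root) defaults to /proc.
fd_usage() {
    local pid="$1"
    local root="${2:-/proc}"
    local count limit
    # ls -A lists one entry per descriptor and prints no "total" header.
    count=$(( $(ls -A "$root/$pid/fd" | wc -l) ))
    # Line format: "Max open files  <soft>  <hard>  files", so the soft limit is field 4.
    limit=$(awk '/^Max open files/ {print $4}' "$root/$pid/limits")
    echo "pid=$pid open_fds=$count soft_limit=$limit"
}

# Usage (as root, since /proc/<pid>/fd of another user's process is not readable otherwise):
# fd_usage "$(systemctl show --property MainPID --value autofs)"
```

On the failing system above this would be expected to report `open_fds` equal to `soft_limit`.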

      Surveying my other CentOS Stream 8 boxes, I find a roughly even (by eye) distribution of the number of fd's in /proc/*/fd for the autofs process, ranging from 0 to 20480. Most of these boxes have been up and running since September, though the higher numbers do seem to be correlated with longer uptime.

      Since September I have mainly been installing Rocky 8 on new machines. On these boxes and on my old CentOS 7 boxes, autofs has fewer than 100 fd's open in every case.

      Restarting autofs on a CentOS Stream 8 box immediately brings the fd count back down into the <100 range, and automounts start working again.
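Until the leak itself is fixed, the restart workaround could be triggered before the limit is actually hit. The following is only a sketch under stated assumptions: `near_fd_limit` is a hypothetical helper name and the 80% default threshold is an arbitrary choice. It masks the leak rather than curing it.

```shell
# Sketch: decide whether a process is close enough to its soft fd limit to restart it.
# near_fd_limit and the 80% default are hypothetical, not something autofs provides.
near_fd_limit() {
    local count="$1"
    local limit="$2"
    local percent="${3:-80}"
    # Integer arithmetic: succeeds when count/limit >= percent/100.
    [ $(( count * 100 )) -ge $(( limit * percent )) ]
}

# Usage (as root), reusing the commands from this report:
# pid=$(systemctl show --property MainPID --value autofs)
# count=$(ls -A "/proc/$pid/fd" | wc -l)
# limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
# near_fd_limit "$count" "$limit" && systemctl restart autofs
```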

      I note that on my Rocky 8 boxes the autofs version is slightly older than on the CentOS Stream 8 boxes, autofs-5.1.4-82.el8.x86_64. Most have been running over 30 days.

      Here is a CentOS Stream 8 box running for just 7 days:

      # uptime
      15:19:12 up 7 days, 3:05, 2 users, load average: 0.19, 0.12, 0.05
      # /bin/ls /proc/$(systemctl show --property MainPID --value autofs)/fd/ | wc -l
      745
      # /bin/ls -l /proc/$(systemctl show --property MainPID --value autofs)/fd/ | grep socket | wc -l
      674
      # /bin/ls -l /proc/$(systemctl show --property MainPID --value autofs)/fd/ | grep -v socket | wc -l
      72

      So that is still an abnormally large number of fd's for the autofs process. The excess fd's always appear to be sockets.
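Note that the `grep -v socket` counts include the `total` line printed by `ls -l`, which is why 674 + 72 is one more than the 745 entries listed without `-l`. A sketch that classifies descriptors by their symlink target avoids that off-by-one and also separates pipes and anonymous inodes from regular files. The name `fd_breakdown` and its optional proc-root argument are assumptions for illustration.

```shell
# Sketch: break a process's open descriptors down by kind.
# fd_breakdown is a hypothetical helper; $2 (proc root) defaults to /proc.
fd_breakdown() {
    local pid="$1"
    local root="${2:-/proc}"
    local fd target
    for fd in "$root/$pid/fd"/*; do
        # Each entry is a symlink; sockets read as "socket:[inode]", pipes as "pipe:[inode]".
        target=$(readlink "$fd") || continue
        case "$target" in
            socket:*)     echo socket ;;
            pipe:*)       echo pipe ;;
            anon_inode:*) echo anon_inode ;;
            *)            echo file ;;
        esac
    done | sort | uniq -c | sort -rn
}

# Usage (as root):
# fd_breakdown "$(systemctl show --property MainPID --value autofs)"
```

On an affected box the `socket` line would be expected to dominate and to grow with uptime.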

      For example on a Rocky box with 81 days uptime I have:

      # /bin/ls -l /proc/$(systemctl show --property MainPID --value autofs)/fd/ | grep socket | wc -l
      5
      # /bin/ls -l /proc/$(systemctl show --property MainPID --value autofs)/fd/ | grep -v socket | wc -l
      71

      Most CentOS Stream 8 boxes are running the 4.18.0-408.el8.x86_64 kernel, while the Rocky boxes are running 4.18.0-372.9.1.el8.x86_64.

              ikent@redhat.com Ian Kent
              raines_nmr raines@nmr.mgh.harvard.edu (Inactive)
              Ian Kent Ian Kent
              Kun Wang Kun Wang