RHEL-31515

nfsd: crash while testing a lock on a re-exported nfs v3 mount simultaneously from local system and client

    • kernel-4.18.0-553.20.1.el8_10
    • None
    • Moderate
    • TestCaseProvided
    • sst_filesystems
    • ssg_filesystems_storage_and_HA
    • 2
    • False
    • None
    • Red Hat Enterprise Linux
    • None
    • None

      What were you trying to do that didn't work?

      While taking and testing locks on a re-exported NFSv3 mount simultaneously from both the re-exporting system (which is both NFS client and NFS server) and a client that has mounted the re-exported filesystem, nfsd can crash the system in nlmclnt_setlockargs while servicing a LOCKT, due to a NULL file_lock->fl_file.

      Please provide the package NVR for which bug is seen:

      kernel-4.18.0-513.5.1.el8_9.x86_64

      How reproducible:

      easy, see reproducer

      Steps to reproduce

      reproducer requires 3 systems and attached test program

      system 1 (nfs server):
        # mkdir /exports
        # touch /exports/testfile
        /etc/exports:
          /exports *(rw,no_root_squash)
        # exportfs -av
      
      system 2 (nfs client + server):
        # mkdir /exports
        # mount system1:/exports /exports -o vers=3
        /etc/exports:
          /exports *(rw,no_root_squash,fsid=50)
        # exportfs -av
        copy test_lock_crash.c to /tmp
        # gcc /tmp/test_lock_crash.c -o /tmp/test_lock_crash
        # /tmp/test_lock_crash /exports/testfile
      
      system 3 (nfs client):
        # mkdir /mnt
        # mount system2:/exports /mnt -o vers=4
        copy test_lock_crash.c to /tmp
        # gcc /tmp/test_lock_crash.c -o /tmp/test_lock_crash
        # /tmp/test_lock_crash /mnt/testfile
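
The attached test_lock_crash.c is not reproduced in this report. As a rough illustration of the "lock and test" pattern the steps rely on, the sketch below loops over F_GETLK, F_SETLK and F_UNLCK on one file. It is an assumption about the general shape of such a program, not the attached reproducer: the helper names `whole_file_lock` and `lock_and_test` are hypothetical. On system 3 the F_GETLK call is the one expected to reach system 2 as a LOCKT.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: describe a lock of the given type over the whole file. */
struct flock whole_file_lock(short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fl;
}

/*
 * Hypothetical helper: test, take, and drop a write lock `iterations` times.
 * Returns how many times the lock was acquired, or -1 on an unexpected error.
 */
int lock_and_test(int fd, int iterations)
{
    int acquired = 0;

    for (int i = 0; i < iterations; i++) {
        /* F_GETLK only asks whether the lock could be placed. */
        struct flock fl = whole_file_lock(F_WRLCK);
        if (fcntl(fd, F_GETLK, &fl) == -1)
            return -1;

        /* Non-blocking attempt to take the lock. */
        fl = whole_file_lock(F_WRLCK);
        if (fcntl(fd, F_SETLK, &fl) == -1) {
            if (errno == EACCES || errno == EAGAIN)
                continue;       /* held by another process, try again */
            return -1;
        }
        acquired++;

        /* Release the lock so the other system can take it. */
        fl = whole_file_lock(F_UNLCK);
        if (fcntl(fd, F_SETLK, &fl) == -1)
            return -1;
    }
    return acquired;
}
```

A caller would open the test file with O_RDWR and pass the descriptor to `lock_and_test` with a large iteration count, running the same program on system 2 and system 3 at the same time so that the two sides contend for the lock.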
      

       

      Expected results

      no crash

      Actual results

      The nfsd process crashes while dereferencing a NULL pointer:

      [70489.133762] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
      [70489.133899] PGD 0 P4D 0 
      [70489.133935] Oops: 0000 [#1] SMP NOPTI
      [70489.133982] CPU: 10 PID: 49117 Comm: nfsd Kdump: loaded Not tainted 4.18.0-513.18.1.el8_9.x86_64 #1
      [70489.134077] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
      [70489.134185] RIP: 0010:nlmclnt_setlockargs+0x3a/0xf0 [lockd]
        

      The crash occurs in nlmclnt_setlockargs on the following instruction:

       0xffffffffc07590ba <nlmclnt_setlockargs+0x3a>:  mov    0x20(%rax),%rdx
      
          RAX: 0000000000000000  RBX: ffff947ae5cc7c00  RCX: ffff947ae5cc7c44
       PID: 49117    TASK: ffff947b17a44000  CPU: 10   COMMAND: "nfsd"
          [exception RIP: nlmclnt_setlockargs+0x3a]
          RIP: ffffffffc07590ba  RSP: ffffa052045efd38  RFLAGS: 00010286
      ...
       #8 [ffffa052045efd50] nlmclnt_proc at ffffffffc075935a [lockd]
       #9 [ffffa052045efda8] nfsd4_lockt at ffffffffc07a7443 [nfsd]
      #10 [ffffa052045efdf8] nfsd4_proc_compound at ffffffffc07936f1 [nfsd]
      #11 [ffffa052045efe58] nfsd_dispatch at ffffffffc077ecee [nfsd]
      #12 [ffffa052045efe80] svc_process_common at ffffffffc06b4320 [sunrpc]
      #13 [ffffa052045efed8] svc_process at ffffffffc06b4637 [sunrpc]
      #14 [ffffa052045efef0] nfsd at ffffffffc077e663 [nfsd]
      #15 [ffffa052045eff10] kthread at ffffffff8491eb44

      crashing instruction is on line 132 of fs/lockd/clntproc.c:

      125 static void nlmclnt_setlockargs(struct nlm_rqst *req, struct file_lock *fl)
      126 {
      127         struct nlm_args *argp = &req->a_args;
      128         struct nlm_lock *lock = &argp->lock;
      129         char *nodename = req->a_host->h_rpcclnt->cl_nodename;
      130 
      131         nlmclnt_next_cookie(&argp->cookie);
      132         memcpy(&lock->fh, NFS_FH(locks_inode(fl->fl_file)), sizeof(struct nfs_fh));

      the file_lock (fl) passed into the function is still in %rbp

           RBP: ffff947a8afa1bd8   R8: 0000000000000000   R9: 0000000000000000
      
      crash> file_lock.fl_file ffff947a8afa1bd8
        fl_file = 0x0,

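      The NULL 'struct file' pointer was then dereferenced as 'file->f_inode' inside locks_inode(). This is the load from 0x20(%rax) with RAX = 0, which matches the faulting address 0000000000000020.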
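
The address arithmetic can be checked with a small user-space model. The sketch below uses mock structures, not the kernel definitions: the padding that places `f_inode` at offset 0x20 is an assumption taken from the faulting instruction, and the `mock_*` names and `f_inode_load_address` are hypothetical. The guarded accessor only illustrates the missing NULL check. It is not the fix applied for this issue.

```c
#include <stddef.h>
#include <stdint.h>

/* Mock stand-ins for the kernel structures (layout is an assumption). */
struct mock_inode {
    int unused;
};

struct mock_file {
    unsigned char pad[0x20];        /* assumed: f_inode sits at offset 0x20 */
    struct mock_inode *f_inode;
};

struct mock_file_lock {
    struct mock_file *fl_file;      /* NULL in the crashing request */
};

/* Address that would be read to load fl->fl_file->f_inode. */
uintptr_t f_inode_load_address(const struct mock_file_lock *fl)
{
    return (uintptr_t)fl->fl_file + offsetof(struct mock_file, f_inode);
}

/* Guarded accessor for this model: returns NULL instead of faulting. */
struct mock_inode *mock_locks_inode_checked(const struct mock_file_lock *fl)
{
    if (fl->fl_file == NULL)
        return NULL;
    return fl->fl_file->f_inode;
}
```

With `fl_file` set to NULL, `f_inode_load_address` yields 0x20 in this model, the same value as the address reported in the oops.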

       

            bcodding@redhat.com Benjamin Coddington
            rhn-support-fsorenso Frank Sorenson
            NFS Team
            Yongcheng Yang