RHEL-60028: NFS client TLS sock_close hang

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • rhel-9.4.z
    • rhel-sst-filesystems
    • ssg_filesystems_storage_and_HA
    • 5
    • False
    • Red Hat Enterprise Linux
    • x86_64

      TL;DR

      NetApp is running HA tests with kTLS enabled. When the LIF (IP) is moved to another host, I/O on the client hangs with the backtrace shown below. The same test without encryption succeeds.

      Please provide the package NVR for which the bug is seen:

      • ktls-utils-0.11-1.el9_4.x86_64
      • kernel 5.14.0-427.35.1.el9_4.x86_64
      • ONTAP 9.15.1 GA

      How reproducible is this bug?:

      100%

      Steps to reproduce

          1. Configure TLS on ONTAP and on the RHEL 9.4 client.
          2. Mount the NFS share using NFS over TLS.
          3. Start a copy from a local directory to the share with the Linux cp command:
                     cp -rf /home/test_data/* /mnt/nfs_share1/
          4. Migrate the LIFs from Node-1 to Node-2.
          5. The copy hangs. A reboot is needed to recover.
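The client-side steps above can be sketched as a small script. This is a minimal sketch, not the reporter's exact procedure: the server name `ontap-lif.example.com` and the export path `/share1` are placeholders that do not appear in the report, and the `xprtsec=tls` mount option and the `tlshd` service name are assumed to match the installed nfs-utils and ktls-utils packages. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Client-side reproduction sketch. SERVER and EXPORT are placeholders.
SERVER=${SERVER:-ontap-lif.example.com}   # hypothetical LIF hostname
EXPORT=${EXPORT:-/share1}                 # hypothetical export path
MNT=${MNT:-/mnt/nfs_share1}               # mount point from the report
SRC=${SRC:-/home/test_data}               # source directory from the report
DRY_RUN=${DRY_RUN:-1}                     # 1 = print only, 0 = execute

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

repro() {
    # tlshd (from ktls-utils) handles the TLS handshake for the kernel.
    run systemctl start tlshd
    # xprtsec=tls asks the NFS client to use RPC-with-TLS for this mount.
    run mount -t nfs -o xprtsec=tls "$SERVER:$EXPORT" "$MNT"
    # The copy from step 3; sh -c is used so the glob expands at run time.
    run sh -c "cp -rf $SRC/* $MNT/"
    echo "While the copy runs, migrate the LIF from Node-1 to Node-2 on ONTAP."
}

repro
```

The LIF migration itself happens on the ONTAP side and is left as a manual step here, since the report does not give the command used.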

      Actual results

      > [16957.040410] INFO: task cp:7183 blocked for more than 122 seconds.
      > [16957.040669]       Not tainted 5.14.0-427.35.1.el9_4.x86_64 #1
      > [16957.040786] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      > [16957.040947] task:cp              state:D stack:0     pid:7183  ppid:7066   flags:0x00004004
      > [16957.040956] Call Trace:
      > [16957.040958]  <TASK>
      > [16957.040964]  __schedule+0x21b/0x550
      > [16957.041016]  schedule+0x2d/0x70
      > [16957.041018]  io_schedule+0x42/0x70
      > [16957.041020]  folio_wait_bit+0xe9/0x200
      > [16957.041052]  ? find_get_pages_range_tag+0x199/0x1e0
      > [16957.041056]  ? __pfx_wake_page_function+0x10/0x10
      > [16957.041059]  folio_wait_writeback+0x28/0x80
      > [16957.041068]  __filemap_fdatawait_range+0x7b/0x110
      > [16957.041079]  filemap_write_and_wait_range+0x88/0xb0
      > [16957.041093]  nfs_wb_all+0x22/0x130 [nfs]
      > [16957.041301]  nfs_file_flush+0x63/0x80 [nfs]
      > [16957.041333]  filp_close+0x2f/0x70
      > [16957.041371]  __x64_sys_close+0xd/0x50
      > [16957.041373]  do_syscall_64+0x59/0x90
      > [16957.041386]  ? syscall_exit_work+0x103/0x130
      > [16957.041413]  ? syscall_exit_to_user_mode+0x22/0x40
      > [16957.041416]  ? do_syscall_64+0x69/0x90
      > [16957.041418]  ? do_syscall_64+0x69/0x90
      > [16957.041427]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
      > [16957.041442] RIP: 0033:0x7fc936e234a4
      > [16957.041565] RSP: 002b:00007ffd2ebb91c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
      > [16957.041572] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007fc936e234a4
      > [16957.041574] RDX: 00007ffd2ebb9660 RSI: 0000000000000004 RDI: 0000000000000004
      > [16957.041575] RBP: 00007ffd2ebb9450 R08: 0000000000000000 R09: 0000000000000000
      > [16957.041577] R10: 00000000016bb010 R11: 0000000000000246 R12: 0000000000402950
      > [16957.041578] R13: 00007ffd2ebba180 R14: 0000000000000000 R15: 0000000000000000
      > [16957.041582]  </TASK>
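When collecting further data for a hang like the one above, it can help to pull the blocked task names and PIDs out of the kernel log automatically. The following is a minimal sketch; the helper name `hung_tasks` is hypothetical, and it assumes the hung-task message keeps the wording shown in the log above.

```shell
#!/bin/sh
# Read kernel log lines on stdin and print "<task> <pid> <seconds>"
# for every hung-task report found.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9]*\) blocked for more than \([0-9]*\) seconds.*/\1 \2 \3/p'
}

# Demonstration using the first line of the report's log:
printf '%s\n' '[16957.040410] INFO: task cp:7183 blocked for more than 122 seconds.' | hung_tasks
# prints: cp 7183 122

# Typical use on an affected client (may need root):
#   dmesg | hung_tasks
```

With a PID in hand, `cat /proc/<pid>/stack` (as root) shows the task's current kernel stack, and `echo w > /proc/sysrq-trigger` dumps all blocked tasks to the kernel log, provided sysrq is enabled on the system.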

        1. nfs-hang.tar.gz (54.50 MB)
        2. forRHRstRepro3.tar.gz (85.45 MB)
        3. forRHRstRepro2.tar.gz (94.60 MB)
        4. forRHRstRepro.tar.gz (92.81 MB)
        5. forRHNoTLS.tar.gz (108.17 MB)
        6. forRHDetailed.tar.gz (70.16 MB)
        7. forRH41TLSsuccess.tar.gz (108.27 MB)
        8. forRH41TLSfail.tar.gz (110.84 MB)
        9. forRH.tar.gz (77.20 MB)
        10. forNetapp.tgz (166.60 MB)
        11. dm_log_after_time.txt (593 kB)

              Benjamin Coddington (bcodding@redhat.com)
              Nils Koenig (nilskoenigrh)
              Olga Kornievskaia
              Yongcheng Yang
              Votes: 0
              Watchers: 10