Uploaded image for project: 'RHEL'
  1. RHEL
  2. RHEL-7983

[xfstests/nfs generic/650] kernel BUG at fs/nfsd/xdr4.h:780!

    • rhel-sst-filesystems
    • ssg_filesystems_storage_and_HA
    • 3
    • False
    • Hide

      None

      Show
      None
    • None
    • None
    • None
    • None
    • If docs needed, set a value
    • None

      Description of problem:
      The latest RHEL-9 kernel crashes when running xfstests generic/650 on nfs. We have 2 reproduction so far, one is on -336.el9 and the other is on -339.el9, but unfortunately no vmcore has been saved. We have loop run generic/650 for hundreds times, but 0 of them reproduced this bug.

      log
      ```
      [51935.482019] XFS (vda3): Filesystem has been shut down due to log error (0x2).
      [51935.482020] XFS (vda3): Please unmount the filesystem and rectify the problem(s).
      [51935.482550] vda3: writeback error on inode 25320400, offset 2097152, sector 58684120
      [51935.497820] -----------[ cut here ]-----------
      [51935.497821] kernel BUG at fs/nfsd/xdr4.h:780!
      [51935.497828] invalid opcode: 0000 1 PREEMPT SMP PTI
      [51935.512241] CPU: 3 PID: 56031 Comm: nfsd Kdump: loaded Not tainted 5.14.0-339.el9.x86_64 #1
      [51935.515313] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [51935.517807] RIP: 0010:do_open_lookup+0x1e3/0x1f0 [nfsd]
      [51935.520261] smpboot: CPU 0 is now offline
      [51935.530115] Code: 5d 41 5e c3 cc cc cc cc 8b 16 48 8b bb 08 01 00 00 48 83 c6 04 89 57 58 48 83 c7 5c e8 86 a8 f3 d6 48 8b 75 00 e9 4d ff ff ff <0f> 0b b8 00 00 27 18 e9 0e ff ff ff 90 0f 1f 44 00 00 41 57 41 56
      [51935.530119] RSP: 0018:ffffb7454099bdc8 EFLAGS: 00010246
      [51935.530122] RAX: 0000000000000000 RBX: ffff88e504d12280 RCX: 0000000000000070
      [51935.530124] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
      [51935.530125] RBP: ffffb7454099bdf8 R08: ffffb7454099bcf4 R09: 0000000000000000
      [51935.530126] R10: ffff88e51a8af1e0 R11: 0000000000000000 R12: ffff88e527e1c028
      [51935.530128] R13: ffff88e503bb8000 R14: ffff88e503bb8000 R15: ffffffff99c65d40
      [51935.530130] FS: 0000000000000000(0000) GS:ffff88e53bd80000(0000) knlGS:0000000000000000
      [51935.530137] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [51935.530139] CR2: 00007f23ab5fb000 CR3: 0000000110356005 CR4: 00000000000206e0
      [51935.530141] Call Trace:
      [51935.530143] <TASK>
      [51935.530146] ? show_trace_log_lvl+0x1c4/0x2df
      [51935.530152] ? show_trace_log_lvl+0x1c4/0x2df
      [51935.530156] ? nfsd4_open+0x26f/0x4b0 [nfsd]
      [51935.530193] ? __die_body.cold+0x8/0xd
      [51935.530196] ? die+0x2b/0x50
      [51935.530200] ? do_trap+0xce/0x120
      [51935.530201] ? do_open_lookup+0x1e3/0x1f0 [nfsd]
      [51935.530236] ? do_error_trap+0x65/0x80
      [51935.530238] ? do_open_lookup+0x1e3/0x1f0 [nfsd]
      [51935.530271] ? exc_invalid_op+0x4e/0x70
      [51935.530274] ? do_open_lookup+0x1e3/0x1f0 [nfsd]
      [51935.530308] ? asm_exc_invalid_op+0x16/0x20
      [51935.530312] ? do_open_lookup+0x1e3/0x1f0 [nfsd]
      [51935.530345] ? do_open_lookup+0x169/0x1f0 [nfsd]
      [51935.530391] nfsd4_open+0x26f/0x4b0 [nfsd]
      [51935.530426] nfsd4_proc_compound+0x44e/0x700 [nfsd]
      [51935.608662] nfsd_dispatch+0x53/0x170 [nfsd]
      [51935.610697] svc_process_common+0x357/0x640 [sunrpc]
      [51935.612865] ? nfsd_svc+0x170/0x170 [nfsd]
      [51935.614767] ? nfsd_shutdown_threads+0xa0/0xa0 [nfsd]
      [51935.616983] svc_process+0x12d/0x180 [sunrpc]
      [51935.618969] nfsd+0xd5/0x190 [nfsd]
      [51935.620685] kthread+0xe0/0x100
      [51935.622334] ? kthread_complete_and_exit+0x20/0x20
      [51935.624482] ret_from_fork+0x22/0x30
      [51935.626210] </TASK>
      [51935.627577] Modules linked in: loop dm_mod tls rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs rpcrdma rdma_cm iw_cm ib_cm ib_core nfsd auth_rpcgss nfs_acl lockd grace snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core kvm_intel snd_hwdep qxl rfkill sunrpc snd_pcm drm_ttm_helper kvm ttm irqbypass drm_kms_helper snd_timer syscopyarea snd sysfillrect soundcore i2c_piix4 virtio_balloon pcspkr joydev sysimgblt drm fuse xfs libcrc32c ata_generic crct10dif_pclmul virtio_net crc32_pclmul ata_piix net_failover crc32c_intel virtio_console failover virtio_blk libata ghash_clmulni_intel serio_raw
      ```

      reproduction on -336.el9
      https://beaker.engineering.redhat.com/recipes/14196563
      reproduction on -339.el9
      https://beaker.engineering.redhat.com/recipes/14235733

      Version-Release number of selected component (if applicable):
      kernel-5.14.0-339.el9

      How reproducible:
      2 of hundreds runs

      Steps to Reproduce:
      1. Run xfstests generic/650 on nfs
      2.
      3.

      Actual results:
      Kernel crash

      Expected results:
      No kernel crash

      Additional info:
      N/A

              jlayton@redhat.com Jeffrey Layton (Inactive)
              rhn-support-bxue Boyang Xue
              Jeffrey Layton Jeffrey Layton (Inactive)
              Boyang Xue Boyang Xue
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Created:
                Updated:
                Resolved: