RHEL-65524

[RHEL-10] kernel-debug system takes over 20 minutes to boot with "kmemleak=on" on AWS r5ad.16xlarge

    • rhel-sst-virtualization-cloud
    • ssg_virtualization

      What were you trying to do that didn't work?

      Booting the kernel-debug system with "kmemleak=on" takes over 20 minutes.
      The issue disappears when "kmemleak=on" is removed from the kernel command line.

      [root@ip-10-116-2-16 ec2-user]# systemd-analyze
      Startup finished in 51.020s (kernel) + 17min 5.100s (initrd) + 3min 11.506s (userspace) = 21min 7.627s
      multi-user.target reached after 2min 41.832s in userspace.
      [root@ip-10-116-2-16 ec2-user]# cat /proc/cmdline 
      BOOT_IMAGE=(hd0,gpt3)/vmlinuz-6.11.0-26.el10.x86_64+debug root=UUID=e55eae7c-50c5-46c9-a6f8-628bf757d4da ro console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0 rd.blacklist=nouveau nvme_core.io_timeout=4294967295 crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M kmemleak=on
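To make the breakdown easier to read, the `systemd-analyze` line above can be converted to seconds per phase. The following is only a sketch: the helper name `to_seconds` and the regular expressions are illustrative and handle just the `Nmin N.NNNs` form seen in this report. On these numbers the initrd phase accounts for about 81% of the total boot time.

```python
import re

# Line copied from the systemd-analyze output in this report.
LINE = ("Startup finished in 51.020s (kernel) + 17min 5.100s (initrd) "
        "+ 3min 11.506s (userspace) = 21min 7.627s")


def to_seconds(span: str) -> float:
    """Convert a span such as '17min 5.100s' to seconds (hypothetical helper)."""
    minutes = re.search(r"(\d+)min", span)
    seconds = re.search(r"([\d.]+)s", span)
    total = 0.0
    if minutes:
        total += int(minutes.group(1)) * 60
    if seconds:
        total += float(seconds.group(1))
    return total


# Extract each "<time> (<phase>)" pair; the trailing total has no phase label.
pairs = re.findall(r"((?:\d+min )?[\d.]+s) \((\w+)\)", LINE)
phases = {name: to_seconds(span) for span, name in pairs}
total = sum(phases.values())

for name, secs in phases.items():
    print(f"{name:10s} {secs:9.3f} s  {100 * secs / total:5.1f} %")
print(f"{'total':10s} {total:9.3f} s")
```

The summed total differs from the reported `21min 7.627s` by about a millisecond, which is consistent with rounding of the individual phases.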
      //Console log
      [  OK  ] Reached target initrd-switch-root.target - Switch Root.

               Starting initrd-switch-root.service - Switch Root...
      
      [  183.652217] rcu_tasks_wait_gp: rcu_tasks grace period number 21 (since boot) is 10132 jiffies old.
      [  213.764240] rcu_tasks_wait_gp: rcu_tasks grace period number 21 (since boot) is 40244 jiffies old.
      [  304.204207] rcu_tasks_wait_gp: rcu_tasks grace period number 21 (since boot) is 130684 jiffies old.
      [  574.988036] rcu_tasks_wait_gp: rcu_tasks grace period number 21 (since boot) is 401468 jiffies old.
      [  774.411927] INFO: rcu_tasks detected stalls on tasks:
      [  774.413601] 000000005126bdd6: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
      [  774.415141] task:kmemleak        state:R  running task     stack:29376 pid:599   tgid:599   ppid:2      flags:0x00004000
      [  774.418111] Call Trace:
      [  774.419558]  <TASK>
      [  774.421019]  ? srso_return_thunk+0x5/0x5f
      [  774.422501]  ? lock_acquire.part.0+0x11b/0x360
      [  774.424014]  ? srso_return_thunk+0x5/0x5f
      [  774.425521]  ? find_held_lock+0x34/0x120
      [  774.427025]  ? scan_block+0x28/0xc0
      [  774.428507]  ? srso_return_thunk+0x5/0x5f
      [  774.429932]  ? srso_return_thunk+0x5/0x5f
      [  774.431322]  ? __lock_acquired+0x22d/0x850
      [  774.432705]  ? rcu_is_watching+0x15/0xb0
      [  774.434132]  ? __pfx___lock_acquired+0x10/0x10
      [  774.435539]  ? do_raw_spin_trylock+0xb4/0x180
      [  774.436961]  ? srso_return_thunk+0x5/0x5f
      [  774.438501]  ? rcu_is_watching+0x15/0xb0
      [  774.439905]  ? scan_block+0x28/0xc0
      [  774.441245]  ? srso_return_thunk+0x5/0x5f
      [  774.442565]  ? srso_return_thunk+0x5/0x5f
      [  774.443838]  ? srso_return_thunk+0x5/0x5f
      [  774.445069]  ? scan_should_stop.part.0+0x36/0x50
      [  774.446281]  ? scan_block+0x48/0xc0
      [  774.447451]  ? scan_object+0x16d/0x180
      [  774.448569]  ? scan_gray_list+0xb3/0xf0
      [  774.449626]  ? __pfx_kmemleak_scan_thread+0x10/0x10
      [  774.450654]  ? kmemleak_scan+0x357/0xa60
      [  774.451625]  ? __pfx_kmemleak_scan_thread+0x10/0x10
      [  774.452551]  ? kmemleak_scan_thread+0x95/0xc0
      [  774.453416]  ? kthread+0x2d5/0x3a0
      [  774.454247]  ? _raw_spin_unlock_irq+0x28/0x50
      [  774.455098]  ? __pfx_kthread+0x10/0x10
      [  774.455913]  ? ret_from_fork+0x34/0x70
      [  774.456602]  ? __pfx_kthread+0x10/0x10
      [  774.457374]  ? ret_from_fork_asm+0x1a/0x30
      [  774.458224]  </TASK>
      [ 1078.134081] systemd-journald[797]: Received SIGTERM from PID 1 (systemd).
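The `rcu_tasks_wait_gp` messages in the console log carry enough information to estimate when grace period 21 began and how long it had been pending when the stall was reported. The sketch below infers the tick rate from the log itself rather than assuming it; the variable names are illustrative. On these lines the inferred rate is about 1000 jiffies per second, the grace period starts at roughly 173.5 s after boot, and the stall report comes about 600 s later, while the `kmemleak` thread is still scanning.

```python
import re

# Lines copied from the console log in this report.
LOG = """\
[  183.652217] rcu_tasks_wait_gp: rcu_tasks grace period number 21 (since boot) is 10132 jiffies old.
[  213.764240] rcu_tasks_wait_gp: rcu_tasks grace period number 21 (since boot) is 40244 jiffies old.
[  304.204207] rcu_tasks_wait_gp: rcu_tasks grace period number 21 (since boot) is 130684 jiffies old.
[  574.988036] rcu_tasks_wait_gp: rcu_tasks grace period number 21 (since boot) is 401468 jiffies old.
[  774.411927] INFO: rcu_tasks detected stalls on tasks:
"""

AGE = re.compile(r"\[\s*([\d.]+)\] rcu_tasks_wait_gp: .* is (\d+) jiffies old")
STALL = re.compile(r"\[\s*([\d.]+)\] INFO: rcu_tasks detected stalls")

# (timestamp in seconds, grace-period age in jiffies) for each age message.
samples = [(float(t), int(j)) for t, j in AGE.findall(LOG)]

# Infer jiffies per second from the first and last samples (not assumed).
(t0, j0), (t1, j1) = samples[0], samples[-1]
hz = (j1 - j0) / (t1 - t0)

# Each sample implies a start time; average them to smooth timestamp jitter.
gp_start = sum(t - j / hz for t, j in samples) / len(samples)

stall_time = float(STALL.search(LOG).group(1))
stall_age = stall_time - gp_start

print(f"inferred HZ        : {hz:.1f}")
print(f"grace period start : {gp_start:.2f} s after boot")
print(f"age at stall report: {stall_age:.1f} s")
```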
      

      Please provide the package NVR for which bug is seen:

      6.11.0-26.el10.x86_64+debug

      How reproducible:

      Steps to reproduce

      Launch an EC2 r5ad.16xlarge instance with kernel-debug installed.
      Set kernel-debug as the default boot entry and add "kmemleak=on" to the kernel command line.

      Expected results

      The system boots normally and kmemleak data can be collected.

      Actual results

      The system takes over 20 minutes to boot, and the console shows an rcu_tasks stall call trace for the kmemleak task.

              virt-maint virt-maint
              xiliang@redhat.com Frank Liang