• Bug
    • Resolution: Duplicate
    • Undefined
    • None
    • rhel-10.0
    • crash
    • None
    • None
    • rhel-sst-kernel-debug
    • ssg_core_kernel
    • None
    • False
    • None
    • None
    • None
    • None
    • x86_64
    • None

      What were you trying to do that didn't work?

      On multiple storage QE systems, the anaylze-crash-cmd task fails after a panic is triggered.

       

      [  15:05:59  ] :: [  LOG  ] :: WARNING MESSAGES BEGIN
      WARNING: possibly bogus exception frame
      WARNING: possibly bogus exception frame
      WARNING: possibly bogus exception frame
      [  15:05:59  ] :: [  LOG  ] :: WARNING MESSAGES END
      [  15:05:59  ] :: [  WARN ] :: Crash commands reported warnings.
      [  15:05:59  ] :: [  LOG  ] :: ERROR MESSAGES BEGIN
          [exception RIP: unknown or invalid address]
          [exception RIP: unknown or invalid address]
          [exception RIP: unknown or invalid address]
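      The two symptoms in the log excerpt above can be counted mechanically when comparing runs. The following is a minimal sketch, not part of the failing task: the helper name `count_crash_symptoms` and the pattern labels are hypothetical, and the only strings it relies on are the two messages quoted in the log.

```python
import re
from collections import Counter

# Patterns copied from the log excerpt; the dictionary keys are
# hypothetical labels chosen for this sketch.
PATTERNS = {
    "bogus_frame": re.compile(r"WARNING: possibly bogus exception frame"),
    "invalid_rip": re.compile(r"\[exception RIP: unknown or invalid address\]"),
}


def count_crash_symptoms(text: str) -> Counter:
    """Count how many lines of crash output match each known symptom."""
    counts = Counter()
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Sample input mirrors the warning and error lines reported in this issue.
sample = """\
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
WARNING: possibly bogus exception frame
    [exception RIP: unknown or invalid address]
    [exception RIP: unknown or invalid address]
    [exception RIP: unknown or invalid address]
"""

counts = count_crash_symptoms(sample)
print(dict(counts))
```

      An empty result from a run against the same vmcore would suggest that neither symptom is present in that output.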

      Below is the job where the problem was found:

       

      https://beaker.engineering.redhat.com/jobs/9396488

      I will provide the core file shortly.

      Please provide the package NVR for which bug is seen:

      RHEL-10.0-20240603.86

      kernel-6.9.0-7.el10

      kexec-tools-2.0.28-9.el10

      How reproducible: Often

      Steps to reproduce

      1. Trigger a panic to get a vmcore file.
      2. Run the anaylze-crash-cmd task.
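      Step 2 can be approximated outside the test harness by running crash non-interactively against the vmcore. The sketch below is a hypothetical stand-in, since the implementation of the task itself is not shown in this issue. It assumes a `crash` binary on the PATH and uses crash's `-s` (silent) and `-i` (input file) options; the function names, the command file name, and the choice of commands are illustrative only.

```python
import subprocess
import tempfile
from pathlib import Path


def build_crash_argv(vmlinux, vmcore, cmdfile, crash_bin="crash"):
    """Build the argument list: crash [options] <vmlinux> <vmcore>."""
    return [crash_bin, "-s", "-i", str(cmdfile), str(vmlinux), str(vmcore)]


def write_cmdfile(directory, commands=("bt", "foreach bt", "exit")):
    """Write the crash commands to run, one per line, and return the path."""
    path = Path(directory) / "crash-cmds.txt"
    path.write_text("\n".join(commands) + "\n")
    return path


def run_crash(vmlinux, vmcore, crash_bin="crash"):
    """Run crash in batch mode and return its combined stdout and stderr.

    Requires a real vmlinux/vmcore pair; it is not called in this sketch.
    """
    with tempfile.TemporaryDirectory() as tmp:
        cmdfile = write_cmdfile(tmp)
        argv = build_crash_argv(vmlinux, vmcore, cmdfile, crash_bin)
        result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


# Show the command line without executing crash.
print(build_crash_argv("vmlinux", "vmcore", "crash-cmds.txt"))
```

      The returned text can then be searched for "bogus exception frame" or "unknown or invalid address" to decide whether the problem reproduces with a given crash build.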

            [RHEL-40463] RHEL-10: crash - exception RIP: unknown or invalid address

            This is a duplicate issue; see: https://issues.redhat.com/browse/RHEL-36156

            I tested with the latest upstream crash, and the issue is fixed there by this commit:

            48764a14bc58 ("x86_64: fix for adding top_of_kernel_stack_padding for kernel stack")

            $ ./crash /home/lijiang/src/xy/usr/lib/debug/lib/modules/6.9.0-7.el10.x86_64/vmlinux /home/lijiang/src/xy/vmcore

            crash 8.0.5++
            Copyright (C) 2002-2024 Red Hat, Inc.
            Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
            Copyright (C) 1999-2006 Hewlett-Packard Co
            Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
            Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
            Copyright (C) 2005, 2011, 2020-2024 NEC Corporation
            Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
            Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
            Copyright (C) 2015, 2021 VMware, Inc.
            This program is free software, covered by the GNU General Public License,
            and you are welcome to change it and/or distribute copies of it under
            certain conditions. Enter "help copying" to see the conditions.
            This program has absolutely no warranty. Enter "help warranty" for details.

            GNU gdb (GDB) 10.2
            Copyright (C) 2021 Free Software Foundation, Inc.
            License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
            This is free software: you are free to change and redistribute it.
            There is NO WARRANTY, to the extent permitted by law.
            Type "show copying" and "show warranty" for details.
            This GDB was configured as "x86_64-pc-linux-gnu".
            Type "show configuration" for configuration details.
            Find the GDB manual and other documentation resources online at:
            <http://www.gnu.org/software/gdb/documentation/>.

            For help, type "help".
            Type "apropos word" to search for commands related to "word"...

            KERNEL: /home/lijiang/src/xy/usr/lib/debug/lib/modules/6.9.0-7.el10.x86_64/vmlinux
            DUMPFILE: /home/lijiang/src/xy/vmcore [PARTIAL DUMP]
            CPUS: 48
            DATE: Sat Jun 8 02:57:55 CST 2024
            UPTIME: 00:02:26
            LOAD AVERAGE: 1.34, 0.44, 0.16
            TASKS: 1074
            NODENAME: storageqe-38.fast.eng.rdu2.dc.redhat.com
            RELEASE: 6.9.0-7.el10.x86_64
            VERSION: #1 SMP PREEMPT_DYNAMIC Wed May 22 03:34:22 EDT 2024
            MACHINE: x86_64 (2795 Mhz)
            MEMORY: 31.6 GB
            PANIC: "Kernel panic - not syncing: sysrq triggered crash"
            PID: 6155
            COMMAND: "main.sh"
            TASK: ffff9c7614a5cd40 [THREAD_INFO: ffff9c7614a5cd40]
            CPU: 29
            STATE: TASK_RUNNING (PANIC)

            crash> foreach bt|grep bogus
            crash> bt
            PID: 6155 TASK: ffff9c7614a5cd40 CPU: 29 COMMAND: "main.sh"
            #0 [ffffb5fdcd41faa0] machine_kexec at ffffffffa608115f
            #1 [ffffb5fdcd41faf8] __crash_kexec at ffffffffa621c904
            #2 [ffffb5fdcd41fbb8] panic at ffffffffa6d65e4a
            #3 [ffffb5fdcd41fc38] sysrq_handle_crash at ffffffffa680e0ca
            #4 [ffffb5fdcd41fc40] __handle_sysrq.cold at ffffffffa6d997c8
            #5 [ffffb5fdcd41fc70] write_sysrq_trigger at ffffffffa680eca9
            #6 [ffffb5fdcd41fc98] proc_reg_write at ffffffffa653a52c
            #7 [ffffb5fdcd41fcb0] vfs_write at ffffffffa648bf08
            #8 [ffffb5fdcd41fd48] ksys_write at ffffffffa648c44d
            #9 [ffffb5fdcd41fd80] do_syscall_64 at ffffffffa6dc049e
            #10 [ffffb5fdcd41ff40] entry_SYSCALL_64_after_hwframe at ffffffffa6e0012f
            RIP: 00007f9c73478324 RSP: 00007ffd787ab588 RFLAGS: 00000202
            RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f9c73478324
            RDX: 0000000000000002 RSI: 0000563df2b13630 RDI: 0000000000000001
            RBP: 0000563df2b13630 R8: 0000000000000073 R9: 00000000ffffffff
            R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002
            R13: 00007f9c735545c0 R14: 0000000000000002 R15: 00007f9c73551f20
            ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
            crash>

            Lianbo Jiang added the comment above.

            Marco Patalano added a comment - Another example on a different system: https://beaker.engineering.redhat.com/jobs/9396779

            Marco Patalano added a comment - The core file can be found here: https://people.redhat.com/mpatalan/.vmcores/rhel-40463/

              lijiang@redhat.com Lianbo Jiang
              mpatalan Marco Patalano
              Lianbo Jiang Lianbo Jiang
              Core Kernel QE Core Kernel QE