  1. RHEL
  2. RHEL-7131

KVM/nested traceback on AMD CPU when deploying CRC

    • Bug
    • Resolution: Unresolved
    • Normal
    • None
    • rhel-9.3.0
    • qemu-kvm
    • Normal
    • sst_virtualization_hwe
    • ssg_virtualization
    • False

      Description of problem:
      I'm trying to deploy CRC (Red Hat CodeReady Containers) on CentOS 9 Stream, and the deployment fails every time, but ONLY on an AMD CPU.
      With the same image on another hypervisor based on an Intel CPU, everything works as expected.

      On the same AMD hypervisor, but with CentOS 8 Stream, everything works fine without any traceback (checked 2 days ago).

      Version-Release number of selected component (if applicable):

      • Operating system: CentOS 9 Stream
      • Architecture: x86_64
      • kernel version: 5.14.0-206.el9.x86_64
      • libvirt version: 8.9.0-2.el9.x86_64
      • Hypervisor and version: the hypervisor host runs Ubuntu 20.04; libvirt version unknown
      • CPU: AMD EPYC 7402
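
Because this setup is nested (the CentOS 9 Stream machine is itself a guest that runs KVM for the CRC VM), it may help to record whether the `svm` CPU flag is visible and whether nesting is enabled for `kvm_amd`. The following is only a sketch: it assumes the standard Linux paths `/proc/cpuinfo` and `/sys/module/kvm_amd/parameters/nested`, and the helper names are hypothetical, not part of any tool mentioned in this report.

```python
from pathlib import Path

# Assumed standard Linux locations; adjust if the system differs.
NESTED_PARAM = Path("/sys/module/kvm_amd/parameters/nested")
CPUINFO = Path("/proc/cpuinfo")


def parse_module_bool(text):
    """Interpret a boolean module parameter, which kernels print as 1/0 or Y/N."""
    value = text.strip()
    if value in ("1", "Y", "y"):
        return True
    if value in ("0", "N", "n"):
        return False
    raise ValueError(f"unexpected module parameter value: {value!r}")


def has_cpu_flag(cpuinfo_text, flag):
    """Return True if `flag` appears in the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return flag in value.split()
    return False


def report():
    """Collect the two values, using None where a file is not present."""
    nested = parse_module_bool(NESTED_PARAM.read_text()) if NESTED_PARAM.exists() else None
    svm = has_cpu_flag(CPUINFO.read_text(), "svm") if CPUINFO.exists() else None
    return {"kvm_amd_nested": nested, "svm_flag": svm}
```

Calling `report()` on both the Ubuntu host and the CentOS 9 Stream guest would give two dictionaries that can be attached alongside the version list above.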

      How reproducible:
      Every time

      Steps to Reproduce:
      1. Update the system.
      2. Follow the CRC setup instructions: https://crc.dev/crc/#installation_gsg
      3. The traceback is raised multiple times during the `crc start` command.

      Actual results:
      System traceback

      [ 1020.891646] RIP: 0033:0x7f1dcf43ec6b
      [ 1020.891826] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48
      [ 1020.892749] RSP: 002b:00007f1b77ffd4a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [ 1020.893139] RAX: ffffffffffffffda RBX: 00007f1dc4cb1e50 RCX: 00007f1dcf43ec6b
      [ 1020.893500] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000019
      [ 1020.893848] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000000000ff
      [ 1020.894196] R10: 00007f1b6c052d80 R11: 0000000000000246 R12: 000055ff6c2babf0
      [ 1020.894569] R13: 00007f1dc4cb1ff0 R14: da3d1b18a8100500 R15: 00007f1dc4cb1e48
      [ 1020.894965] </TASK>
      [ 1020.895806] Call Trace:
      [ 1020.895948] <TASK>
      [ 1020.896060] amd_pmu_enable_all+0x44/0x60
      [ 1020.896447] __perf_install_in_context+0x16c/0x220
      [ 1020.896717] remote_function+0x47/0x50
      [ 1020.896921] generic_exec_single+0x78/0xb0
      [ 1020.897129] smp_call_function_single+0xeb/0x130
      [ 1020.897367] ? sw_perf_event_destroy+0x60/0x60
      [ 1020.897592] ? perf_lock_task_context+0xa3/0x100
      [ 1020.897824] perf_install_in_context+0xcf/0x200
      [ 1020.898064] ? ctx_resched+0xe0/0xe0
      [ 1020.898262] perf_event_create_kernel_counter+0x114/0x180
      [ 1020.898538] pmc_reprogram_counter.constprop.0+0xec/0x220 [kvm]
      [ 1020.898892] amd_pmu_set_msr+0x106/0x170 [kvm_amd]
      [ 1020.899145] ? __svm_vcpu_run+0x67/0x110 [kvm_amd]
      [ 1020.899393] ? get_gp_pmc_amd+0x129/0x200 [kvm_amd]
      [ 1020.899635] __kvm_set_msr+0x7f/0x1c0 [kvm]
      [ 1020.899889] kvm_emulate_wrmsr+0x52/0x1b0 [kvm]
      [ 1020.900153] vcpu_enter_guest+0x667/0x1010 [kvm]
      [ 1020.900428] ? __rseq_handle_notify_resume+0x32/0x50
      [ 1020.900428] ? __rseq_handle_notify_resume+0x32/0x50
      [ 1020.900673] vcpu_run+0x33/0x250 [kvm]
      [ 1020.900892] kvm_arch_vcpu_ioctl_run+0x104/0x620 [kvm]
      [ 1020.901173] kvm_vcpu_ioctl+0x271/0x670 [kvm]
      [ 1020.901422] __x64_sys_ioctl+0x8a/0xc0
      [ 1020.901612] do_syscall_64+0x5c/0x90
      [ 1020.901793] ? syscall_exit_work+0x11a/0x150
      [ 1020.902014] ? syscall_exit_to_user_mode+0x12/0x30
      [ 1020.902255] ? do_syscall_64+0x69/0x90
      [ 1020.902465] ? syscall_exit_work+0x11a/0x150
      [ 1020.902694] ? syscall_exit_to_user_mode+0x12/0x30
      [ 1020.902961] ? do_syscall_64+0x69/0x90
      [ 1020.903155] ? syscall_exit_to_user_mode+0x12/0x30
      [ 1020.903393] ? do_syscall_64+0x69/0x90
      [ 1020.903582] ? do_syscall_64+0x69/0x90
      [ 1020.903777] ? sysvec_apic_timer_interrupt+0x3c/0x90
      [ 1020.904052] entry_SYSCALL_64_after_hwframe+0x63/0xcd
      [ 1020.904325] RIP: 0033:0x7f1dcf43ec6b
      [ 1020.904514] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48
      [ 1020.905445] RSP: 002b:00007f1b77ffd4a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [ 1020.905841] RAX: ffffffffffffffda RBX: 00007f1dc4cb1e50 RCX: 00007f1dcf43ec6b
      [ 1020.906212] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000019
      [ 1020.906564] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000000000ff
      [ 1020.906924] R10: 00007f1b6c052d80 R11: 0000000000000246 R12: 000055ff6c2babf0
      [ 1020.907258] R13: 00007f1dc4cb1ff0 R14: da3d1b18a8100500 R15: 00007f1dc4cb1e48
      [ 1020.907606] </TASK>
      [ 1020.907766] Call Trace:
      [ 1020.907900] <TASK>
      [ 1020.908007] x86_pmu_stop+0x50/0xb0
      [ 1020.908183] x86_pmu_del+0x73/0x190
      [ 1020.908363] event_sched_out.part.0+0x7a/0x1f0
      [ 1020.908591] group_sched_out.part.0+0x93/0xf0
      [ 1020.908804] ctx_sched_out+0x124/0x2a0
      [ 1020.908993] perf_event_context_sched_out+0x1a5/0x460
      [ 1020.909249] __perf_event_task_sched_out+0x50/0x170
      [ 1020.909497] ? pick_next_task+0x51/0x940
      [ 1020.909698] prepare_task_switch+0xbd/0x2a0
      [ 1020.909915] __schedule+0x1cb/0x620
      [ 1020.910093] schedule+0x5a/0xc0
      [ 1020.910247] xfer_to_guest_mode_handle_work+0xac/0xe0
      [ 1020.910487] vcpu_run+0x1f5/0x250 [kvm]
      [ 1020.910703] kvm_arch_vcpu_ioctl_run+0x104/0x620 [kvm]
      [ 1020.910991] kvm_vcpu_ioctl+0x271/0x670 [kvm]
      [ 1020.911230] __x64_sys_ioctl+0x8a/0xc0
      [ 1020.911413] do_syscall_64+0x5c/0x90
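
To make traces like the ones above easier to compare between hosts, the stack frames can be pulled out of the dmesg text and grouped by kernel module. This is a minimal sketch, not an official tool: the function names are hypothetical, and the regular expression assumes the `[ timestamp] symbol+0xoff/0xsize [module]` layout seen in this report. Frames prefixed with `?` are the ones the kernel unwinder marks as unreliable, so they are flagged separately.

```python
import re

# Matches lines such as
#   "[ 1020.898538] pmc_reprogram_counter.constprop.0+0xec/0x220 [kvm]"
# A leading "? " marks a frame the unwinder could not verify.
FRAME_RE = re.compile(
    r"^\s*\[\s*\d+\.\d+\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def parse_frames(dmesg_text):
    """Return one dict per stack frame; register dumps and markers are skipped."""
    frames = []
    for line in dmesg_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append({
                "symbol": match.group("symbol"),
                "module": match.group("module"),  # None for built-in code
                "reliable": match.group("unreliable") is None,
            })
    return frames


def reliable_symbols(frames, module=None):
    """List reliable frame symbols, optionally restricted to one module."""
    return [
        frame["symbol"]
        for frame in frames
        if frame["reliable"] and (module is None or frame["module"] == module)
    ]
```

For example, `reliable_symbols(parse_frames(text), "kvm_amd")` on the first trace would list only the verified `kvm_amd` frames, which narrows attention to the guest PMU MSR path shown there.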

      Expected results:
      The instance works normally, without tracebacks.

      Additional info:
      Libvirt instance XML:
      <domain type='kvm' id='1'>
      <name>crc</name>
      <uuid>deb99e6c-6949-44ab-8776-79940a88a23c</uuid>
      <memory unit='KiB'>9437184</memory>
      <currentMemory unit='KiB'>9437184</currentMemory>
      <memoryBacking>
      <source type='memfd'/>
      <access mode='shared'/>
      </memoryBacking>
      <vcpu placement='static'>4</vcpu>
      <resource>
      <partition>/machine</partition>
      </resource>
      <os>
      <type arch='x86_64' machine='pc-q35-rhel9.0.0'>hvm</type>
      <boot dev='hd'/>
      <bootmenu enable='no'/>
      </os>
      <features>
      <acpi/>
      <apic/>
      <pae/>
      </features>
      <cpu mode='host-passthrough' check='none' migratable='on'>
      <feature policy='disable' name='rdrand'/>
      </cpu>
      <clock offset='utc'/>
      <on_poweroff>destroy</on_poweroff>
      <on_reboot>restart</on_reboot>
      <on_crash>destroy</on_crash>
      <devices>
      <emulator>/usr/libexec/qemu-kvm</emulator>
      <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/home/cloud-user/.crc/machines/crc/crc.qcow2' index='1'/>
      <backingStore type='file' index='2'>
      <format type='qcow2'/>
      <source file='/home/cloud-user/.crc/cache/crc_libvirt_4.11.13_amd64/crc.qcow2'/>
      <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
      </disk>
      <controller type='usb' index='0' model='qemu-xhci'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
      </controller>
      <controller type='sata' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
      </controller>
      <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
      </controller>
      <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <alias name='pci.1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
      </controller>
      <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x11'/>
      <alias name='pci.2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
      </controller>
      <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x12'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
      </controller>
      <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x13'/>
      <alias name='pci.4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
      </controller>
      <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x14'/>
      <alias name='pci.5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
      </controller>
      <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0x15'/>
      <alias name='pci.6'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
      </controller>
      <filesystem type='mount' accessmode='passthrough'>
      <driver type='virtiofs'/>
      <binary path='/usr/libexec/virtiofsd'/>
      <source dir='/home/cloud-user'/>
      <target dir='dir0'/>
      <alias name='fs0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </filesystem>
      <interface type='network'>
      <mac address='52:fd:fc:07:21:82'/>
      <source network='crc' portid='a6e692ab-3e8f-40fe-8f23-7ef86295de2b' bridge='crc'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </interface>
      <serial type='stdio'>
      <target type='isa-serial' port='0'>
      <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
      </serial>
      <console type='stdio'>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
      </console>
      <input type='mouse' bus='ps2'>
      <alias name='input0'/>
      </input>
      <input type='keyboard' bus='ps2'>
      <alias name='input1'/>
      </input>
      <graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
      </graphics>
      <audio id='1' type='none'/>
      <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
      </video>
      <memballoon model='none'/>
      <rng model='virtio'>
      <backend model='random'>/dev/urandom</backend>
      <alias name='rng0'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
      </rng>
      </devices>
      <seclabel type='dynamic' model='selinux' relabel='yes'>
      <label>system_u:system_r:svirt_t:s0:c284,c514</label>
      <imagelabel>system_u:object_r:svirt_image_t:s0:c284,c514</imagelabel>
      </seclabel>
      <seclabel type='dynamic' model='dac' relabel='yes'>
      <label>+107:+107</label>
      <imagelabel>+107:+107</imagelabel>
      </seclabel>
      </domain>
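
When comparing this domain definition with the one from a working Intel or CentOS 8 Stream setup, only a few fields are likely to matter at first glance, such as the machine type and the CPU mode. The sketch below extracts them with Python's standard `xml.etree.ElementTree`; the function name and the choice of fields are assumptions made for illustration. With `mode='host-passthrough'` the guest sees the host CPU's features, which is why the CPU settings are included.

```python
import xml.etree.ElementTree as ET


def summarize_domain(xml_text):
    """Extract a few comparison-relevant fields from libvirt domain XML."""
    root = ET.fromstring(xml_text)
    cpu = root.find("cpu")
    os_type = root.find("os/type")
    features = root.find("features")

    disabled = []
    if cpu is not None:
        # CPU features explicitly turned off, e.g. rdrand in this report.
        disabled = [
            feature.get("name")
            for feature in cpu.findall("feature")
            if feature.get("policy") == "disable"
        ]

    return {
        "name": root.findtext("name"),
        "machine": os_type.get("machine") if os_type is not None else None,
        "cpu_mode": cpu.get("mode") if cpu is not None else None,
        "disabled_cpu_features": disabled,
        "features": [child.tag for child in features] if features is not None else [],
    }
```

The input could be the output of `virsh dumpxml crc` saved to a file on each host, so the two summaries can be compared side by side.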

      The full dmesg log is in the attachment.

            bdas@redhat.com Bandan Das
            rhn-engineering-dpawlik Daniel Pawlik
            Bandan Das Bandan Das
            Yanbin Duan Yanbin Duan