-
Bug
-
Resolution: Duplicate
Quality / Stability / Reliability
-
0.42
Description of problem:
[vcpu hotplug] The hotplugged CPU is offline in the guest after the migration completes successfully
Version-Release number of selected component (if applicable):
CNV: registry-proxy.engineering.redhat.com/rh-osbs/iib:977162 OCP 4.19.0-ec.4
How reproducible:
100%
Steps to Reproduce:
1. Create a VM with the following YAML config:
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  annotations:
    kubemacpool.io/transaction-timestamp: "2025-05-30T15:05:57.204944275Z"
    kubevirt.io/latest-observed-api-version: v1
    kubevirt.io/storage-observed-api-version: v1
    vm.kubevirt.io/validations: |
      [
        {
          "name": "minimal-required-memory",
          "path": "jsonpath::.spec.domain.memory.guest",
          "rule": "integer",
          "message": "This VM requires more memory.",
          "min": 1610612736
        }
      ]
  creationTimestamp: "2025-05-30T09:44:26Z"
  finalizers:
  - kubevirt.io/virtualMachineControllerFinalize
  generation: 9
  labels:
    app: rhel97
    kubevirt.io/dynamic-credentials-support: "true"
    vm.kubevirt.io/template: rhel9-server-small
    vm.kubevirt.io/template.namespace: openshift
    vm.kubevirt.io/template.revision: "1"
    vm.kubevirt.io/template.version: v0.34.0
  name: rhel97
  namespace: default
  resourceVersion: "53900460"
  uid: 82b148ed-198b-4436-acfd-214bc9067822
spec:
  dataVolumeTemplates:
  - apiVersion: cdi.kubevirt.io/v1beta1
    kind: DataVolume
    metadata:
      creationTimestamp: null
      name: rhel97
    spec:
      source:
        http:
          url: http://<internal_server>/libvirt-CI-resources/RHEL-9.7-x86_64-latest-ovmf.qcow2
      storage:
        accessModes:
        - ReadWriteMany
        resources:
          requests:
            storage: 16Gi
        storageClassName: ocs-storagecluster-cephfs
        volumeMode: Filesystem
  runStrategy: RerunOnFailure
  template:
    metadata:
      annotations:
        vm.kubevirt.io/flavor: small
        vm.kubevirt.io/os: rhel9
        vm.kubevirt.io/workload: server
      creationTimestamp: null
      labels:
        kubevirt.io/domain: rhel97
        kubevirt.io/size: small
        network.kubevirt.io/headlessService: headless
    spec:
      architecture: amd64
      domain:
        cpu:
          cores: 1
          sockets: 1
          threads: 1
        devices:
          disks:
          - disk:
              bus: virtio
            name: rootdisk
          - disk:
              bus: virtio
            name: cloudinitdisk
          interfaces:
          - macAddress: 02:0d:38:00:00:00
            masquerade: {}
            model: virtio
            name: default
          networkInterfaceMultiqueue: true
          rng: {}
        features:
          acpi: {}
          smm:
            enabled: true
        firmware:
          bootloader:
            efi: {}
        machine:
          type: pc-q35-rhel9.6.0
        memory:
          guest: 2Gi
        resources: {}
      networks:
      - name: default
        pod: {}
      terminationGracePeriodSeconds: 180
      volumes:
      - dataVolume:
          name: rhel97
        name: rootdisk
      - cloudInitNoCloud:
          userData: |-
            #cloud-config
            user: cloud-user
            password: jaom-1g5t-ohlm
            chpasswd: { expire: False }
        name: cloudinitdisk
2. Change the number of sockets (spec.template.spec.domain.cpu.sockets) from 1 to 2 to trigger vCPU hotplug.
3. Log in to the VM after the migration completes successfully and check the CPU info.
The new CPU is offline:
[root@rhel97 ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 57 bits virtual
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0
Off-line CPU(s) list: 1
Vendor ID: GenuineIntel
BIOS Vendor ID: Red Hat
Model name: Intel Xeon Processor (Icelake)
BIOS Model name: RHEL-9.6.0 PC (Q35 + ICH9, 2009)
CPU family: 6
Model: 134
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
Stepping: 0
BogoMIPS: 4190.15
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge m
ca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss sys
call nx pdpe1gb rdtscp lm constant_tsc pebs bts rep_go
od nopl xtopology cpuid tsc_known_freq pni pclmulqdq d
tes64 vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic mov
be popcnt tsc_deadline_timer aes xsave avx f16c rdrand
hypervisor lahf_lm abm 3dnowprefetch cpuid_fault ssbd
ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority
ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bm
i2 erms invpcid avx512f avx512dq rdseed adx smap avx51
2ifma clflushopt clwb avx512cd sha_ni avx512bw avx512v
l xsaveopt xsavec xgetbv1 xsaves wbnoinvd arat vnmi av
x512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmul
qdq avx512_vnni avx512_bitalg avx512_vpopcntdq la57 rd
pid fsrm md_clear flush_l1d arch_capabilities
Virtualization features:
Virtualization: VT-x
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 32 KiB (1 instance)
L1i: 32 KiB (1 instance)
L2: 4 MiB (1 instance)
L3: 16 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Mitigation; Clear CPU buffers; SMT Host state unknown
Reg file data sampling: Not affected
Retbleed: Not affected
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prct
l
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointe
r sanitization
Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditiona
l; RSB filling; PBRSB-eIBRS Not affected; BHI SW loop,
KVM SW loop
Srbds: Not affected
Tsx async abort: Mitigation; TSX disabled
4. Check the dmesg log. CPU1 is hot-added, but enabling virtualization on it fails and the CPU goes offline:
[ 70.601402] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
[ 70.601462] clocksource: 'kvm-clock' wd_nsec: 543445712 wd_now: 117af3fcc2 wd_last: 115a8fa9f2 mask: ffffffffffffffff
[ 70.601467] clocksource: 'tsc' cs_nsec: 479651937 cs_now: 24b63d5eb6 cs_last: 247a57b034 mask: ffffffffffffffff
[ 70.601470] clocksource: 'kvm-clock' (not 'tsc') is current clocksource.
[ 70.601473] tsc: Marking TSC unstable due to clocksource watchdog
[ 70.718258] ACPI: CPU1 has been hot-added
[ 70.728190] SMP alternatives: switching to SMP code
[ 70.733376] smpboot: Booting Node 0 Processor 1 APIC 0x1
[ 70.734082] kvm_intel: Inconsistent VMCS config on CPU 1
[ 70.734130] kvm: enabling virtualization on CPU1 failed
[ 70.735933] smpboot: CPU 1 is now offline
5. Soft reboot the VM; both CPUs come online. Then trigger vCPU hotplug again by changing the sockets to 3.
6. Check the CPU info after the migration. This time the new CPU is hotplugged successfully:
[ 57.549015] ACPI: CPU2 has been hot-added
[ 57.565879] smpboot: Booting Node 0 Processor 2 APIC 0x2
[ 57.566014] TSC ADJUST compensate: CPU2 observed 130448479191 warp. Adjust: 130448479191
[ 57.566014] TSC ADJUST compensate: CPU2 observed 12 warp. Adjust: 130448479203
[ 57.567354] TSC synchronization [CPU#0 -> CPU#2]:
[ 57.567354] Measured 4 cycles TSC warp between CPUs, turning off TSC clock.
[ 57.626457] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
[ 57.626469] clocksource: 'kvm-clock' wd_nsec: 519620295 wd_now: 3ec4536741 wd_last: 3ea55aa07a mask: ffffffffffffffff
[ 57.626473] clocksource: 'tsc' cs_nsec: 467819394 cs_now: 1e6b2800fb cs_last: 1e30bcac51 mask: ffffffffffffffff
[ 57.626480] clocksource: 'kvm-clock' (not 'tsc') is current clocksource.
[ 57.626483] tsc: Marking TSC unstable due to clocksource watchdog
[ 57.627411] Will online and init hotplugged CPU: 2
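Steps 2 and 5 both change the socket count of the running VM. The report does not say how the change was made (web console, `virtctl`, or an edit of the VM object), so the following is only a minimal sketch of one way to do it: it builds a JSON Patch against the `sockets` field shown in the YAML above and prints a `kubectl patch` command line. The helper names `build_sockets_patch` and `patch_command` are hypothetical, and the VM name and namespace are taken from the YAML in step 1.

```python
import json
import shlex


def build_sockets_patch(sockets: int) -> list:
    """Return a JSON Patch that sets the VM's CPU socket count."""
    if sockets < 1:
        raise ValueError("sockets must be >= 1")
    return [{
        "op": "replace",
        # Field path taken from the VirtualMachine YAML in step 1.
        "path": "/spec/template/spec/domain/cpu/sockets",
        "value": sockets,
    }]


def patch_command(vm: str, namespace: str, sockets: int) -> str:
    """Return a shell-quoted kubectl command that applies the patch."""
    patch = json.dumps(build_sockets_patch(sockets))
    args = ["kubectl", "patch", "vm", vm, "-n", namespace,
            "--type=json", "-p", patch]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # Step 2 of the reproducer: 1 -> 2 sockets on VM "rhel97" in "default".
    print(patch_command("rhel97", "default", 2))
```

The sketch only prints the command; running it against a cluster is left to the reader, and on OpenShift `oc` can be substituted for `kubectl`.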
Actual results:
vCPU hotplug on a newly created VM leaves the new CPU offline in the guest. A soft reboot of the VM brings the CPU online, and subsequent vCPU hotplugs succeed after the reboot.
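To tell the failing case from the working one without reading the whole log, the dmesg output can be scanned for the messages quoted in steps 4 and 6. The following is a rough sketch, not a tool used in this report: the function name `failed_hotplugs` is hypothetical, and the patterns match only the exact message wording that appears in the logs above.

```python
import re

# Patterns copied from the dmesg lines quoted in this report.
HOT_ADDED = re.compile(r"ACPI: CPU(\d+) has been hot-added")
NOW_OFFLINE = re.compile(r"smpboot: CPU (\d+) is now offline")
VIRT_FAILED = re.compile(r"kvm: enabling virtualization on CPU(\d+) failed")


def failed_hotplugs(dmesg_text: str) -> dict:
    """Map each hot-added CPU that ended up offline to a short reason."""
    added, offline, virt_failed = set(), set(), set()
    buckets = ((HOT_ADDED, added), (NOW_OFFLINE, offline),
               (VIRT_FAILED, virt_failed))
    for line in dmesg_text.splitlines():
        for pattern, bucket in buckets:
            match = pattern.search(line)
            if match:
                bucket.add(int(match.group(1)))
    return {
        cpu: ("kvm virtualization enable failed" if cpu in virt_failed
              else "offline after hot-add")
        for cpu in sorted(added & offline)
    }


# Excerpt from step 4 (first hotplug, before the soft reboot).
SAMPLE_FAIL = """\
[ 70.718258] ACPI: CPU1 has been hot-added
[ 70.733376] smpboot: Booting Node 0 Processor 1 APIC 0x1
[ 70.734082] kvm_intel: Inconsistent VMCS config on CPU 1
[ 70.734130] kvm: enabling virtualization on CPU1 failed
[ 70.735933] smpboot: CPU 1 is now offline
"""

if __name__ == "__main__":
    print(failed_hotplugs(SAMPLE_FAIL))
```

For the step 4 excerpt this reports CPU 1 with the virtualization-enable failure; for the step 6 log, which has no "is now offline" line, it returns an empty mapping.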
Expected results:
vCPU hotplug on a newly created VM should bring the new CPU online in the guest.
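The expected result can be checked inside the guest without reading `lscpu` by eye. As a minimal sketch, assuming a Linux guest that exposes the standard sysfs file `/sys/devices/system/cpu/offline`, the helper below parses the kernel's CPU list format (for example `1` or `2-3,5`) and returns the offline CPU numbers; an empty result means every present CPU is online. The function names are hypothetical.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list:
    """Parse a kernel CPU list such as '0-2,5' into [0, 1, 2, 5]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # the file is empty when no CPU is offline
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def offline_cpus(path: str = "/sys/devices/system/cpu/offline") -> list:
    """Return the CPU numbers the guest kernel currently reports as offline."""
    return parse_cpu_list(Path(path).read_text())


if __name__ == "__main__":
    # Run inside the guest after the hotplug; [] is the expected result.
    print(offline_cpus())
```

In the failing case from step 3, where `lscpu` shows "Off-line CPU(s) list: 1", the list would contain CPU 1.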
Additional info:
- clones CNV-62851 [Tracker Bug] [vcpu hotplug] the cpu is offline after migration successfully (New)