-
Bug
-
Resolution: Unresolved
-
Critical
-
CNV v4.16.1
-
None
-
5
-
False
-
-
False
-
CNV Virtualization Sprint 260, CNV Virtualization Sprint 261, CNV Virtualization Sprint 262
-
Urgent
-
None
Description of problem:
On 4.16, a VM whose total vCPU count exceeds the number of CPUs in a single NUMA node on the host fails to start with this error:
default 0s Warning SyncFailed virtualmachineinstance/vm-big server error. command SyncVMI failed: "LibvirtError(Code=67, Domain=10, Message='unsupported configuration: more than 255 vCPUs require extended interrupt mode enabled on the iommu device')"
Version-Release number of selected component (if applicable):
OCP 4.16.6, CNV 4.16.1
How reproducible:
Reported across multiple internal clusters.
Steps to Reproduce:
1. Start a VM with a high vCPU count
2. Check the event logs
3. VM fails to start
Actual results:
VM fails to start.
Expected results:
VM starts successfully with a high vCPU count.
Additional info:
Using a very simple VM definition, with the vCPU count set higher than the CPU count of one NUMA node (on a 128-CPU host):

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    app: vm-big
  name: vm-big
spec:
  running: true
  template:
    metadata:
      labels:
        kubevirt.io/domain: vm-big
    spec:
      domain:
        cpu:
          cores: 1
          sockets: 100
          threads: 1
        devices:
          disks:
          - disk:
              bus: virtio
            name: containerdisk
          - disk:
              bus: virtio
            name: cloudinitdisk
          interfaces:
          - masquerade: {}
            model: virtio
            name: default
          networkInterfaceMultiqueue: true
          rng: {}
        features:
          smm:
            enabled: true
        firmware:
          bootloader:
            efi: {}
        machine:
          type: pc-q35-rhel9.2.0
        memory:
          guest: 10Gi
      networks:
      - name: default
        pod: {}
      terminationGracePeriodSeconds: 180
      nodeSelector:
        kubernetes.io/hostname: worker00
      volumes:
      - containerDisk:
          image: quay.io/kubevirt/fedora-container-disk-images:35
          imagePullPolicy: IfNotPresent
        name: containerdisk
      - cloudInitNoCloud:
          userData: |-
            #cloud-config
            user: fedora
            password: perf
            chpasswd: { expire: False }
            runcmd:
              - sed -i -e "s/PasswordAuthentication.*/PasswordAuthentication yes/" /etc/ssh/sshd_config
              - systemctl restart sshd
        name: cloudinitdisk
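Note that the definition in this report requests 100 sockets x 1 core x 1 thread, which is below the 255-vCPU limit named in the libvirt error. The sketch below only illustrates that arithmetic. The ratio of 4 between requested and maximum sockets is an assumption made for illustration (for example, headroom reserved for CPU hotplug); it is not stated anywhere in this report and may not match what the cluster actually generates. The helper names are hypothetical.

```python
# Limit taken from the libvirt error message quoted in this report:
# more than 255 vCPUs require extended interrupt mode on the IOMMU device.
LIBVIRT_EIM_VCPU_LIMIT = 255


def total_vcpus(sockets: int, cores: int, threads: int) -> int:
    """Return the vCPU count implied by a sockets/cores/threads topology."""
    return sockets * cores * threads


def needs_extended_interrupt_mode(vcpus: int) -> bool:
    """True when the vCPU count exceeds the 255-vCPU limit from the error."""
    return vcpus > LIBVIRT_EIM_VCPU_LIMIT


# Topology from the vm-big definition in this report.
requested = total_vcpus(sockets=100, cores=1, threads=1)

# ASSUMPTION (not from the report): the generated domain declares a maximum
# socket count that is a multiple of the requested sockets.
ASSUMED_MAX_SOCKET_RATIO = 4
assumed_max = total_vcpus(
    sockets=100 * ASSUMED_MAX_SOCKET_RATIO, cores=1, threads=1
)

print(requested, needs_extended_interrupt_mode(requested))      # 100 False
print(assumed_max, needs_extended_interrupt_mode(assumed_max))  # 400 True
```

If that assumption holds, the maximum vCPU count declared in the domain XML, not the requested count, would be what crosses the limit. Comparing the generated domain XML against the VM spec would confirm or rule this out.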
- is related to: RHEL-65844 Wrong iommu default when more than 255 vcpus are requested (In Progress)
- relates to: RHEL-65836 Wrong iommu default when more than 255 vcpus are requested (Closed)
- links to