-
Bug
-
Resolution: Done-Errata
-
Major
-
None
-
8
-
False
-
-
False
-
MODIFIED
-
---
-
---
-
-
Storage Core Sprint 257
-
High
-
No
Description of problem:
Cluster often has to attempt to provision a VM multiple times before it actually works. It sometimes fails 2-3 times in a row before it works.
It always works after some retries, but quite often also fails on 1st and 2nd attempts.
I've seen this rarely before, now after upgrading the cluster to 4.13-rc7 (CNV still on 4.12.2) it seems to hit more frequently.
I noticed its writing with "-t writeback", and page cache is most of the cgroup memory limit consumed (537M out of 585M).
This does not seem to be a good option for converting images under a cgroup limit.
107 127318 127223 54 01:15 ? 00:00:10 qemu-img convert -S 0 -t writeback -p -O raw nbd+unix:///?socket=/tmp/nbdkit.sock /dev/cdi-block-volume
May 08 01:05:17 yellow.toca.local kernel: qemu-img invoked oom-killer: gfp_mask=0x101c00(GFP_NOIO|_GFP_HARDWALL|_GFP_WRITE), order=0, oom_score_adj=999
May 08 01:05:17 yellow.toca.local kernel: CPU: 2 PID: 115058 Comm: qemu-img Kdump: loaded Not tainted 5.14.0-284.13.1.el9_2.x86_64 #1
May 08 01:05:17 yellow.toca.local kernel: Call Trace:
May 08 01:05:17 yellow.toca.local kernel: <TASK>
May 08 01:05:17 yellow.toca.local kernel: dump_stack_lvl+0x34/0x48
May 08 01:05:17 yellow.toca.local kernel: dump_header+0x4a/0x201
May 08 01:05:17 yellow.toca.local kernel: oom_kill_process.cold+0xb/0x10
May 08 01:05:17 yellow.toca.local kernel: out_of_memory+0xed/0x2e0
May 08 01:05:17 yellow.toca.local kernel: ? __wake_up_common+0x7d/0x190
May 08 01:05:17 yellow.toca.local kernel: mem_cgroup_out_of_memory+0x13a/0x150
May 08 01:05:17 yellow.toca.local kernel: try_charge_memcg+0x6df/0x7a0
May 08 01:05:17 yellow.toca.local kernel: charge_memcg+0x9f/0x130
May 08 01:05:17 yellow.toca.local kernel: __mem_cgroup_charge+0x29/0x80
May 08 01:05:17 yellow.toca.local kernel: __filemap_add_folio+0x224/0x5a0
May 08 01:05:17 yellow.toca.local kernel: ? scan_shadow_nodes+0x30/0x30
May 08 01:05:17 yellow.toca.local kernel: filemap_add_folio+0x37/0xa0
May 08 01:05:17 yellow.toca.local kernel: __filemap_get_folio+0x1e6/0x330
May 08 01:05:17 yellow.toca.local kernel: ? blkdev_write_begin+0x20/0x20
May 08 01:05:17 yellow.toca.local kernel: pagecache_get_page+0x15/0x90
May 08 01:05:17 yellow.toca.local kernel: block_write_begin+0x24/0xf0
May 08 01:05:17 yellow.toca.local kernel: generic_perform_write+0xbe/0x200
May 08 01:05:17 yellow.toca.local kernel: __generic_file_write_iter+0xe5/0x1a0
May 08 01:05:17 yellow.toca.local kernel: blkdev_write_iter+0xe4/0x160
May 08 01:05:17 yellow.toca.local kernel: new_sync_write+0xfc/0x190
May 08 01:05:17 yellow.toca.local kernel: vfs_write+0x1ef/0x280
May 08 01:05:17 yellow.toca.local kernel: __x64_sys_pwrite64+0x90/0xc0
May 08 01:05:17 yellow.toca.local kernel: do_syscall_64+0x59/0x90
May 08 01:05:17 yellow.toca.local kernel: ? do_futex+0x15b/0x200
May 08 01:05:17 yellow.toca.local kernel: ? __x64_sys_futex+0x73/0x1d0
May 08 01:05:17 yellow.toca.local kernel: ? switch_fpu_return+0x49/0xd0
May 08 01:05:17 yellow.toca.local kernel: ? exit_to_user_mode_prepare+0xec/0x100
May 08 01:05:17 yellow.toca.local kernel: ? syscall_exit_to_user_mode+0x12/0x30
May 08 01:05:17 yellow.toca.local kernel: ? do_syscall_64+0x69/0x90
May 08 01:05:17 yellow.toca.local kernel: ? do_syscall_64+0x69/0x90
May 08 01:05:17 yellow.toca.local kernel: ? sysvec_apic_timer_interrupt+0x3c/0x90
May 08 01:05:17 yellow.toca.local kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
May 08 01:05:17 yellow.toca.local kernel: memory: usage 585936kB, limit 585936kB, failcnt 131
May 08 01:05:17 yellow.toca.local kernel: memory+swap: usage 585936kB, limit 9007199254740988kB, failcnt 0
May 08 01:05:17 yellow.toca.local kernel: kmem: usage 17716kB, limit 9007199254740988kB, failcnt 0
May 08 01:05:17 yellow.toca.local kernel: Memory cgroup stats for /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb18ee349_565f_4cb6_9f2b_f4e67a7a7927.slice:
May 08 01:05:17 yellow.toca.local kernel: anon 44654592
file 537202688
kernel 18141184
kernel_stack 262144
pagetables 1024000
percpu 20856
sock 0
vmalloc 28672
shmem 53248
zswap 0
zswapped 0
file_mapped 0
file_dirty 0
file_writeback 537149440
swapcached 0
anon_thp 14680064
file_thp 0
shmem_thp 0
inactive_anon 44646400
active_anon 61440
inactive_file 268509184
active_file 268574720
unevictable 0
slab_reclaimable 16203816
slab_unreclaimable 497488
slab 16701304
workingset_refault_anon 0
workingset_refault_file 0
workingset_activate_anon 0
workingset_activate_file 0
workingset_restore_anon 0
workingset_restore_file 0
workingset_nodereclaim 0
pgscan 4950026
pgsteal 4
pgscan_kswapd 0
pgscan_direct 4950026
pgsteal_kswapd 0
pgsteal_direct 4
pgfault 13780
pgmajfault 0
pgrefill 4884435
pgactivate 4950019
pgdeactivate 4884435
pglazyfree 0
pglazyfreed 0
zswpin 0
zswpout 0
thp_fault_alloc 13
thp_collapse_alloc 0
May 08 01:05:17 yellow.toca.local kernel: Tasks state (memory values in pages):
May 08 01:05:17 yellow.toca.local kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
May 08 01:05:17 yellow.toca.local kernel: [ 114988] 0 114988 2076 485 49152 0 -1000 conmon
May 08 01:05:17 yellow.toca.local kernel: [ 115000] 107 115000 335956 13160 401408 0 999 virt-cdi-import
May 08 01:05:17 yellow.toca.local kernel: [ 115032] 107 115032 95298 2604 262144 0 999 nbdkit
May 08 01:05:17 yellow.toca.local kernel: [ 115053] 107 115053 110130 6764 360448 0 999 qemu-img
May 08 01:05:17 yellow.toca.local kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=crio-3ec0af77f3bab257e52e5a33cfde0e88861bae125913f97b356105b5efd7cc18.scope,mems_allowed=0,oom_memcg=/kubepods.slice/kubepods-burstable.slice>
May 08 01:05:17 yellow.toca.local kernel: Memory cgroup out of memory: Killed process 115000 (virt-cdi-import) total-vm:1343824kB, anon-rss:19060kB, file-rss:33580kB, shmem-rss:0kB, UID:107 pgtables:392kB oom_score_adj:999
The image is sourced from an httpd server, it a qcow2. And written to a topolvm volume.
spec:
dataVolumeTemplates:
- metadata:
creationTimestamp: null
name: rhvh-1-rootdisk
namespace: homelab
spec:
preallocation: false
source:
http:
url: http://pi.toca.local:8080/images/rhel-8.6.qcow2
storage:
resources:
requests:
storage: 100Gi
storageClassName: lvms-ssd-lvmcluster
Version-Release number of selected component (if applicable):
4.12.2
How reproducible:
Always
Steps to Reproduce:
1. Provision a VM
2. Watch the logs on the node and it may fail 1-2 times before a retry succeeds
- clones
-
CNV-44478 [4.16] virt-cdi-import often killed by oom-killer during VM provisioning
- Closed
- links to
-
RHEA-2024:132739 OpenShift Virtualization 4.14.7 Images