Red Hat OpenStack Services on OpenShift / OSPRH-6245

BZ#1866848 [RFE] Limit concurrent Block Storage service backup/restore operations to control memory usage


    • OSP-14514 - OSP backup enhancements

      This feature limits the number of concurrent backup/restore operations that each cinder-backup service runs, which caps the maximum amount of memory the service uses.
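
As a rough illustration of the requested behaviour, concurrency can be capped with a semaphore so that only a fixed number of memory-heavy operations run at once. This is a minimal sketch under stated assumptions, not the cinder-backup implementation: `OperationLimiter`, `fake_restore`, and the limit of 2 are hypothetical names and values chosen for the example, and plain threads stand in for whatever concurrency model the service actually uses.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class OperationLimiter:
    """Allow at most max_concurrent operations to run at the same time."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of operations seen running together

    def run(self, operation, *args):
        # Blocks here while max_concurrent operations are already running.
        with self._slots:
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return operation(*args)
            finally:
                with self._lock:
                    self._active -= 1


def fake_restore(backup_id):
    """Hypothetical stand-in for a restore; sleeps instead of decompressing."""
    time.sleep(0.05)
    return backup_id


if __name__ == "__main__":
    limiter = OperationLimiter(max_concurrent=2)  # hypothetical limit
    with ThreadPoolExecutor(max_workers=8) as pool:
        done = list(pool.map(lambda b: limiter.run(fake_restore, b), range(8)))
    print(done, "peak concurrent operations:", limiter.peak)
```

With a cap like this, eight simultaneous restore requests are all accepted, but no more than two hold their working buffers at any moment, so peak memory scales with the limit rather than with the number of requests.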

      +++ This bug was initially created as a clone of Bug #1806975 +++
      +++ Original summary was: cinder backup restore: decompression uses lots of memory +++

      Description of problem:

      Unable to restore Cinder volumes created after an FFU upgrade from OSP10 to OSP13.

      nova_api_wsgi and nova-conductor are currently the processes using the most memory.

      cinder-backup appears to have been using about 162 GB of RAM (anon-rss:162185040kB) when the OOM killer terminated it:

      ~~~
      Feb 24 14:28:18 controller3 kernel: Out of memory: Kill process 2501135 (cinder-backup) score 797 or sacrifice child
      Feb 24 14:28:18 controller3 kernel: Killed process 2501135 (cinder-backup), UID 0, total-vm:195150272kB, anon-rss:162185040kB, file-rss:536kB, shmem-rss:0kB
      Feb 24 14:28:18 controller3 kernel: cinder-backup: page allocation failure: order:0, mode:0x280da
      Feb 24 14:28:18 controller3 kernel: CPU: 13 PID: 2501135 Comm: cinder-backup Kdump: loaded Tainted: G ------------ T 3.10.0-1062.12.1.el7.x86_64 #1
      ~~~
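
To read the memory figures off a log line like the one above, a small parser can help. The sketch below is an illustration only: `parse_oom_kill` and `kb_to_gib` are hypothetical helpers, the regular expressions are written against this one sample line, and the conversion assumes the kernel's `kB` means 1024 bytes. Under that assumption the reported anon-rss works out to about 154.7 GiB.

```python
import re

# Sample line copied from the log excerpt in this report.
OOM_LINE = (
    "Feb 24 14:28:18 controller3 kernel: Killed process 2501135 (cinder-backup), "
    "UID 0, total-vm:195150272kB, anon-rss:162185040kB, file-rss:536kB, shmem-rss:0kB"
)


def parse_oom_kill(line):
    """Return pid, process name, and memory counters (in kB) from a 'Killed process' line."""
    match = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if match is None:
        return None
    # Collect every "name:<digits>kB" pair, e.g. "anon-rss:162185040kB".
    counters = {name: int(value) for name, value in re.findall(r"([\w-]+):(\d+)kB", line)}
    return {"pid": int(match.group(1)), "name": match.group(2), **counters}


def kb_to_gib(kb):
    """Convert kB to GiB, assuming 1 kB = 1024 bytes."""
    return kb / (1024 * 1024)


if __name__ == "__main__":
    info = parse_oom_kill(OOM_LINE)
    print(info["name"], round(kb_to_gib(info["anon-rss"]), 1), "GiB resident")
```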

      Also noticed high resource utilization by snmpd on the same controller.

      Version-Release number of selected component (if applicable):

      openstack-cinder-12.0.8-3.el7ost.noarch Fri Feb 7 12:53:05 2020
      puppet-cinder-12.4.1-5.el7ost.noarch Fri Feb 7 12:52:15 2020
      python2-cinderclient-3.5.0-1.el7ost.noarch Fri Feb 7 12:50:55 2020
      python-cinder-12.0.8-3.el7ost.noarch Fri Feb 7 12:53:00 2020

      PID     USER PR NI VIRT  RES   SHR  S %CPU  %MEM TIME+     COMMAND
      2822235 root 20 0  76.5g 76.3g 3296 R 100.0 40.5 867:53.35 snmpd

      # rpm -qf /usr/sbin/snmpd
        net-snmp-5.7.2-43.el7_7.3.x86_64

      Tried downgrading the net-snmp version but still got the same results.

      How reproducible:

      Steps to Reproduce:
      1. Create a backup of an OpenStack volume that contains a large amount of data.
      2. Try to restore multiple backups at the same time.
      3. cinder-backup runs out of memory and is OOM-killed.

      Actual results:

      cinder-backup is killed by the OOM killer.

      Expected results:

      Multiple Cinder backups can be restored at the same time.

      At the moment we can restore a single volume, but not multiple volumes at the same time.

            geguileo@redhat.com Gorka Eguileor
            jira-bugzilla-migration RH Bugzilla Integration
            Yosi Ben Shimon Yosi Ben Shimon
            rhos-dfg-storage-squad-cinder