OpenShift API for Data Protection / OADP-2681

[Upstream] DataMover - datauploads and datadownloads resources aren't distributed equally among the workers.


    • Type: Bug
    • Resolution: Done-Errata
    • Priority: Normal
    • Affects Version/s: OADP 1.3.0
    • Fix Version/s: OADP 1.3.0
    • Component/s: data-mover
    • Fixed in Build: oadp-operator-bundle-container-1.3.0-147

      Description of problem:

      Tracking bug for upstream Velero 1.12 - https://github.com/vmware-tanzu/velero/issues/6734

       

      While running Data Mover backup and restore tests and monitoring the DataUpload and DataDownload resources, we noticed that the resources were not distributed equally among the worker nodes.
      This makes the tests take considerably longer, and running several cycles of the same test produces inconsistent results.

      Another issue: the test does not run at the maximum concurrency (one resource per node).
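
      One way to collect per-worker counts like the ones under Additional info is to group the DataUpload CRs by the node recorded in their status. A minimal sketch, assuming the OADP/Velero namespace is openshift-adp and that each CR exposes its processing node in .status.node (swap in datadownloads.velero.io for restores):

{code:python}
#!/usr/bin/env python3
# Sketch: count DataUpload CRs per worker node (use datadownloads.velero.io
# for restores). Assumes the OADP/Velero namespace is openshift-adp and that
# each CR records its processing node in .status.node.
import json
import subprocess
from collections import Counter

def count_per_node(resource: str = "datauploads.velero.io",
                   namespace: str = "openshift-adp") -> Counter:
    out = subprocess.run(
        ["oc", "get", resource, "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    items = json.loads(out).get("items", [])
    # Group by the node name recorded in the status; items not yet picked up
    # by a node-agent fall under "<none>".
    return Counter(item.get("status", {}).get("node") or "<none>" for item in items)

if __name__ == "__main__":
    for node, count in sorted(count_per_node().items()):
        print(f"{node} : {count}")
{code}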

       

      Version-Release number of selected component (if applicable):

      OCP 4.12.9

      ODF 4.12.7

      Upstream Velero 1.12

       

      How reproducible:

       

      Steps to Reproduce:
      1. Run a Data Mover backup / restore.
      2. Monitor the DataUpload / DataDownload resources (see the sketch below).
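
      A sketch for step 2 that polls how many DataUploads are in progress on each node while the backup runs; it also shows whether the expected maximum concurrency (one in-progress resource per node) is actually reached. It assumes oc is logged in to the cluster, the namespace is openshift-adp, and that in-flight items report phase InProgress with the node name in .status.node:

{code:python}
#!/usr/bin/env python3
# Sketch for step 2: poll the DataUpload CRs and print, per node, how many are
# currently in progress. Assumes `oc` is logged in, the namespace is
# openshift-adp, and that in-flight items report phase "InProgress" with the
# processing node in .status.node.
import json
import subprocess
import time
from collections import Counter

def in_progress_per_node(namespace: str = "openshift-adp") -> Counter:
    out = subprocess.run(
        ["oc", "get", "datauploads.velero.io", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    counts: Counter = Counter()
    for item in json.loads(out).get("items", []):
        status = item.get("status", {})
        if status.get("phase") == "InProgress":
            counts[status.get("node", "<unscheduled>")] += 1
    return counts

if __name__ == "__main__":
    # Print a snapshot every 30 seconds while the backup is running.
    while True:
        snapshot = in_progress_per_node()
        print(dict(snapshot) if snapshot else "no DataUploads in progress")
        time.sleep(30)
{code}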

      Actual results:

      The resources are not distributed evenly among all workers.

      Expected results:

      The resources should be distributed among all workers as evenly as possible.

       

       

      Additional info:

       

      Duration and DataUpload distribution across 5 backup cycles (namespace with 100 PVs):
      -0:23:48
      worker000-r640 : 3, worker001-r640 : 61, worker002-r640 : 0, worker003-r640 : 4, worker004-r640 : 32, worker005-r640 : 0
      -0:16:39
      worker000-r640 : 11, worker001-r640 : 23, worker002-r640 : 14, worker003-r640 : 20, worker004-r640 : 19, worker005-r640 : 13
      -0:17:53
      worker000-r640 : 20, worker001-r640 : 17, worker002-r640 : 15, worker003-r640 : 16, worker004-r640 : 9, worker005-r640 : 23
      -0:18:45
      worker000-r640 : 24, worker001-r640 : 15, worker002-r640 : 22, worker003-r640 : 17, worker004-r640 : 6, worker005-r640 : 16
      -0:28:39
      worker000-r640 : 26, worker001-r640 : 15, worker002-r640 : 20, worker003-r640 : 20, worker004-r640 : 2, worker005-r640 : 17

      DataDownload distribution across 5 restore cycles (namespace with 100 PVs):
      -worker000-r640 : 5, worker001-r640 : 51, worker002-r640 : 0, worker003-r640 : 17, worker004-r640 : 27, worker005-r640 : 0
      -worker000-r640 : 24, worker001-r640 : 13, worker002-r640 : 0, worker003-r640 : 22, worker004-r640 : 24, worker005-r640 : 17
      -worker000-r640 : 28, worker001-r640 : 12, worker002-r640 : 10, worker003-r640 : 23, worker004-r640 : 14, worker005-r640 : 13
      -worker000-r640 : 21, worker001-r640 : 18, worker002-r640 : 10, worker003-r640 : 18, worker004-r640 : 17, worker005-r640 : 16
      -worker000-r640 : 15, worker001-r640 : 17, worker002-r640 : 11, worker003-r640 : 21, worker004-r640 : 21, worker005-r640 : 15
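
      To make the skew easier to compare across cycles, a small sketch that turns the backup counts above into an imbalance ratio (busiest worker relative to a perfectly even split) and a count of idle workers; the figures are copied from the five backup cycles listed above:

{code:python}
#!/usr/bin/env python3
# Sketch: turn the per-worker DataUpload counts from the backup cycles above
# into a simple imbalance ratio (busiest worker vs. a perfectly even split)
# plus the number of workers that processed nothing.
cycles = {
    "backup 0:23:48": [3, 61, 0, 4, 32, 0],
    "backup 0:16:39": [11, 23, 14, 20, 19, 13],
    "backup 0:17:53": [20, 17, 15, 16, 9, 23],
    "backup 0:18:45": [24, 15, 22, 17, 6, 16],
    "backup 0:28:39": [26, 15, 20, 20, 2, 17],
}

for name, counts in cycles.items():
    even_share = sum(counts) / len(counts)      # 100 PVs over 6 workers ~ 16.7
    imbalance = max(counts) / even_share        # 1.0 would be a perfect spread
    idle = sum(1 for c in counts if c == 0)     # workers that handled nothing
    print(f"{name}: imbalance {imbalance:.2f}x, idle workers: {idle}")
{code}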
