  1. Satellite
  2. SAT-26458 Cannot import chunked export if the resulted tarball is smaller than chunk size
  3. SAT-26599

[QE] Cannot import chunked export if the resulted tarball is smaller than chunk size


    • Icon: Sub-task Sub-task
    • Resolution: Done
    • Icon: Undefined Undefined
    • 6.15.3
    • None
    • Pulp
    • Sprint 136, Sprint 137, Sprint 138

      Description of problem:

      When exporting content using chunks, if the resulting exported content is smaller than the chunk size, the import fails with the following error:

      ~~~

      {"traceback"=>" File \"/usr/lib/python3.9/site-packages/pulpcore/tasking/pulpcore_worker.py\", line 460, in execute_task\n result = func(*args, **kwargs)\n File \"/usr/lib/python3.9/site-packages/pulpcore/app/tasks/importer.py\", line 478, in pulp_import\n with tarfile.open(path, read_mode) as tar:\n File \"/usr/lib64/python3.9/tarfile.py\", line 1817, in open\n return func(name, filemode, fileobj, **kwargs)\n File \"/usr/lib64/python3.9/tarfile.py\", line 1863, in gzopen\n fileobj = GzipFile(name, mode + \"b\", compresslevel, fileobj)\n File \"/usr/lib64/python3.9/gzip.py\", line 173, in __init__\n fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')\n", "description"=>"[Errno 2] No such file or directory: '/var/lib/pulp/imports/2.0/2024-02-26T10-34-29-05-00/export-7a3b71e4-8acb-4c32-a473-cb08ef899047-20240226_1534.tar.gz'"}

      ~~~

      Version-Release number of selected component (if applicable):

      Satellite 6.12
      Satellite 6.13
      Satellite 6.14

      How reproducible:

      Always

      Steps to Reproduce:

      1. Choose a small repository to export (for example, satellite-client-6-for-rhel-9-x86_64-rpms, which generates a 67 MB tarball)

      2. Export it using a chunk size larger than the repository size (1 GB, for example):

      ~~~
      hammer content-export complete repository --id 17 --chunk-size-gb 1
      ~~~

      3. Check the resulting TOC file:

      ~~~

      # cat export-0d7bc5de-77e3-4113-8de3-4fc2f99bad2d-20240226_1531-toc.json | jq
      {
        "meta": {
          "chunk_size": 1073741824,
          "file": "export-0d7bc5de-77e3-4113-8de3-4fc2f99bad2d-20240226_1531.tar.gz",
          "global_hash": "de2f14f337e48291b2cba8e7d04d08c3b60b578378072a106be77668c46d9010",
          "checksum_type": "sha256"
        },
        "files": {
          "export-0d7bc5de-77e3-4113-8de3-4fc2f99bad2d-20240226_1531.tar.gz.0000": "de2f14f337e48291b2cba8e7d04d08c3b60b578378072a106be77668c46d9010"
        }
      }
        ~~~

      4. Try importing it:

      ~~~

      # hammer content-import repository --organization-id 1 --path $(pwd)
        [....................................................................................................................................................................................................................................] [100%]
        Error: 1 subtask(s) failed for task group /pulp/api/v3/task-groups/4665687d-3fb3-48fe-bab5-fd3e45988f70/.
        Errors: {"traceback"=>" File \"/usr/lib/python3.9/site-packages/pulpcore/tasking/pulpcore_worker.py\", line 460, in execute_task\n result = func(*args, **kwargs)\n File \"/usr/lib/python3.9/site-packages/pulpcore/app/tasks/importer.py\", line 478, in pulp_import\n with tarfile.open(path, read_mode) as tar:\n File \"/usr/lib64/python3.9/tarfile.py\", line 1817, in open\n return func(name, filemode, fileobj, **kwargs)\n File \"/usr/lib64/python3.9/tarfile.py\", line 1863, in gzopen\n fileobj = GzipFile(name, mode + \"b\", compresslevel, fileobj)\n File \"/usr/lib64/python3.9/gzip.py\", line 173, in __init__\n fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')\n", "description"=>"[Errno 2] No such file or directory: '/var/lib/pulp/imports/2.0/2024-02-26T10-34-29-05-00/export-7a3b71e4-8acb-4c32-a473-cb08ef899047-20240226_1534.tar.gz'"}

        ~~~
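The TOC from step 3 already shows the mismatch behind this error: "files" holds a single entry ending in `.0000`, while "meta" names the plain `.tar.gz` that the importer then tries to open. A minimal sketch of a check for that condition follows. The helper name `single_chunk_mismatch` is hypothetical and not part of pulpcore, and it assumes that a non-chunked export lists the tarball itself under "files".

```python
def single_chunk_mismatch(toc: dict) -> bool:
    """Return True when the TOC lists exactly one file and that file is not
    the tarball named in the meta section (i.e. it is a lone chunk).

    Hypothetical helper for illustration; not a pulpcore function.
    """
    files = list(toc["files"])
    return len(files) == 1 and files[0] != toc["meta"]["file"]


# TOC contents copied from step 3 of this report.
toc = {
    "meta": {
        "chunk_size": 1073741824,
        "file": "export-0d7bc5de-77e3-4113-8de3-4fc2f99bad2d-20240226_1531.tar.gz",
        "global_hash": "de2f14f337e48291b2cba8e7d04d08c3b60b578378072a106be77668c46d9010",
        "checksum_type": "sha256",
    },
    "files": {
        "export-0d7bc5de-77e3-4113-8de3-4fc2f99bad2d-20240226_1531.tar.gz.0000":
            "de2f14f337e48291b2cba8e7d04d08c3b60b578378072a106be77668c46d9010",
    },
}

if single_chunk_mismatch(toc):
    print("Single chunk on disk, but meta points at:", toc["meta"]["file"])
```

Running this against the TOC above flags the export as a lone chunk, which is the case the importer currently treats as a complete tarball.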

      Actual results:

      The import fails, reporting that it cannot find the tar.gz file.

      Expected results:

      The import should process the TOC properly, recognize that the export consists of a single chunked file, rename it to the expected tarball name, and import it.

      Additional info:

      This used to work on previous versions.

      Problem is here:

      vim /usr/lib/python3.9/site-packages/pulpcore/app/tasks/importer.py +456

      ~~~
      448     def validate_and_assemble(toc_filename):
      449         """Validate checksums of, and reassemble, chunks in table-of-contents file."""
      450         the_toc = validate_toc(toc_filename)
      451         toc_dir = os.path.dirname(toc_filename)
      452         result_file = os.path.join(toc_dir, the_toc["meta"]["file"])
      453
      454         # if we have only one entry in "files", it must be the full .tar. <===== this is not actually true
      455         # Return the filename from the meta-section.
      456         if len(the_toc["files"]) == 1:
      457             return result_file
      458
      459         # We have multiple chunks. Reassemble them and return the result.
      460         return reassemble(the_toc, toc_dir, result_file)

      ~~~

      A quick fix could be to replace line 456 with this:

      ~~~

      # git diff
      diff --git a/importer.py b/importer.py
      index 4edcba8..77f1663 100644
      --- a/importer.py
      +++ b/importer.py
      @@ -453,7 +453,7 @@ def pulp_import(importer_pk, path, toc, create_repositories):

               # if we have only one entry in "files", it must be the full .tar.
               # Return the filename from the meta-section.
      -        if len(the_toc["files"]) == 1:
      +        if len(the_toc["files"]) == 1 and the_toc["meta"]["chunk_size"] == 0:
                   return result_file

               # We have multiple chunks. Reassemble them and return the result.
        ~~~
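To illustrate the effect of the proposed condition, here is a self-contained sketch, not the pulpcore implementation. `assemble_fixed` and `reassemble_chunks` are hypothetical names; `reassemble_chunks` is only a stand-in that concatenates chunks in name order, and checksum validation is left out. Like the diff above, it assumes that a non-chunked export records a `chunk_size` of 0.

```python
import os
import shutil
import tempfile


def reassemble_chunks(toc, toc_dir, result_file):
    """Hypothetical stand-in for reassemble(): concatenate chunks in name order."""
    with open(result_file, "wb") as out:
        for chunk_name in sorted(toc["files"]):
            with open(os.path.join(toc_dir, chunk_name), "rb") as chunk:
                shutil.copyfileobj(chunk, out)
    return result_file


def assemble_fixed(toc, toc_dir):
    """Return the tarball path, reassembling unless the export was never chunked."""
    result_file = os.path.join(toc_dir, toc["meta"]["file"])
    # Skip reassembly only for a single entry AND chunk_size == 0
    # (assumption taken from the proposed diff: 0 means "not chunked").
    if len(toc["files"]) == 1 and toc["meta"]["chunk_size"] == 0:
        return result_file
    return reassemble_chunks(toc, toc_dir, result_file)


if __name__ == "__main__":
    # A chunked export that fits into one chunk, as in this report.
    with tempfile.TemporaryDirectory() as toc_dir:
        toc = {
            "meta": {"chunk_size": 1073741824, "file": "demo.tar.gz"},
            "files": {"demo.tar.gz.0000": "checksum-not-verified-in-this-sketch"},
        }
        with open(os.path.join(toc_dir, "demo.tar.gz.0000"), "wb") as f:
            f.write(b"payload")
        path = assemble_fixed(toc, toc_dir)
        print(os.path.basename(path), os.path.exists(path))
```

With the extra `chunk_size` check, the lone `.0000` chunk is copied to the name recorded in "meta", so the subsequent `tarfile.open` call would find the file it expects.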

      I would propose a PR with this change, but the latest upstream code is already different, and I think this issue will not occur there. It would be nice to have a downstream fix soon.

      This affects 6.12, 6.13 and 6.14.

      It used to work on 6.11.

      QE Tracker for https://issues.redhat.com/browse/SAT-26458
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2266075

              vsedmik@redhat.com Vladimír Sedmík
              satellite-jira-automation@redhat.com Satellite Jira-Automation
              Vladimír Sedmík Vladimír Sedmík