Description of problem:
When exporting content using chunks, if the resulting exported content is smaller than the chunk size, the import fails with the following error:
~~~
{"traceback"=>" File \"/usr/lib/python3.9/site-packages/pulpcore/tasking/pulpcore_worker.py\", line 460, in execute_task\n result = func(*args, **kwargs)\n File \"/usr/lib/python3.9/site-packages/pulpcore/app/tasks/importer.py\", line 478, in pulp_import\n with tarfile.open(path, read_mode) as tar:\n File \"/usr/lib64/python3.9/tarfile.py\", line 1817, in open\n return func(name, filemode, fileobj, **kwargs)\n File \"/usr/lib64/python3.9/tarfile.py\", line 1863, in gzopen\n fileobj = GzipFile(name, mode + \"b\", compresslevel, fileobj)\n File \"/usr/lib64/python3.9/gzip.py\", line 173, in __init__\n fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')\n", "description"=>"[Errno 2] No such file or directory: '/var/lib/pulp/imports/2.0/2024-02-26T10-34-29-05-00/export-7a3b71e4-8acb-4c32-a473-cb08ef899047-20240226_1534.tar.gz'"}
~~~
Version-Release number of selected component (if applicable):
Satellite 6.12
Satellite 6.13
Satellite 6.14
How reproducible:
Always
Steps to Reproduce:
1. Choose a small repository to export (for example, satellite-client-6-for-rhel-9-x86_64-rpms, which generates a 67 MB tarball).
2. Export it using a chunk size larger than the repository size (1 GB, for example):
~~~
hammer content-export complete repository --id 17 --chunk-size-gb 1
~~~
3. Check the resulting TOC file:
~~~
# cat export-0d7bc5de-77e3-4113-8de3-4fc2f99bad2d-20240226_1531-toc.json | jq
{
  "meta": {
    "chunk_size": 1073741824,
    "file": "export-0d7bc5de-77e3-4113-8de3-4fc2f99bad2d-20240226_1531.tar.gz",
    "global_hash": "de2f14f337e48291b2cba8e7d04d08c3b60b578378072a106be77668c46d9010",
    "checksum_type": "sha256"
  },
  "files": {
    "export-0d7bc5de-77e3-4113-8de3-4fc2f99bad2d-20240226_1531.tar.gz.0000": "de2f14f337e48291b2cba8e7d04d08c3b60b578378072a106be77668c46d9010"
  }
}
~~~
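The mismatch is already visible in the TOC: "meta.file" names the plain tarball, while "files" lists only the ".0000" chunk. A minimal sketch of that check, assuming the TOC layout shown above; the helper name and the shortened file names and hash are hypothetical and not part of pulpcore:

```python
import json


def dangling_meta_file(toc: dict) -> bool:
    """Return True when meta.file is not itself listed under files.

    In that case the tarball named in meta does not exist on disk yet
    and has to be produced from the listed chunk(s).
    """
    return toc["meta"]["file"] not in toc["files"]


# Same shape as the TOC in the report; names and hash shortened (hypothetical values).
toc = json.loads("""
{
  "meta": {"chunk_size": 1073741824, "file": "export-demo.tar.gz"},
  "files": {"export-demo.tar.gz.0000": "de2f14f3"}
}
""")

# The case from this report: chunking was requested, yet only one chunk was written.
single_chunk = len(toc["files"]) == 1 and toc["meta"]["chunk_size"] > 0

print(dangling_meta_file(toc), single_chunk)
```

Both values being true describes the failing case: a single entry in "files" that is a chunk rather than the full tarball.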
4. Try importing it:
~~~
# hammer content-import repository --organization-id 1 --path $(pwd)
[....................................................................................................................................................................................................................................] [100%]
Error: 1 subtask(s) failed for task group /pulp/api/v3/task-groups/4665687d-3fb3-48fe-bab5-fd3e45988f70/.
Errors: {"traceback"=>" File \"/usr/lib/python3.9/site-packages/pulpcore/tasking/pulpcore_worker.py\", line 460, in execute_task\n result = func(*args, **kwargs)\n File \"/usr/lib/python3.9/site-packages/pulpcore/app/tasks/importer.py\", line 478, in pulp_import\n with tarfile.open(path, read_mode) as tar:\n File \"/usr/lib64/python3.9/tarfile.py\", line 1817, in open\n return func(name, filemode, fileobj, **kwargs)\n File \"/usr/lib64/python3.9/tarfile.py\", line 1863, in gzopen\n fileobj = GzipFile(name, mode + \"b\", compresslevel, fileobj)\n File \"/usr/lib64/python3.9/gzip.py\", line 173, in __init__\n fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')\n", "description"=>"[Errno 2] No such file or directory: '/var/lib/pulp/imports/2.0/2024-02-26T10-34-29-05-00/export-7a3b71e4-8acb-4c32-a473-cb08ef899047-20240226_1534.tar.gz'"}
~~~
Actual results:
The import fails, reporting that it cannot find the tar.gz file.
Expected results:
The import should process the TOC properly, identify that there is a single "chunked" file, rename it to the expected tarball name, and import it.
Additional info:
This used to work on previous versions.
The problem is here:
vim /usr/lib/python3.9/site-packages/pulpcore/app/tasks/importer.py +456
~~~
448 def validate_and_assemble(toc_filename):
449     """Validate checksums of, and reassemble, chunks in table-of-contents file."""
450     the_toc = validate_toc(toc_filename)
451     toc_dir = os.path.dirname(toc_filename)
452     result_file = os.path.join(toc_dir, the_toc["meta"]["file"])
453
454     # if we have only one entry in "files", it must be the full .tar. <===== this is not actually true
455     # Return the filename from the meta-section.
456     if len(the_toc["files"]) == 1:
457         return result_file
458
459     # We have multiple chunks. Reassemble them and return the result.
460     return reassemble(the_toc, toc_dir, result_file)
~~~
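The effect of that branch can be reproduced outside Satellite. The following is only a sketch: it copies the single-entry branch from the listing, omits checksum validation and multi-chunk reassembly, and uses hypothetical file names in a temporary directory.

```python
import os
import tempfile


def validate_and_assemble_as_shipped(the_toc, toc_dir):
    """Simplified copy of the branch quoted above (checksum validation omitted)."""
    result_file = os.path.join(toc_dir, the_toc["meta"]["file"])
    if len(the_toc["files"]) == 1:
        # Assumes the single entry is the full tarball, which is false for one chunk.
        return result_file
    raise NotImplementedError("multi-chunk reassembly is out of scope for this sketch")


def reproduce():
    """Return the name the importer would open and whether that file exists."""
    with tempfile.TemporaryDirectory() as toc_dir:
        # Only the ".0000" chunk exists on disk, as after the export in this report.
        with open(os.path.join(toc_dir, "export-demo.tar.gz.0000"), "wb") as chunk:
            chunk.write(b"payload")
        toc = {
            "meta": {"file": "export-demo.tar.gz", "chunk_size": 1073741824},
            "files": {"export-demo.tar.gz.0000": "unused-in-this-sketch"},
        }
        path = validate_and_assemble_as_shipped(toc, toc_dir)
        return os.path.basename(path), os.path.exists(path)


print(reproduce())
```

The returned path points at a file that was never created, which matches the "No such file or directory" error raised later by tarfile.open().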
A quick fix could be to replace line 456 with this:
~~~
# git diff
diff --git a/importer.py b/importer.py
index 4edcba8..77f1663 100644
--- a/importer.py
+++ b/importer.py
@@ -453,7 +453,7 @@ def pulp_import(importer_pk, path, toc, create_repositories):
 
     # if we have only one entry in "files", it must be the full .tar.
     # Return the filename from the meta-section.
-    if len(the_toc["files"]) == 1:
+    if len(the_toc["files"]) == 1 and the_toc["meta"]["chunk_size"] == 0:
         return result_file
 
     # We have multiple chunks. Reassemble them and return the result.
~~~
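To illustrate what the changed condition would do for the single-chunk case, here is a hedged sketch. reassemble_stand_in() is a hypothetical replacement for pulpcore's reassemble(), whose real implementation is not shown in this report. That unchunked exports record a chunk_size of 0 is an assumption carried over from the proposed patch. File names are made up.

```python
import os
import shutil
import tempfile


def reassemble_stand_in(the_toc, toc_dir, result_file):
    """Hypothetical stand-in for reassemble(): concatenate chunks in name order."""
    with open(result_file, "wb") as out:
        for name in sorted(the_toc["files"]):
            with open(os.path.join(toc_dir, name), "rb") as chunk:
                shutil.copyfileobj(chunk, out)
    return result_file


def validate_and_assemble_patched(the_toc, toc_dir):
    result_file = os.path.join(toc_dir, the_toc["meta"]["file"])
    # Proposed condition: a single entry is the full tarball only if no chunking was requested.
    if len(the_toc["files"]) == 1 and the_toc["meta"]["chunk_size"] == 0:
        return result_file
    # A single chunk now goes through reassembly like any other chunked export.
    return reassemble_stand_in(the_toc, toc_dir, result_file)


def demo():
    """Return the assembled file's name and contents for a one-chunk export."""
    with tempfile.TemporaryDirectory() as toc_dir:
        with open(os.path.join(toc_dir, "export-demo.tar.gz.0000"), "wb") as chunk:
            chunk.write(b"payload")
        toc = {
            "meta": {"file": "export-demo.tar.gz", "chunk_size": 1073741824},
            "files": {"export-demo.tar.gz.0000": "unused-in-this-sketch"},
        }
        path = validate_and_assemble_patched(toc, toc_dir)
        with open(path, "rb") as tarball:
            return os.path.basename(path), tarball.read()


print(demo())
```

With the extra chunk_size test, the lone ".0000" chunk is written out under the name from the meta section, so the file the importer opens afterwards exists.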
I would propose a PR with that change, but the latest upstream code is already different, and I think this issue will not occur with the newer code. It would be nice to have a fix downstream soon.
This affects 6.12, 6.13 and 6.14.
It used to work on 6.11.
QE Tracker for https://issues.redhat.com/browse/SAT-23573
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2266075