  1. Satellite
  2. SAT-28464 Failed to import ansible collections with "duplicate key value violates unique" error
  3. SAT-29057

Failed to import ansible collections with "duplicate key value violates unique" error


    • Icon: Sub-task Sub-task
    • Resolution: Done
    • Icon: Normal Normal
    • None
    • None
    • Pulp
    • None
    • 5
    • False
    • False
    • pulp-ansible-0.20.9, pulp-ansible-0.21.9, pulp-ansible-0.22.2, pulp-ansible-0.23.0
    • 0

      Description of problem:

      Pulp raised the following error when performing an incremental import for Ansible collections.

       

      Errors:
       {"traceback"=>"  File \"/usr/lib/python3.11/site-packages/pulpcore/tasking/tasks.py\", line 61, in _execute_task
         result = func(*args, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^
       File \"/usr/lib/python3.11/site-packages/pulpcore/app/tasks/importer.py\", line 380, in import_repository_version
         for a_result in _import_file(os.path.join(rv_path, filename), res_class, retry=True):
       File \"/usr/lib/python3.11/site-packages/pulpcore/app/tasks/importer.py\", line 268, in _import_file
         a_result = resource.import_data(data, raise_errors=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File \"/usr/lib/python3.11/site-packages/import_export/resources.py\", line 813, in import_data
         result = self.import_data_inner(
                  ^^^^^^^^^^^^^^^^^^^^^^^
       File \"/usr/lib/python3.11/site-packages/import_export/resources.py\", line 882, in import_data_inner
         raise row_result.errors[-1].error
       File \"/usr/lib/python3.11/site-packages/import_export/resources.py\", line 748, in import_row
         self.save_instance(instance, new, using_transactions, dry_run)
       File \"/usr/lib/python3.11/site-packages/import_export/resources.py\", line 491, in save_instance
         instance.save()
       File \"/usr/lib/python3.11/site-packages/pulpcore/app/models/base.py\", line 160, in save
         return super().save(*args, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File \"/usr/lib64/python3.11/contextlib.py\", line 81, in inner
         return func(*args, **kwds)
                ^^^^^^^^^^^^^^^^^^^
       File \"/usr/lib/python3.11/site-packages/django_lifecycle/mixins.py\", line 169, in save
         save(*args, **kwargs)
       File \"/usr/lib/python3.11/site-packages/django/db/models/base.py\", line 814, in save
         self.save_base(
       File \"/usr/lib/python3.11/site-packages/django/db/models/base.py\", line 877, in save_base
         updated = self._save_table(
                   ^^^^^^^^^^^^^^^^^
       File \"/usr/lib/python3.11/site-packages/django/db/models/base.py\", line 1020, in _save_table
         results = self._do_insert(
                   ^^^^^^^^^^^^^^^^
       File \"/usr/lib/python3.11/site-packages/django/db/models/base.py\", line 1061, in _do_insert
         return manager._insert(
                ^^^^^^^^^^^^^^^^
       File \"/usr/lib/python3.11/site-packages/django/db/models/manager.py\", line 87, in manager_method
         return getattr(self.get_queryset(), name)(*args, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File \"/usr/lib/python3.11/site-packages/django/db/models/query.py\", line 1805, in _insert
         return query.get_compiler(using=using).execute_sql(returning_fields)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File \"/usr/lib/python3.11/site-packages/django/db/models/sql/compiler.py\", line 1822, in execute_sql
         cursor.execute(sql, params)
       File \"/usr/lib/python3.11/site-packages/django/db/backends/utils.py\", line 67, in execute
         return self._execute_with_wrappers(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File \"/usr/lib/python3.11/site-packages/django/db/backends/utils.py\", line 80, in _execute_with_wrappers
         return executor(sql, params, many, context)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File \"/usr/lib/python3.11/site-packages/django/db/backends/utils.py\", line 84, in _execute
         with self.db.wrap_database_errors:
       File \"/usr/lib/python3.11/site-packages/django/db/utils.py\", line 91, in __exit__
         raise dj_exc_value.with_traceback(traceback) from exc_value
       File \"/usr/lib/python3.11/site-packages/django/db/backends/utils.py\", line 89, in _execute
         return self.cursor.execute(sql, params)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File \"/usr/lib/python3.11/site-packages/psycopg/cursor.py\", line 723, in execute
         raise ex.with_traceback(None)\n", "description"=>"duplicate key value violates unique constraint \"unique_is_highest\"\nDETAIL:  Key (collection_id, is_highest)=(019245d8-6ab2-7bfc-9015-fa80c3082eee, t) already exists."} 

       

       

      How reproducible:

      Tricky

       

      Is this issue a regression from an earlier version:

      No

       

      Steps to Reproduce:

      To simulate the issue, we need to patch the Pulp code so that Pulp exports the rows in a specific order.

      1. Edit "/usr/lib/python3.11/site-packages/pulp_ansible/app/modelresource.py" and add the "def export" method to class "CollectionVersionContentResource".

      Why not change "set_up_queryset" instead? See the "Additional info" section.

      class CollectionVersionContentResource(BaseContentResource):
          <snip>
          def export(self, queryset, *args, **kwargs):
              if queryset:
                  queryset = queryset.order_by("-is_highest")
      
              return super().export(queryset, *args, **kwargs)
      

       

      2. Restart pulpcore services

       

      systemctl restart pulpcore* 

       

      3. Create an ansible collection repository and sync the following collection and version.

       

      collections:
      - name: ansible.posix
        version: 1.5.4
      

       

      4. Create a content view, attach the ansible collection repo and then publish version 1.0

      5. Export the content view version 1.0

      6. Sync the ansible collection repository again to pull in the newer versions.

       

      collections:
      - name: ansible.posix
        version: ">=1.5.4"
       

      7. Publish the content view version 2.0

       

      8. Perform an incremental export for the content view version 2.0

      9. Import the content view version 1.0 export into another Satellite.

      10. Import the content view version 2.0 incremental export into that same Satellite.

       

       

      Actual behavior:
      raise ex.with_traceback(None)\n", "description"=>"duplicate key value violates unique constraint \"unique_is_highest\"\nDETAIL:  Key (collection_id, is_highest)=(019245d8-6ab2-7bfc-9015-fa80c3082eee, t) already exists."} 

      Expected behavior:
      Import successfully.

       

      Additional info:

      This is the root cause:

      After the first complete import, the "ansible_collectionversion" table in the disconnected Satellite should have the following row:

      pulpcore=# select name, version, is_highest from ansible_collectionversion;
       name  | version | is_highest 
      -------+---------+------------
       posix | 1.5.4   | t
      (1 row) 

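      During the incremental import, if the rows in the incremental JSON file are in the following order, the row for version 1.6.0 (is_highest = t) is inserted first and the insert fails. This is because the existing row (1.5.4, t) in the disconnected Satellite has not yet been updated to is_highest = f, so the collection would have two rows with is_highest = t, which the "unique_is_highest" constraint forbids.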

      {
        "namespace": "ansible",
        "name": "posix",
        "version": "1.6.0",
        "is_highest": "1",
      }
      ...
      {
        "namespace": "ansible",
        "name": "posix",
        "version": "1.5.4",
        "is_highest": "0",
      } 
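The ordering problem can be reproduced outside Pulp. The following is a minimal sketch using SQLite from the Python standard library, not Pulp's actual schema or import code. The table name, the `import_rows` helper, and the use of the literal "posix" as the collection id are all hypothetical, and it assumes the constraint applies only to rows where is_highest is true and that the import updates rows that already exist.

```python
import sqlite3


def import_rows(rows):
    """Replay (version, is_highest) rows on a table that already holds (1.5.4, highest)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE collectionversion "
        "(collection_id TEXT, version TEXT, is_highest INTEGER)"
    )
    # Stand-in for "unique_is_highest": at most one highest row per collection
    # (assumed to be a partial unique constraint).
    db.execute(
        "CREATE UNIQUE INDEX unique_is_highest "
        "ON collectionversion (collection_id, is_highest) WHERE is_highest = 1"
    )
    # State after the first complete import.
    db.execute("INSERT INTO collectionversion VALUES ('posix', '1.5.4', 1)")

    for version, is_highest in rows:
        exists = db.execute(
            "SELECT 1 FROM collectionversion "
            "WHERE collection_id = 'posix' AND version = ?",
            (version,),
        ).fetchone()
        if exists:
            # Known version: update its flag in place.
            db.execute(
                "UPDATE collectionversion SET is_highest = ? "
                "WHERE collection_id = 'posix' AND version = ?",
                (is_highest, version),
            )
        else:
            # New version: plain insert, which is where the constraint can fire.
            db.execute(
                "INSERT INTO collectionversion VALUES ('posix', ?, ?)",
                (version, is_highest),
            )
    return db.execute(
        "SELECT version, is_highest FROM collectionversion ORDER BY version"
    ).fetchall()


if __name__ == "__main__":
    # Demoting 1.5.4 first succeeds.
    print(import_rows([("1.5.4", 0), ("1.6.0", 1)]))
    # Inserting 1.6.0 as highest first raises sqlite3.IntegrityError.
    try:
        import_rows([("1.6.0", 1), ("1.5.4", 0)])
    except sqlite3.IntegrityError as exc:
        print("failed:", exc)
```

With the rows in the order shown in the JSON excerpt above, the insert of 1.6.0 is rejected; with the two rows swapped, the same data imports cleanly.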

       

       

      While looking for a solution, I tried to order the rows by "is_highest" as shown below, but the ordering set in "set_up_queryset" is not honoured.

          def set_up_queryset(self):
              """
              :return: CollectionVersion content specific to a specified repo-version.
              """
              return CollectionVersion.objects.filter(pk__in=self.repo_version.content).order_by("is_highest")

       

      This is because the queryset result is later processed in batches while it is written to a JSON file, in order to save memory. The batch-processing code re-fetches the rows by their PKs, so the ordering is lost. See https://github.com/pulp/pulpcore/blob/main/pulpcore/app/importexport.py#L55-L56
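The effect of re-fetching by PK can be modelled in plain Python. This is a simplified illustration of the behaviour described above, not the pulpcore code itself. `TABLE`, `fetch_by_pks` and `export_in_batches` are hypothetical names, and the sketch assumes that a lookup by a set of PKs with no explicit ordering returns rows in storage order (a real database makes no ordering promise at all in that case).

```python
# Rows in the order the database happens to store them.
TABLE = [
    {"pk": 1, "version": "1.6.0", "is_highest": True},
    {"pk": 2, "version": "1.5.4", "is_highest": False},
    {"pk": 3, "version": "1.5.5", "is_highest": False},
]


def fetch_by_pks(table, pks):
    """Mimic a `pk__in` lookup without ordering: rows come back in storage order."""
    wanted = set(pks)
    return [row for row in table if row["pk"] in wanted]


def export_in_batches(table, ordered_pks, batch_size):
    """Walk the PKs in the requested order, but re-fetch each batch by PK."""
    exported = []
    for start in range(0, len(ordered_pks), batch_size):
        batch = ordered_pks[start:start + batch_size]
        exported.extend(fetch_by_pks(table, batch))
    return exported


# The queryset was ordered so that non-highest rows come first...
ordered_pks = [row["pk"] for row in sorted(TABLE, key=lambda r: r["is_highest"])]
# ...but inside a batch the re-fetch ignores that order.
exported = export_in_batches(TABLE, ordered_pks, batch_size=100)
```

Here `ordered_pks` puts the highest row last, while `exported` starts with it again, which is the ordering that triggers the error.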

      The workaround is to wrap the "export()" method of the model resource to perform the re-ordering.
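As a rough illustration of the re-ordering, the sketch below sorts exported rows so that rows with is_highest false come before the row with is_highest true. It works on plain dictionaries shaped like the JSON excerpt earlier in this report; `order_for_import` is a hypothetical helper, not part of Pulp. That ascending order is the safe one is an inference from the root cause described above (the hack in step 1 sorts descending precisely to trigger the bug).

```python
def order_for_import(rows):
    """Return rows with is_highest "0" ahead of rows with is_highest "1".

    sorted() is stable, so rows that share the same flag keep their relative order.
    """
    return sorted(rows, key=lambda row: row["is_highest"] == "1")


rows = [
    {"namespace": "ansible", "name": "posix", "version": "1.6.0", "is_highest": "1"},
    {"namespace": "ansible", "name": "posix", "version": "1.5.4", "is_highest": "0"},
]
safe_rows = order_for_import(rows)
```

In the model resource itself, the equivalent would presumably be an "export()" override like the one in step 1, but ordering by "is_highest" rather than "-is_highest".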

       

              rhn-engineering-mdellweg Matthias Dellweg
              satellite-jira-automation@redhat.com Satellite Jira-Automation