• Sub-task
    • Resolution: Done
    • Normal
    • None
    • 6.11.5
    • Pulp
    • 40
    • False
    • False
    • 0
    • Phoenix - Content
    • 40

      Description of problem

      With the "on demand" download policy, if multiple remote URLs exist for similar RPMs (same NEVRA but different checksums), Pulp may pick the wrong remote URL to download the RPM from. Example:

      • Given a third-party repository that re-signed Red Hat RPMs without changing the RPM names.
      • When a customer syncs both the Red Hat repository and the third-party repository containing the same RPM with the on-demand policy.
      • Then content-app may try to serve the content (looked up by its NEVRA) from the wrong remote.
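
      The ambiguity in this example can be illustrated with a minimal sketch. The RemotePackage type, the helper names, the URLs, and the checksums below are hypothetical and only model the situation; they are not Pulp's data model. The sketch groups remote entries by NEVRA and reports those that map to more than one checksum, which is exactly the case where a lookup by NEVRA alone cannot tell the remotes apart.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RemotePackage:
    """Hypothetical stand-in for one remote source of an RPM."""
    nevra: str   # name-epoch:version-release.arch
    sha256: str  # checksum advertised by the repository metadata
    url: str     # where the binary would be downloaded from


def index_by_nevra(packages):
    """Group remote entries by NEVRA, ignoring the checksum."""
    by_nevra = defaultdict(list)
    for pkg in packages:
        by_nevra[pkg.nevra].append(pkg)
    return by_nevra


def ambiguous_nevras(packages):
    """Return the NEVRAs that map to more than one distinct checksum."""
    return {
        nevra: pkgs
        for nevra, pkgs in index_by_nevra(packages).items()
        if len({p.sha256 for p in pkgs}) > 1
    }


# Made-up example data: the same NEVRA offered by two remotes.
packages = [
    RemotePackage("bash-0:5.1.8-6.el9.x86_64", "aaa111", "https://cdn.example/bash.rpm"),
    RemotePackage("bash-0:5.1.8-6.el9.x86_64", "bbb222", "https://third-party.example/bash.rpm"),
    RemotePackage("zlib-0:1.2.11-40.el9.x86_64", "ccc333", "https://cdn.example/zlib.rpm"),
]
conflicts = ambiguous_nevras(packages)
```

      Here conflicts contains only the bash entry, with both URLs as candidates for the same NEVRA.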

      Implemented Solution

      Besides this specific case, there are multiple scenarios where a RemoteArtifact (RA) won't contain the right binary.
      Pulp can't completely prevent corrupted RemoteArtifacts from existing, because they are by definition outside Pulp's control.

      Because of that, and after a lot of team discussion, I decided to take a more generic approach and improve how content-app handles these edge cases.

      In summary:

      • Before: A corrupted RA would make it impossible to get the content, even if there were good RAs in the system.
      • After: The first request may fail if it picks the bad RA, but subsequent requests will be able to pick the good one.

      In more detail:

      • Given Pulp has a corrupted RemoteArtifact for some Content (something Pulp can't completely avoid with on-demand)
      • And there is also a good RemoteArtifact
      • When a customer requests this Content and content-app chooses the corrupted one
      • Then content-app will stream until it detects that the checksum is wrong, and then close the connection
      • When the customer requests the Content again within a configurable cooldown time
      • Then content-app will ignore the failed RemoteArtifacts and be able to hit the good one
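
      The scenario above can be sketched roughly as follows. This is a simplified model under stated assumptions, not Pulp's implementation: the RemoteArtifact class, its failed_at field, the COOLDOWN_SECONDS value, and the serve helper are hypothetical names chosen for illustration, and the payload attribute stands in for whatever the remote would actually send.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

COOLDOWN_SECONDS = 300  # hypothetical value; the ticket only says it is configurable


class DigestMismatch(Exception):
    """Raised when the served bytes do not match the expected checksum."""


@dataclass
class RemoteArtifact:
    """Hypothetical model of one remote source for a piece of Content."""
    url: str
    payload: bytes                     # stands in for what the remote serves
    failed_at: Optional[float] = None  # time of the last checksum failure


def candidates(ras, now, cooldown=COOLDOWN_SECONDS):
    """Skip RemoteArtifacts that failed within the cooldown window."""
    return [
        ra for ra in ras
        if ra.failed_at is None or now - ra.failed_at >= cooldown
    ]


def serve(ras, expected_sha256, now):
    """Serve from the first candidate; mark it as failed on a mismatch."""
    available = candidates(ras, now)
    if not available:
        # Assumption: the ticket does not say what happens in this case.
        raise LookupError("no usable RemoteArtifact")
    ra = available[0]
    if hashlib.sha256(ra.payload).hexdigest() != expected_sha256:
        ra.failed_at = now  # remembered so the next request skips this RA
        raise DigestMismatch(ra.url)
    return ra.payload
```

      With a corrupted RA listed first and a good RA second, the first call to serve raises DigestMismatch and records failed_at, while a second call inside the cooldown window skips the corrupted RA and returns the good payload. Once the cooldown has elapsed, the corrupted RA becomes eligible again in this sketch.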

      Related PRs:

      Other approaches considered

      • Don't stream content directly:
        • Pros: the problem is that by the time Pulp can tell the digest is wrong, it has already streamed the content to the client. This approach would let Pulp start sending data only after it knows the content is good.
        • Cons: potential problems with hanging connections (the client could time out) and overall time overhead in the happy case (when the RA is good)
      • Redesign on-demand in Pulp:
        • Pros: there are a number of related RA issues, and the Pulp team hoped to fix all of them with a better design
        • Cons: the team discussed this at length and could not find a good candidate for a better design (nor does it have the capacity for a change of this magnitude)
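
      The trade-off behind the first alternative (streaming directly versus holding data back) can be shown with a small sketch. Both functions and the DigestMismatch exception are hypothetical names for illustration and are not taken from Pulp's code. The streaming variant forwards every chunk before the digest is known, while the buffering variant sends nothing until the digest is confirmed.

```python
import hashlib
from typing import Iterable, Iterator


class DigestMismatch(Exception):
    """Raised when the received bytes do not match the expected checksum."""


def stream_and_verify(chunks: Iterable[bytes], expected_sha256: str) -> Iterator[bytes]:
    """Forward each chunk immediately; a mismatch is only known at the end."""
    hasher = hashlib.sha256()
    for chunk in chunks:
        hasher.update(chunk)
        yield chunk  # at this point the chunk is already on its way to the client
    if hasher.hexdigest() != expected_sha256:
        raise DigestMismatch("checksum mismatch detected after streaming")


def buffer_and_verify(chunks: Iterable[bytes], expected_sha256: str) -> bytes:
    """Hold everything back until the digest is confirmed."""
    data = b"".join(chunks)  # the client waits while this completes
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        raise DigestMismatch("checksum mismatch detected before sending")
    return data
```

      With stream_and_verify, a client consuming a corrupted download has received every chunk by the time the exception is raised. With buffer_and_verify, the client receives nothing from a corrupted download, but it also receives nothing from a good one until the whole file has been fetched and hashed, which is the source of the timeout and overhead concerns listed above.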

       

              rh-ee-pbrochad Pedro Brochado
              satellite-jira-automation@redhat.com Satellite Jira-Automation

                  Estimated: 0m
                  Remaining: 0m
                  Logged: 2h