  1. Project Quay
  2. PROJQUAY-6140

Auto-Pruning Policies for Repositories


    • Feature
    • Resolution: Done
    • Major
    • None
    • quay-v3.10.0
    • quay
    • None
    • BU Product Work
    • False
    • None
    • False
    • Green
    • 0% To Do, 0% In Progress, 100% Done

      Goal:

      To provide auto-pruning at a granular level, targeting one or more repositories in an organization, to help Quay organization owners stay below the storage quota by automatically pruning content based on user-defined criteria in the form of policies.

      Why is this important?

      Not only has this been a top ask from customers, but auto-pruning policies also play a crucial role in managing repository size and optimizing storage utilization. By defining these policies at the organization level, we can ensure consistency and streamline the process across multiple repositories. This helps in reducing storage costs, improving performance, and maintaining a clean and manageable codebase. It can also delight users by automating the removal of outdated images.

      Relation to Storage Quota Limits

      • Auto-pruning policies are decoupled from any storage quota limit the organization has applied. Users cannot expect auto-pruning to keep them below the quota limit automatically; that is still their responsibility.
      • The benefit is that, with auto-pruning enabled, the need for user intervention due to quota limits being reached is drastically lowered (albeit not entirely ruled out).

      Desired outcomes:

      • As customer requested: Investigate how we can create a mechanism to prune by pull date as per the autopruning roadmap
      • Policies can be written by whoever has creator access permissions on a given repository. 
      • Repositories with auto-pruning policies attached to them will have auto-pruning occurring in the background at regular intervals without user intervention (this needs to be scalable to Quay.io dimensions)
      • When auto-pruning actually removes images, this can be tracked by the repo owner using the usage logs and (if time machine is enabled) using the tag history view

      User Scenarios:

      • As an organization administrator or repo creator, I want to define auto-pruning policies to remove outdated or unused content from repositories automatically. The policy can be based on two criteria: tag age and/or number of tags.
      • As an organization administrator or repo creator, I want to configure as part of the policy whether pruned tags go to time machine first or are deleted immediately, to help accelerate space savings / staying below quota.
      • As an organization administrator or repo creator, I can only configure one policy per repository.
      • As an organization administrator, I can use time machine to restore pruned tags within the time machine window defined at the organization level, if time machine wasn't disabled as part of the policy.
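
      The scenarios above imply a small policy record plus a one-policy-per-repository rule. The following is a minimal sketch of that shape, not Quay's actual data model: the names `PruneMethod`, `PrunePolicy`, `PolicyStore`, and `use_time_machine` are hypothetical, and it assumes each policy carries a single criterion.

```python
from dataclasses import dataclass
from enum import Enum


class PruneMethod(Enum):
    # Hypothetical identifiers for the two criteria described in the scenarios.
    TAG_COUNT = "tag_count"        # keep at most `value` tags
    TAG_AGE_DAYS = "tag_age_days"  # remove tags older than `value` days


@dataclass(frozen=True)
class PrunePolicy:
    method: PruneMethod
    value: int
    # Assumption: False means pruned tags skip time machine and are deleted immediately.
    use_time_machine: bool = True

    def __post_init__(self):
        if self.value <= 0:
            raise ValueError("policy value must be a positive integer")


class PolicyStore:
    """In-memory store that allows at most one policy per repository."""

    def __init__(self):
        self._by_repo = {}

    def create(self, repo, policy):
        if repo in self._by_repo:
            raise ValueError(f"repository {repo!r} already has a policy")
        self._by_repo[repo] = policy

    def get(self, repo):
        return self._by_repo.get(repo)


store = PolicyStore()
store.create("myorg/app", PrunePolicy(PruneMethod.TAG_AGE_DAYS, 30))
```

      A second `store.create("myorg/app", ...)` call raises `ValueError`, which mirrors the one-policy-per-repository scenario.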

      The policy could either be:

      1) If the repo has more than X tags, delete tags until the repo has no more than X tags, starting with the oldest tags first by tag creation date.

      2) If the repo has tags that are older than Y amount of days (using tag creation date as the start), delete them.
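
      The two rules above can be sketched as pure selection functions that return the tags to prune. This is an illustration only: the `Tag` type and both function names are hypothetical and not taken from the Quay codebase.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class Tag(NamedTuple):
    name: str
    created: datetime


def tags_over_limit(tags, max_tags):
    """Rule 1: return the oldest tags in excess of max_tags."""
    excess = len(tags) - max_tags
    if excess <= 0:
        return []
    # Sort oldest first by creation date and take only the surplus.
    return sorted(tags, key=lambda t: t.created)[:excess]


def tags_older_than(tags, max_age_days, now):
    """Rule 2: return tags created more than max_age_days before now."""
    cutoff = now - timedelta(days=max_age_days)
    return [t for t in tags if t.created < cutoff]


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
# v0 is 0 days old, v1 is 10, v2 is 20, v3 is 30, v4 is 40.
tags = [Tag(f"v{i}", now - timedelta(days=i * 10)) for i in range(5)]

print([t.name for t in tags_over_limit(tags, 3)])        # ['v4', 'v3']
print([t.name for t in tags_older_than(tags, 25, now)])  # ['v3', 'v4']
```

      Passing `now` explicitly keeps the age rule deterministic, which makes it easier to cover in automated tests.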

      The MVP can implement both policies as options that users can configure; depending on which sees the most opt-in, we can keep that policy in future iterations.

      Example

      Users can configure an auto-deletion policy in the container registry based on two criteria:

      1. Auto-Pruning Tags Older Than 30 Days:
      A user specifies a pruning policy that will prune tags older than 30 days. Auto-pruning runs regularly in the background to catch the point in time when there are tags that are older than 30 days. Once this is detected, the tags older than 30 days get deleted. Since the pruner runs asynchronously in the background it is possible that for short periods of time tags older than 30 days are still in the repo.

      2. Auto-Pruning when there are more than 100 tags:
      A user specifies a pruning policy that will prune tags when a threshold of 100 tags in the repository is crossed. Once this is detected, the auto-pruning logic will sort all the tags in the repo by creation date, starting with the oldest, and remove as many tags as necessary until there are 100 tags left in the repository. Since the pruner runs asynchronously in the background, it is possible that for short periods of time more than 100 tags are in the repository.
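
      To illustrate why the limit can be exceeded for short periods, here is a toy simulation of a periodic pruner. It is a sketch under simplifying assumptions: one tag is pushed per tick, the pruner runs every `prune_every` ticks, and the function name and parameters are hypothetical.

```python
def simulate(pushes, max_tags, prune_every):
    """Push one tag per tick; prune down to max_tags every prune_every ticks.

    Returns the tag count observed after each tick.
    """
    tags = []    # oldest first, because tags are appended in push order
    counts = []
    for tick in range(1, pushes + 1):
        tags.append(f"build-{tick}")
        if tick % prune_every == 0 and len(tags) > max_tags:
            tags = tags[-max_tags:]  # drop the oldest, keep the newest max_tags
        counts.append(len(tags))
    return counts


counts = simulate(pushes=12, max_tags=5, prune_every=4)
print(counts)  # [1, 2, 3, 4, 5, 6, 7, 5, 6, 7, 8, 5]
```

      Between pruner runs the count climbs above the limit of 5 (up to 8 here) and drops back to 5 on the next run, so a shorter interval narrows the overshoot at the cost of more frequent background work.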

      These two policy options provide more flexibility and control over the container image retention process, allowing users to define either time-based or quantity-based criteria over tags.

      This helps optimize storage space by automatically deleting outdated or unnecessary container images, while still preserving the more recent content as defined by users. It is up to the user to define the pruning criteria suitable to their definition of "recent content that is still in use or might still be useful". This is a tradeoff until we can support policies that take into account when a tag was last pulled.

      Acceptance Criteria:

      1. As a developer, I can specify a policy that deletes tags older than a specified time-based criterion.
      2. As a developer, I can specify a policy that deletes tags using a specified tag-number-based criterion.
      3. As a developer, I can ensure that the auto-pruning process does not delete content that falls outside the defined policies, and I have a backup in time machine in the event of accidental deletion.
      4. Continuous Integration (CI): Tests should be automated to verify the pruning mechanism and ensure code integrity.
      5. Release Technical Enablement: Provide necessary release enablement details and documentation related to the auto-pruning feature functionality.

      Previous Work (Optional):

      • Researching and identifying existing auto-pruning tools and techniques.
      • Analyzing common usage patterns within repos and organizations to set efficient policies that optimize users' workflows on Quay.

      Open Questions:

      1. What are suitable intervals at which the auto-pruning logic should execute in the background?
      2. Are there any specific compliance or regulatory requirements to consider while implementing auto-pruning policies?

       

              hgovinda Harish Govindarajulu
              doconnor@redhat.com Dave O'Connor
              Eric Rich Eric Rich
