Project Quay / PROJQUAY-3933

Blobs don't get removed when local storage is used

      When local storage is used, it appears that garbage collection of blobs is not working. I pushed one image to an otherwise empty Quay instance (fresh installation) and set the tag expiration to a few seconds. This is what my uploadedblob table looks like:

      MariaDB [quay_local_storage]> select * from uploadedblob;
      +----+---------------+---------+---------------------+---------------------+
      | id | repository_id | blob_id | uploaded_at         | expires_at          |
      +----+---------------+---------+---------------------+---------------------+
      |  1 |             1 |       1 | 2022-06-09 11:32:00 | 2022-06-09 12:32:00 |
      |  2 |             1 |       2 | 2022-06-09 11:32:09 | 2022-06-09 12:32:09 |
      |  3 |             1 |       3 | 2022-06-09 11:32:24 | 2022-06-09 12:32:24 |
      |  4 |             1 |       4 | 2022-06-09 11:32:24 | 2022-06-09 12:32:24 |
      +----+---------------+---------+---------------------+---------------------+
      4 rows in set (0.000 sec)
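
      For reference, a query along these lines (assuming the schema shown above; I did not run this exact statement) lists the uploaded blobs that are already past their expiry time:

      -- Hypothetical helper query, not part of the original output: list uploadedblob
      -- rows whose expiry time has already passed (timestamps are in UTC).
      SELECT id, repository_id, blob_id, uploaded_at, expires_at
      FROM uploadedblob
      WHERE expires_at < UTC_TIMESTAMP();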
      

      After the upload, I deleted the tag to see what would happen. This is the deletion log:

      gunicorn-web stdout | 2022-06-09 11:33:42,249 [244] [INFO] [gunicorn.access] 172.24.0.1 - - [09/Jun/2022:11:33:42 +0000] "DELETE /api/v1/repository/ibazulic/quay/tag/v3.7.1 HTTP/1.0" 204 0 "https://quay.skynet/repository/ibazulic/quay?tab=tags" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:101.0) Gecko/20100101 Firefox/101.0"
      

      The deletion happened at 11:33:42 UTC. According to the uploadedblob table, all blobs should have expired by 12:32:24. GC of the tag itself started at 11:42:37:

      gcworker stdout | 2022-06-09 11:42:37,504 [71] [DEBUG] [apscheduler.scheduler] Looking for jobs to run
      gcworker stdout | 2022-06-09 11:42:37,504 [71] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2022-06-09 11:43:07.490424+00:00 (in 29.986004 seconds)
      gcworker stdout | 2022-06-09 11:42:37,504 [71] [INFO] [apscheduler.executors.default] Running job "GarbageCollectionWorker._garbage_collection_repos (trigger: interval[0:00:30], next run at: 2022-06-09 11:43:07 UTC)" (scheduled a
      t 2022-06-09 11:42:37.490424+00:00)
      gcworker stdout | 2022-06-09 11:42:37,504 [71] [DEBUG] [peewee] ('SELECT DISTINCT `t1`.`removed_tag_expiration_s` FROM `user` AS `t1` LIMIT %s', [100])
      gcworker stdout | 2022-06-09 11:42:37,507 [71] [DEBUG] [peewee] ('SELECT `candidates`.`repository_id` FROM (SELECT DISTINCT `t1`.`repository_id` FROM `tag` AS `t1` INNER JOIN `repository` AS `t2` ON (`t1`.`repository_id` = `t2`.`
      id`) INNER JOIN `user` AS `t3` ON (`t2`.`namespace_user_id` = `t3`.`id`) WHERE ((((NOT (`t1`.`lifetime_end_ms` IS %s) AND (`t1`.`lifetime_end_ms` <= %s)) AND (`t3`.`removed_tag_expiration_s` = %s)) AND (`t3`.`enabled` = %s)) AND
      (`t2`.`state` != %s)) LIMIT %s) AS `candidates` ORDER BY Rand() LIMIT %s OFFSET %s', [None, 1654774957507, 0, True, 3, 500, 1, 0])
      ...
      gcworker stdout | 2022-06-09 11:42:37,625 [71] [DEBUG] [data.model.storage] Garbage collecting storages from candidates: {1, 2, 3, 4}
      ...
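
      Reformatted for readability, with the logged parameters substituted in (my own expansion of the logged statement, not additional worker output), the candidate-selection query is roughly:

      -- Candidate-selection query from the gcworker log above, parameters filled in.
      SELECT candidates.repository_id FROM (
          SELECT DISTINCT t1.repository_id
          FROM tag AS t1
          INNER JOIN repository AS t2 ON (t1.repository_id = t2.id)
          INNER JOIN user AS t3 ON (t2.namespace_user_id = t3.id)
          WHERE t1.lifetime_end_ms IS NOT NULL
            AND t1.lifetime_end_ms <= 1654774957507   -- 2022-06-09 11:42:37 UTC in epoch ms
            AND t3.removed_tag_expiration_s = 0
            AND t3.enabled = TRUE
            AND t2.state != 3
          LIMIT 500
      ) AS candidates ORDER BY Rand() LIMIT 1 OFFSET 0;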
      

      The garbage collection run ended at 11:42:37.667:

      ...
      gcworker stdout | 2022-06-09 11:42:37,667 [71] [DEBUG] [util.locking] Released lock REPO_GARBAGE_COLLECTION_1
      gcworker stdout | 2022-06-09 11:42:37,667 [71] [DEBUG] [data.database] Disconnecting from database.
      gcworker stdout | 2022-06-09 11:42:37,667 [71] [INFO] [apscheduler.executors.default] Job "GarbageCollectionWorker._garbage_collection_repos (trigger: interval[0:00:30], next run at: 2022-06-09 11:43:07 UTC)" executed successfully
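
      At this point, a check along the following lines (based on the columns referenced by the logged query above; not output I captured) would confirm that the deleted tag has lifetime_end_ms set, which is the condition the candidate query keys on:

      -- Hypothetical check, not part of the original output: show tags in the
      -- repository that have been marked as expired (lifetime_end_ms is set).
      SELECT id, lifetime_end_ms
      FROM tag
      WHERE repository_id = 1 AND lifetime_end_ms IS NOT NULL;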
      

      I then waited for an hour to see if the blobs would be removed once their expiry time was reached. There is no indication in the logs that this happened. In fact, I can still see them in the uploadedblob table over an hour later, and I can see them in the du output as well:

      root@cyberdyne:/storage/local-storage# du -h
      0       ./uploads
      4.0K    ./sha256/54
      75M     ./sha256/f7
      187M    ./sha256/51
      8.0K    ./sha256/c6
      261M    ./sha256
      261M    .
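
      To tie what is left on disk back to the database, a join along these lines (assuming blob_id in uploadedblob references the imagestorage table, which is what the "Garbage collecting storages" log line above suggests) should list the expired-but-still-present blobs together with their digests:

      -- Hypothetical cross-check, not part of the original output: map leftover
      -- uploadedblob rows to their blob digests, which should match the sha256/*
      -- directories still present in local storage.
      SELECT ub.id, ub.expires_at, s.content_checksum
      FROM uploadedblob AS ub
      JOIN imagestorage AS s ON s.id = ub.blob_id
      WHERE ub.expires_at < UTC_TIMESTAMP();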
      

      Can you please check why GC is not happening?
