-
Bug
-
Resolution: Done
-
Major
-
quay-v3.2.0
Client is using 3 mirror workers in their setup. All mirror workers sometimes lock up and mirroring stops on all repos, no matter how many actually are in the queue. In client's case, this happened in a window of 2 hours, 3 mirror instances stopped mirroring without any error indication:
1st instance:
2020-01-24 10:29:00,743 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDERR]: time="2020-01-24T10:27:57Z" level=debug msg="GET https://auth.docker.io/token?scope=repository%3Alibrary%2Fphp%3Apull&service=registry.docker.io" 2020-01-24 10:29:00,743 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDOUT]: "5-fpm-alpine3.4", 2020-01-24 10:29:00,744 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDERR]: time="2020-01-24T10:27:57Z" level=debug msg="GET https://registry-1.docker.io/v2/library/php/tags/list" 2020-01-24 10:29:00,744 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDOUT]: "5-fpm-alpine3.7", 2020-01-24 10:29:19,724 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run repomirrorworker stdout | 2020-01-24 10:29:19,724 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run 2020-01-24 10:29:19,725 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-24 10:29:19 UTC)" skipped: maximum number of running instances reached (1) 2020-01-24 10:29:19,725 [166] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2020-01-24 10:29:49.724230+00:00 (in 29.999056 seconds) repomirrorworker stdout | 2020-01-24 10:29:19,725 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-24 10:29:19 UTC)" skipped: maximum number of running instances reached (1)
2nd instance:
2020-01-24 12:10:14,126 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDERR]: time="2020-01-24T12:10:11Z" level=debug msg="GET https://auth.docker.io/token?scope=repository%3Ahasura%2Fgraphql-engine%3Apull&service=registry.docker.io" 2020-01-24 12:10:14,126 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDOUT]: "pg-12-tests-304abecc", 2020-01-24 12:10:14,126 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDERR]: time="2020-01-24T12:10:12Z" level=debug msg="GET https://registry-1.docker.io/v2/hasura/graphql-engine/tags/list" 2020-01-24 12:10:14,126 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDOUT]: "pull1003-3345622", 2020-01-24 12:10:41,149 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run 2020-01-24 12:10:41,149 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-24 12:10:41 UTC)" skipped: maximum number of running instances reached (1) 2020-01-24 12:10:41,149 [166] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2020-01-24 12:11:11.148209+00:00 (in 29.998826 seconds) repomirrorworker stdout | 2020-01-24 12:10:41,149 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run 2020-01-24 12:10:41,149 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-24 12:10:41 UTC)" skipped: maximum number of running instances reached (1) 2020-01-24 12:10:41,149 [166] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2020-01-24 12:11:11.148209+00:00 (in 29.998826 seconds)
3rd instance:
2020-01-24 08:29:09,931 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDERR]: time="2020-01-24T08:29:07Z" level=debug msg="GET https://auth.docker.io/token?scope=repository%3Ahasura%2Fgraphql-engine%3Apull&service=registry.docker.io" 2020-01-24 08:29:09,931 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDOUT]: "pg-12-tests-304abecc", 2020-01-24 08:29:09,931 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDERR]: time="2020-01-24T08:29:08Z" level=debug msg="GET https://registry-1.docker.io/v2/hasura/graphql-engine/tags/list" 2020-01-24 08:29:09,931 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDOUT]: "pull1003-3345622", 2020-01-24 08:29:10,133 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run 2020-01-24 08:29:10,134 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-24 08:29:10 UTC)" skipped: maximum number of running instances reached (1) 2020-01-24 08:29:10,134 [166] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2020-01-24 08:29:40.132613+00:00 (in 29.998825 seconds) repomirrorworker stdout | 2020-01-24 08:29:10,133 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run 2020-01-24 08:29:10,134 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-24 08:29:10 UTC)" skipped: maximum number of running instances reached (1) 2020-01-24 08:29:10,134 [166] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2020-01-24 08:29:40.132613+00:00 (in 29.998825 seconds)
The only message that is seen after in the logs is (an example):
2020-01-27 11:18:10,133 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run repomirrorworker stdout | 2020-01-27 11:18:10,133 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run 2020-01-27 11:18:10,133 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-27 11:18:10 UTC)" skipped: maximum number of running instances reached (1) 2020-01-27 11:18:10,133 [166] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2020-01-27 11:18:40.132613+00:00 (in 29.999091 seconds)
Unfortunately, we're not sure what's causing it as there appear to be no other indications of an error. Please check.