Uploaded image for project: 'Project Quay'
  1. Project Quay
  2. PROJQUAY-206

Repo mirroring sometimes locks up

XMLWordPrintable

    • Utapau [Quay 188]
    • 0

      Client is using 3 mirror workers in their setup. All mirror workers sometimes lock up and mirroring stops on all repos, no matter how many actually are in the queue. In client's case, this happened in a window of 2 hours, 3 mirror instances stopped mirroring without any error indication:

      1st instance:

      2020-01-24 10:29:00,743 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDERR]: time="2020-01-24T10:27:57Z" level=debug msg="GET https://auth.docker.io/token?scope=repository%3Alibrary%2Fphp%3Apull&service=registry.docker.io" 
      
      2020-01-24 10:29:00,743 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDOUT]:         "5-fpm-alpine3.4",
      
      2020-01-24 10:29:00,744 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDERR]: time="2020-01-24T10:27:57Z" level=debug msg="GET https://registry-1.docker.io/v2/library/php/tags/list" 
      
      2020-01-24 10:29:00,744 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDOUT]:         "5-fpm-alpine3.7",
      
      2020-01-24 10:29:19,724 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run
      repomirrorworker stdout | 2020-01-24 10:29:19,724 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run
      2020-01-24 10:29:19,725 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-24 10:29:19 UTC)" skipped: maximum number of running instances reached (1)
      2020-01-24 10:29:19,725 [166] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2020-01-24 10:29:49.724230+00:00 (in 29.999056 seconds)
      repomirrorworker stdout | 2020-01-24 10:29:19,725 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-24 10:29:19 UTC)" skipped: maximum number of running instances reached (1)
      

      2nd instance:

      2020-01-24 12:10:14,126 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDERR]: time="2020-01-24T12:10:11Z" level=debug msg="GET https://auth.docker.io/token?scope=repository%3Ahasura%2Fgraphql-engine%3Apull&service=registry.docker.io" 
      
      2020-01-24 12:10:14,126 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDOUT]:         "pg-12-tests-304abecc",
      
      2020-01-24 12:10:14,126 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDERR]: time="2020-01-24T12:10:12Z" level=debug msg="GET https://registry-1.docker.io/v2/hasura/graphql-engine/tags/list" 
      
      2020-01-24 12:10:14,126 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDOUT]:         "pull1003-3345622",
      
      2020-01-24 12:10:41,149 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run
      2020-01-24 12:10:41,149 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-24 12:10:41 UTC)" skipped: maximum number of running instances reached (1)
      2020-01-24 12:10:41,149 [166] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2020-01-24 12:11:11.148209+00:00 (in 29.998826 seconds)
      repomirrorworker stdout | 2020-01-24 12:10:41,149 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run
      2020-01-24 12:10:41,149 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-24 12:10:41 UTC)" skipped: maximum number of running instances reached (1)
      2020-01-24 12:10:41,149 [166] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2020-01-24 12:11:11.148209+00:00 (in 29.998826 seconds)

      3rd instance:

      2020-01-24 08:29:09,931 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDERR]: time="2020-01-24T08:29:07Z" level=debug msg="GET https://auth.docker.io/token?scope=repository%3Ahasura%2Fgraphql-engine%3Apull&service=registry.docker.io" 
      
      2020-01-24 08:29:09,931 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDOUT]:         "pg-12-tests-304abecc",
      
      2020-01-24 08:29:09,931 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDERR]: time="2020-01-24T08:29:08Z" level=debug msg="GET https://registry-1.docker.io/v2/hasura/graphql-engine/tags/list" 
      
      2020-01-24 08:29:09,931 [166] [DEBUG] [util.repomirror.skopeomirror] Skopeo [STDOUT]:         "pull1003-3345622",
      
      2020-01-24 08:29:10,133 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run
      2020-01-24 08:29:10,134 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-24 08:29:10 UTC)" skipped: maximum number of running instances reached (1)
      2020-01-24 08:29:10,134 [166] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2020-01-24 08:29:40.132613+00:00 (in 29.998825 seconds)
      repomirrorworker stdout | 2020-01-24 08:29:10,133 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run
      2020-01-24 08:29:10,134 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-24 08:29:10 UTC)" skipped: maximum number of running instances reached (1)
      2020-01-24 08:29:10,134 [166] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2020-01-24 08:29:40.132613+00:00 (in 29.998825 seconds)
      

      The only message that is seen after in the logs is (an example):

      2020-01-27 11:18:10,133 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run
      repomirrorworker stdout | 2020-01-27 11:18:10,133 [166] [DEBUG] [apscheduler.scheduler] Looking for jobs to run
      2020-01-27 11:18:10,133 [166] [WARNING] [apscheduler.scheduler] Execution of job "_process_mirrors (trigger: interval[0:00:30], next run at: 2020-01-27 11:18:10 UTC)" skipped: maximum number of running instances reached (1)
      2020-01-27 11:18:10,133 [166] [DEBUG] [apscheduler.scheduler] Next wakeup is due at 2020-01-27 11:18:40.132613+00:00 (in 29.999091 seconds)
      

      Unfortunately, we're not sure what's causing it as there appear to be no other indications of an error. Please check.

            tomckay@redhat.com Thomas Mckay
            rhn-support-ibazulic Ivan Bazulic
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved: