Details
-
Bug
-
Resolution: Duplicate
-
Critical
-
None
-
None
-
False
-
False
-
Quay Enterprise
-
Undefined
-
0
Description
The operator is stuck in "Running database migrations" part:
Name: sfit-registry Namespace: sfit-quay Labels: <none> Annotations: <none> API Version: quay.redhat.com/v1 Kind: QuayRegistry ... Status: Conditions: Last Transition Time: 2021-06-10T14:47:14Z Last Update Time: 2021-06-10T14:47:14Z Message: running database migrations Reason: MigrationsInProgress Status: False Type: Available
We saw that the upgrade pod was never rescaled back so we manually scaled it to 1, we hoped that that would resolve the lock. Unfortunately, because the upgrade pod runs all workers, it has gone onto a loop scanning images (we've seen this behaviour with another client):
namespacegcworker stdout | 2021-07-13 11:25:00,257 [93] [INFO] [apscheduler.executors.default] Running job "QueueWorker.poll_queue (trigger: interval[0:01:00], next run at: 2021-07-13 11:26:00 UTC)" (scheduled at 2021-07-13 11:25:00.257427+00:00) namespacegcworker stdout | 2021-07-13 11:25:00,301 [93] [INFO] [apscheduler.executors.default] Job "QueueWorker.poll_queue (trigger: interval[0:01:00], next run at: 2021-07-13 11:26:00 UTC)" executed successfully securityscanningnotificationworker stdout | 2021-07-13 11:25:03,359 [108] [INFO] [apscheduler.executors.default] Running job "QueueWorker.run_watchdog (trigger: interval[0:01:00], next run at: 2021-07-13 11:26:03 UTC)" (scheduled at 2021-07-13 11:25:03.358899+00:00) securityscanningnotificationworker stdout | 2021-07-13 11:25:03,359 [108] [INFO] [apscheduler.executors.default] Job "QueueWorker.run_watchdog (trigger: interval[0:01:00], next run at: 2021-07-13 11:26:03 UTC)" executed successfully notificationworker stdout | 2021-07-13 11:25:06,068 [95] [INFO] [apscheduler.executors.default] Running job "QueueWorker.poll_queue (trigger: interval[0:00:10], next run at: 2021-07-13 11:25:16 UTC)" (scheduled at 2021-07-13 11:25:06.067696+00:00) notificationworker stdout | 2021-07-13 11:25:06,109 [95] [INFO] [apscheduler.executors.default] Job "QueueWorker.poll_queue (trigger: interval[0:00:10], next run at: 2021-07-13 11:25:16 UTC)" executed successfully securityworker stdout | 2021-07-13 11:25:06,464 [109] [INFO] [apscheduler.executors.default] Running job "SecurityWorker._index_in_scanner (trigger: interval[0:00:30], next run at: 2021-07-13 11:25:36 UTC)" (scheduled at 2021-07-13 11:25:06.463601+00:00) securityworker stdout | 2021-07-13 11:25:06,534 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 150-186 securityworker stdout | 2021-07-13 11:25:06,537 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 247-283 securityworker stdout | 2021-07-13 11:25:06,540 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 192-228 securityworker stdout | 2021-07-13 11:25:06,543 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 186-222 securityworker stdout | 2021-07-13 11:25:06,545 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 228-264 securityworker stdout | 2021-07-13 11:25:06,554 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 336-372 gcworker stdout | 2021-07-13 11:25:06,559 [84] [INFO] [apscheduler.executors.default] Running job "GarbageCollectionWorker._garbage_collection_repos (trigger: interval[0:00:30], next run at: 2021-07-13 11:25:36 UTC)" (scheduled at 2021-07-13 11:25:06.558679+00:00) securityworker stdout | 2021-07-13 11:25:06,565 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 289-325 securityworker stdout | 2021-07-13 11:25:06,568 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 325-361 securityworker stdout | 2021-07-13 11:25:06,570 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 283-319 securityworker stdout | 2021-07-13 11:25:06,573 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 71-107 securityworker stdout | 2021-07-13 11:25:06,585 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 108-144 securityworker stdout | 2021-07-13 11:25:06,588 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 28-64 securityworker stdout | 2021-07-13 11:25:06,590 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 144-180 securityworker stdout | 2021-07-13 11:25:06,593 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 107-143 securityworker stdout | 2021-07-13 11:25:06,596 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 3-39 securityworker stdout | 2021-07-13 11:25:06,598 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 372-393 securityworker stdout | 2021-07-13 11:25:06,601 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 64-100 securityworker stdout | 2021-07-13 11:25:06,601 [109] [INFO] [util.migrate.allocator] No more work securityworker stdout | 2021-07-13 11:25:06,604 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 239-275 securityworker stdout | 2021-07-13 11:25:06,607 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 104-140 securityworker stdout | 2021-07-13 11:25:06,610 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 166-202 securityworker stdout | 2021-07-13 11:25:06,613 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 202-238 securityworker stdout | 2021-07-13 11:25:06,616 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 30-66 securityworker stdout | 2021-07-13 11:25:06,618 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 238-274 securityworker stdout | 2021-07-13 11:25:06,621 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 67-103 securityworker stdout | 2021-07-13 11:25:06,624 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 3-39 securityworker stdout | 2021-07-13 11:25:06,628 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 140-176 securityworker stdout | 2021-07-13 11:25:06,631 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 66-102 securityworker stdout | 2021-07-13 11:25:06,634 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 324-360 securityworker stdout | 2021-07-13 11:25:06,637 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 288-324 securityworker stdout | 2021-07-13 11:25:06,640 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 103-139 securityworker stdout | 2021-07-13 11:25:06,643 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 360-393 securityworker stdout | 2021-07-13 11:25:06,646 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 275-311 securityworker stdout | 2021-07-13 11:25:06,646 [109] [INFO] [util.migrate.allocator] No more work securityworker stdout | 2021-07-13 11:25:06,650 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 242-278 gcworker stdout | 2021-07-13 11:25:06,650 [84] [INFO] [apscheduler.executors.default] Job "GarbageCollectionWorker._garbage_collection_repos (trigger: interval[0:00:30], next run at: 2021-07-13 11:25:36 UTC)" executed successfully securityworker stdout | 2021-07-13 11:25:06,652 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 284-320 securityworker stdout | 2021-07-13 11:25:06,655 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 278-314 securityworker stdout | 2021-07-13 11:25:06,658 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 106-142 securityworker stdout | 2021-07-13 11:25:06,661 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 6-42 securityworker stdout | 2021-07-13 11:25:06,664 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 3-39 securityworker stdout | 2021-07-13 11:25:06,700 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 155-191 securityworker stdout | 2021-07-13 11:25:06,707 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 52-88 securityworker stdout | 2021-07-13 11:25:06,710 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 88-124 securityworker stdout | 2021-07-13 11:25:06,724 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 42-78 securityworker stdout | 2021-07-13 11:25:06,727 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 206-242 securityworker stdout | 2021-07-13 11:25:06,730 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 191-227 securityworker stdout | 2021-07-13 11:25:06,751 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 142-178 securityworker stdout | 2021-07-13 11:25:06,755 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 348-384 securityworker stdout | 2021-07-13 11:25:06,758 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 320-356 securityworker stdout | 2021-07-13 11:25:06,761 [109] [INFO] [util.migrate.allocator] No candidates, marking entire block completed 384-393 securityworker stdout | 2021-07-13 11:25:06,761 [109] [INFO] [util.migrate.allocator] No more work securityworker stdout | 2021-07-13 11:25:06,761 [109] [INFO] [apscheduler.executors.default] Job "SecurityWorker._index_in_scanner (trigger: interval[0:00:30], next run at: 2021-07-13 11:25:36 UTC)" executed successfully buildlogsarchiver stdout | 2021-07-13 11:25:07,256 [79] [INFO] [apscheduler.executors.default] Running job "ArchiveBuildLogsWorker._archive_redis_buildlogs (trigger: interval[0:00:30], next run at: 2021-07-13 11:25:37 UTC)" (scheduled at 2021-07-13 11:25:07.255435+00:00) buildlogsarchiver stdout | 2021-07-13 11:25:07,303 [79] [INFO] [apscheduler.executors.default] Job "ArchiveBuildLogsWorker._archive_redis_buildlogs (trigger: interval[0:00:30], next run at: 2021-07-13 11:25:37 UTC)" executed successfully exportactionlogsworker stdout | 2021-07-13 11:25:13,363 [83] [INFO] [apscheduler.executors.default] Running job "QueueWorker.run_watchdog (trigger: interval[0:01:00], next run at: 2021-07-13 11:26:13 UTC)" (scheduled at 2021-07-13 11:25:13.362870+00:00) exportactionlogsworker stdout | 2021-07-13 11:25:13,363 [83] [INFO] [apscheduler.executors.default] Job "QueueWorker.run_watchdog (trigger: interval[0:01:00], next run at: 2021-07-13 11:26:13 UTC)" executed successfully securityscanningnotificationworker stdout | 2021-07-13 11:25:13,856 [108] [INFO] [apscheduler.executors.default] Running job "QueueWorker.poll_queue (trigger: interval[0:01:00], next run at: 2021-07-13 11:26:13 UTC)" (scheduled at 2021-07-13 11:25:13.855426+00:00) securityscanningnotificationworker stdout | 2021-07-13 11:25:13,911 [108] [INFO] [apscheduler.executors.default] Job "QueueWorker.poll_queue (trigger: interval[0:01:00], next run at: 2021-07-13 11:26:13 UTC)" executed successfully gunicorn-web stdout | 2021-07-13 11:25:14,042 [244] [WARNING] [py.warnings] /usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py:997: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings gunicorn-web stdout | warnings.warn( nginx stdout | 127.0.0.1 () - - [13/Jul/2021:11:25:14 +0000] "GET /v1/_internal_ping HTTP/1.1" 200 4 "-" "python-requests/2.22.0" (0.002 162 0.001) gunicorn-registry stdout | 2021-07-13 11:25:14,045 [240] [INFO] [gunicorn.access] 127.0.0.1 - - [13/Jul/2021:11:25:14 +0000] "GET /v1/_internal_ping HTTP/1.1" 200 4 "-" "python-requests/2.22.0" gunicorn-web stdout | 2021-07-13 11:25:14,050 [244] [WARNING] [py.warnings] /usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py:997: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings gunicorn-web stdout | warnings.warn( nginx stdout | 127.0.0.1 () - - [13/Jul/2021:11:25:14 +0000] "GET /_internal_ping HTTP/1.1" 200 4 "-" "python-requests/2.22.0" (0.002 159 0.001) gunicorn-web stdout | 2021-07-13 11:25:14,052 [246] [INFO] [gunicorn.access] 127.0.0.1 - - [13/Jul/2021:11:25:14 +0000] "GET /_internal_ping HTTP/1.0" 200 4 "-" "python-requests/2.22.0" gunicorn-web stdout | 2021-07-13 11:25:14,053 [244] [INFO] [data.database] Connection pooling disabled for postgresql
Scaling back the upgrade pod deployment to 0 did nothing, the operator is still stuck in the same message:
I0714 06:23:37.310435 1 request.go:655] Throttling request took 1.034055947s, request: GET:https://172.19.192.1:443/apis/template.openshift.io/v1?timeout=32s 2021-07-14T06:23:39.967Z INFO controller-runtime.metrics metrics server is starting to listen {"addr": ":8080"} 2021-07-14T06:23:39.971Z INFO controller-runtime.injectors-warning Injectors are deprecated, and will be removed in v0.10.x 2021-07-14T06:23:39.971Z INFO controller-runtime.injectors-warning Injectors are deprecated, and will be removed in v0.10.x 2021-07-14T06:23:39.971Z INFO controller-runtime.injectors-warning Injectors are deprecated, and will be removed in v0.10.x 2021-07-14T06:23:39.971Z INFO controller-runtime.injectors-warning Injectors are deprecated, and will be removed in v0.10.x 2021-07-14T06:23:39.971Z INFO setup starting server on port 7071 2021-07-14T06:23:39.971Z INFO setup starting manager 2021-07-14T06:23:39.971Z INFO controller-runtime.manager starting metrics server {"path": "/metrics"} 2021-07-14T06:23:39.972Z INFO controller-runtime.manager.controller.quayregistry Starting EventSource {"reconciler group": "quay.redhat.com", "reconciler kind": "QuayRegistry", "source": "kind source: /, Kind="} 2021-07-14T06:23:39.972Z INFO controller-runtime.manager.controller.quayecosystem Starting EventSource {"reconciler group": "redhatcop.redhat.io", "reconciler kind": "QuayEcosystem", "source": "kind source: /, Kind="} 2021-07-14T06:23:40.072Z INFO controller-runtime.manager.controller.quayregistry Starting Controller {"reconciler group": "quay.redhat.com", "reconciler kind": "QuayRegistry"} 2021-07-14T06:23:40.072Z INFO controller-runtime.manager.controller.quayregistry Starting workers {"reconciler group": "quay.redhat.com", "reconciler kind": "QuayRegistry", "worker count": 1} 2021-07-14T06:23:40.072Z INFO controller-runtime.manager.controller.quayecosystem Starting Controller {"reconciler group": "redhatcop.redhat.io", "reconciler kind": "QuayEcosystem"} 2021-07-14T06:23:40.072Z INFO controller-runtime.manager.controller.quayecosystem Starting workers {"reconciler group": "redhatcop.redhat.io", "reconciler kind": "QuayEcosystem", "worker count": 1} 2021-07-14T06:23:40.072Z INFO controllers.QuayRegistry begin reconcile {"quayregistry": "sfit-quay/sfit-registry"} 2021-07-14T06:23:40.072Z INFO controllers.QuayRegistry migrations in progress, skipping reconcile {"quayregistry": "sfit-quay/sfit-registry"}
The question from the client is: can we somehow either remove finalizers or make the operator progress manually? There should be no db migrations anyway since it's an upgrade from 3.5.1 to 3.5.3 and there are no db migrations between z-stream versions.