-
Bug
-
Resolution: Unresolved
-
Critical
-
None
-
quay-v3.10.0
-
False
-
None
-
False
-
Quay Enterprise
When a QuayRegistry is deployed, the operator immediately creates both the app upgrade job and the database deployment. In certain cases, the job starts populating the database before the database reports ready, is killed mid-migration by the kubelet, and is then restarted. Because the database is already partially populated, subsequent migration attempts fail. Example:
~/openshift-4/quay-reproducer# oc get pods
NAME                                   READY   STATUS                  RESTARTS        AGE
quay-clair-postgres-84999868bb-ll99c   1/1     Running                 0               4m38s
quay-quay-app-upgrade-85x9x            0/1     CrashLoopBackOff        5 (90s ago)     4m39s
quay-quay-database-6fcd4c4b5b-w7sw2    1/1     Running                 0               4m37s
quay-quay-mirror-769458bbd-n5znm       0/1     Init:CrashLoopBackOff   5 (107s ago)    4m38s
quay-quay-mirror-769458bbd-tkc8g       0/1     Init:CrashLoopBackOff   5 (111s ago)    4m38s
quay-quay-redis-7f58874b5d-lk8qr       1/1     Running                 0               4m39s

~/openshift-4/quay-reproducer# oc logs quay-quay-app-upgrade-85x9x
...
Entering migration mode to version: head
21:00:18 INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
21:00:18 INFO [alembic.runtime.migration] Will assume non-transactional DDL.
21:00:18 INFO [alembic.runtime.migration] Running upgrade e2894a3a3c19 -> 7a525c68eb13, Add OCI/App models.
Traceback (most recent call last):
  File "/app/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1910, in _execute_context
    self.dialect.do_execute(
  File "/app/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.DuplicateTable: relation "tagkind" already exists

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/bin/alembic", line 8, in <module>
    sys.exit(main())
  ...
  File "/app/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/app/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1910, in _execute_context
    self.dialect.do_execute(
  File "/app/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.DuplicateTable) relation "tagkind" already exists
[SQL:
CREATE TABLE tagkind (
    id SERIAL NOT NULL,
    name VARCHAR(255) NOT NULL,
    CONSTRAINT pk_tagkind PRIMARY KEY (id)
)
]
(Background on this error at: https://sqlalche.me/e/14/f405)
The only recourse is to start the whole procedure over.
Expected behavior: the operator should not create the migration job until the database reports ready.
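The expected ordering can be illustrated with a minimal sketch. This is not the operator's actual code (the operator is not written in Python, and none of these names come from it): `wait_until_ready`, `reconcile_upgrade_job`, `db_is_ready`, and `create_job` are hypothetical names chosen for illustration, and the readiness check and job creation are passed in as plain callables so the gating logic stands alone.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_ready() until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def reconcile_upgrade_job(
    db_is_ready: Callable[[], bool],
    create_job: Callable[[], None],
    **wait_kwargs,
) -> bool:
    """Create the migration job only after the database reports ready.

    Hypothetical stand-in for the operator's reconcile step: db_is_ready
    would wrap a readiness check on the database deployment, and
    create_job would submit the app upgrade job.
    """
    if not wait_until_ready(db_is_ready, **wait_kwargs):
        # The database never became ready, so the job is not created
        # and no migration can be interrupted halfway.
        return False
    create_job()
    return True
```

With this ordering the job is never created while the database is still starting, so the migration runs against a database that has already reported ready.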