  1. Project Quay
  2. PROJQUAY-1323

Quay TNG Operator reconcile causes the Quay DB pod to fail to start

      Description:

      This is a new issue found after deploying Quay with the V3.4 TNG Operator, with Clair and Mirror disabled in the QuayRegistry CR. The config editor was then used to enable Mirror and reconfigure Quay. At that point the Quay Mirror pod was not provisioned, so "oc edit" was used to update the QuayRegistry CR to enable Clair and Mirror. As a result, the Quay PostgreSQL DB failed to start with the error message "ERROR:  tuple already updated by self". The Quay Operator logs are attached.

      Note: QE reproduced this issue twice, in different OCP namespaces.

       

      lizhang@lzha-mac quay3.4 % oc get pod
      NAME                                             READY   STATUS                  RESTARTS   AGE
      quaydf1122-clair-f7fbcdcb7-px8q5                 1/1     Running                 0          52m
      quaydf1122-clair-postgres-86fbcff7-2222b         1/1     Running                 1          52m
      quaydf1122-quay-config-editor-5f7f567bf9-vpctg   1/1     Running                 0          52m
      quaydf1122-quay-database-5bb757b58c-ccxtc        0/1     CrashLoopBackOff        14         52m
      quaydf1122-quay-mirror-567cf8d475-zs6c9          0/1     Init:CrashLoopBackOff   10         52m
      quaydf1122-quay-postgres-init-l47jj              0/1     Completed               0          98m
      quaydf1122-quay-redis-5bb897f584-78b5r           1/1     Running                 0          52m
      lizhang@lzha-mac quay3.4 % oc logs quaydf1122-quay-database-5bb757b58c-ccxtc
      pg_ctl: another server might be running; trying to start server anyway
      waiting for server to start....2020-11-20 09:33:37.680 UTC [22] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
      2020-11-20 09:33:37.682 UTC [22] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
      2020-11-20 09:33:37.744 UTC [22] LOG:  redirecting log output to logging collector process
      2020-11-20 09:33:37.744 UTC [22] HINT:  Future log output will appear in directory "log".
       done
      server started
      /var/run/postgresql:5432 - accepting connections
      => sourcing /usr/share/container-scripts/postgresql/start/set_passwords.sh ...
      ERROR:  tuple already updated by self

       Quay config Bundle:

      lizhang@lzha-mac quay3.4 % cat config.yaml
      SUPER_USERS:
        - quay
        - admin
      DISTRIBUTED_STORAGE_DEFAULT_LOCATIONS:
        - default
      DISTRIBUTED_STORAGE_PREFERENCE:
        - default
      DISTRIBUTED_STORAGE_CONFIG:
        default:
          - S3Storage
          - s3_bucket: quay340
            storage_path: /quay20201120
            s3_access_key: ***
            s3_secret_key: ***
            host: s3.us-east-2.amazonaws.com
      

      Quay Registry CR:

      lizhang@lzha-mac quay3.4 % cat quayregistry.yaml
      apiVersion: quay.redhat.com/v1
      kind: QuayRegistry
      metadata:
        name: quaydf1122
      spec:
        configBundleSecret: test-config-bundle
        components:
          - kind: objectstorage
            managed: false
          - kind: clair
            managed: false
          - kind: mirror
            managed: false
      

      The following shows the pod status at each stage:

      #1: Deploy Quay without Clair and Mirror
      lizhang@lzha-mac quay3.4 % oc create -f quayregistry.yaml
      quayregistry.quay.redhat.com/quaydf1122 created
      lizhang@lzha-mac quay3.4 % oc get pod
      NAME                                             READY   STATUS      RESTARTS   AGE
      quaydf1122-quay-app-8647d6c6-psljp               1/1     Running     0          8m1s
      quaydf1122-quay-config-editor-5bfc8f4f95-24tv4   1/1     Running     0          8m1s
      quaydf1122-quay-database-7d74958464-dqq9f        1/1     Running     0          9m44s
      quaydf1122-quay-postgres-init-l47jj              0/1     Completed   0          9m44s
      quaydf1122-quay-redis-95f78fcf5-twj52            1/1     Running     0          9m44s
      
      #2: Use the config editor to enable Mirror, App Registry, and Docker support, then trigger Reconfigure Quay
      lizhang@lzha-mac quay3.4 % oc get pod
      NAME                                             READY   STATUS      RESTARTS   AGE
      quaydf1122-quay-app-8b677b8d6-zfn7l              1/1     Running     0          86s
      quaydf1122-quay-config-editor-6554679d95-rhbnf   1/1     Running     0          86s
      quaydf1122-quay-database-7d74958464-dqq9f        1/1     Running     0          44m
      quaydf1122-quay-postgres-init-l47jj              0/1     Completed   0          44m
      quaydf1122-quay-redis-95f78fcf5-twj52            1/1     Running     0          44m
      
      #3: Use oc edit to update the CR to enable Clair and Mirror
      lizhang@lzha-mac quay3.4 % oc edit quayregistry quaydf1122
      quayregistry.quay.redhat.com/quaydf1122 edited
      NAME                                             READY   STATUS             RESTARTS   AGE
      quaydf1122-clair-f7fbcdcb7-px8q5                 1/1     Running            0          11m
      quaydf1122-clair-postgres-86fbcff7-2222b         1/1     Running            1          11m
      quaydf1122-quay-config-editor-5f7f567bf9-vpctg   1/1     Running            0          11m
      quaydf1122-quay-database-5bb757b58c-ccxtc        0/1     CrashLoopBackOff   6          11m
      quaydf1122-quay-mirror-567cf8d475-zs6c9          0/1     Init:0/1           4          11m
      quaydf1122-quay-postgres-init-l47jj              0/1     Completed          0          57m
      quaydf1122-quay-redis-5bb897f584-78b5r           1/1     Running            0          11m
      lizhang@lzha-mac quay3.4 % oc logs quaydf1122-quay-database-5bb757b58c-ccxtc
      pg_ctl: another server might be running; trying to start server anyway
      waiting for server to start....2020-11-20 08:21:38.985 UTC [22] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
      2020-11-20 08:21:38.988 UTC [22] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
      2020-11-20 08:21:39.049 UTC [22] LOG:  redirecting log output to logging collector process
      2020-11-20 08:21:39.049 UTC [22] HINT:  Future log output will appear in directory "log".
       done
      server started
      /var/run/postgresql:5432 - accepting connections
      => sourcing /usr/share/container-scripts/postgresql/start/set_passwords.sh ...
      ERROR:  tuple already updated by self

      Quay Operator image/Quay Postgresql DB image:

      lizhang@lzha-mac quay3.4 % oc get pod quay-operator-87d686fcc-t9wzr -n openshift-operators -o json | jq '.spec.containers[0].image'
      "registry.redhat.io/quay/quay-rhel8-operator@sha256:1458927c89382c452b9603dd8325972e7e8a6e81230e98033cd9f5d7f4a2308c"
      
      lizhang@lzha-mac quay3.4 % oc get pod quaydf1122-quay-database-5bb757b58c-ccxtc -o json | jq '.spec.containers[0].image'
      "registry.redhat.io/rhel8/postgresql-10@sha256:612e867d9e2b2be4cd6787b54e20c3c128471d725f44ccaf60d8806f8bfa5de8"
      

       Steps:

      1. Deploy the Quay V3.4 TNG Operator
      2. Create a QuayRegistry CR with Clair and Mirror disabled, using AWS S3 as external registry storage
      3. Log in to Quay and create a new image repository, organization, and robot account
      4. Open the Quay config editor and enable Mirror, App Registry, and Docker support
      5. Click Reconfigure Quay
      6. Wait until the new Quay pod is ready and the old Quay pod has terminated successfully, then run "oc edit quayregistry quaydf1122", set "managed" to true for both clair and mirror, and save the change
      7. Wait for the TNG Operator to reconcile
      8. Check the status of all pods
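
Step 6 was performed interactively with "oc edit". For repeated reproduction, the same CR change could probably be expressed as a JSON merge patch. The sketch below only builds the patch body. The function name managed_patch is hypothetical, and it assumes objectstorage stays unmanaged as in the original CR. This report has not verified that applying the change with "oc patch" triggers the same reconcile behavior as "oc edit".

```shell
# Hypothetical helper: print a JSON merge patch that flips clair and mirror
# to managed while leaving objectstorage unmanaged (as in the original CR).
# A merge patch replaces the whole components list, so all three entries
# are spelled out.
managed_patch() {
  cat <<'EOF'
{"spec":{"components":[
  {"kind":"objectstorage","managed":false},
  {"kind":"clair","managed":true},
  {"kind":"mirror","managed":true}
]}}
EOF
}

# Applying it needs a logged-in cluster session, so it is left as a comment:
#   oc patch quayregistry quaydf1122 --type=merge -p "$(managed_patch)"
```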

      Expected Results:

      All pods are in ready status, including Quay, Quay DB, Clair, Mirror, etc.
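
The expected result can be checked mechanically against the "oc get pod" table. The following is a minimal sketch, not part of the original report: the function name not_ready_pods is hypothetical, and it assumes the default column order (NAME, READY, STATUS, ...) shown in the listings above. Pods in Completed status, such as the postgres-init job, are treated as healthy.

```shell
# Hypothetical helper: read `oc get pod` table output on stdin and print the
# name of every pod whose READY count is incomplete, ignoring finished jobs.
not_ready_pods() {
  awk '$1 != "NAME" && NF >= 3 {
    split($2, ready, "/")          # READY column, e.g. "0/1"
    if ($3 != "Completed" && ready[1] != ready[2]) print $1
  }'
}

# Usage (needs a logged-in cluster session):
#   oc get pod | not_ready_pods
```

An empty result would correspond to the expected outcome; for the stage #3 listing it should name the database and mirror pods.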

      Actual Results:

      The Quay DB pod failed to start.

       

              rhn-coreos-amerdler Alec Merdler (Inactive)
              lzha1981 luffy zhang