Red Hat 3scale API Management / THREESCALE-11508

Applications being duplicated when 3scale Operator fails to reconcile Product


    • RHOAM Sprint 67, RHOAM Sprint 68, RHOAM Sprint 69

      Issue description:

      The Application CR currently has two issues:

      1. An Application CR cannot be created with suspend: true. Doing so causes an endless loop: the operator creates the application, hits a nil pointer in the operator code, and restarts, which repeats the process.
      The nil pointer is triggered here: https://github.com/3scale/3scale-operator/blob/master/controllers/capabilities/applications.go#L49

      Proposed solution:
      The operator should update the Application CR with the annotation "applicationID" as soon as the create application call is made here: https://github.com/3scale/3scale-operator/blob/master/controllers/capabilities/application_threescale_reconciler.go#L84. The operator should also check for the presence of that annotation on the CR and compare it against the existing 3scale application IDs when confirming that the application exists in 3scale. If the application does not exist, the operator should re-create it and update both the annotation and status.applicationID (which does not happen currently).
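
The existence check proposed above could be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: the appRecord type and the knownApplicationID and existsIn3scale helpers are hypothetical names, and the sketch assumes the annotation stores the ID as a decimal string.

```go
package main

import "strconv"

// applicationIDAnnotation is the annotation key proposed in the issue.
const applicationIDAnnotation = "applicationID"

// appRecord is a hypothetical, trimmed-down view of an Application CR.
type appRecord struct {
	Annotations map[string]string
	StatusID    *int64 // mirrors status.applicationID; nil when unset
}

// knownApplicationID returns the ID recorded on the CR, preferring the
// annotation and falling back to status.applicationID.
func knownApplicationID(cr appRecord) (int64, bool) {
	if raw, ok := cr.Annotations[applicationIDAnnotation]; ok {
		if id, err := strconv.ParseInt(raw, 10, 64); err == nil {
			return id, true
		}
	}
	if cr.StatusID != nil {
		return *cr.StatusID, true
	}
	return 0, false
}

// existsIn3scale reports whether the recorded ID matches one of the
// application IDs currently found in 3scale.
func existsIn3scale(cr appRecord, remoteIDs []int64) bool {
	id, ok := knownApplicationID(cr)
	if !ok {
		return false
	}
	for _, remote := range remoteIDs {
		if remote == id {
			return true
		}
	}
	return false
}
```

Because the annotation lives in metadata rather than status, this lookup would still find the ID in a scenario where the status block has been cleared, which is the case the original report describes.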

      2. The Application CR causes the application to be duplicated if status.applicationID has no counterpart in 3scale. If an application is created through the CR and then deleted in the UI or API, duplication starts. To confirm that the application exists in 3scale, the operator compares status.applicationID with the application IDs found in 3scale. If the ID is not found, the operator creates the application but never updates the status.applicationID field.
      As a result, on every reconcile the operator expects the old applicationID to be in 3scale, even though it has already confirmed that the ID is missing and created a new application.

      Solution:
      The operator must always update status.applicationID when it creates a new application on behalf of the CR. Also consider point 1 for additional confirmation.
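
A rough sketch of a reconcile step that records the new ID after re-creating the application is shown below. The appState type, the createFunc callback, and ensureApplication are hypothetical stand-ins for illustration only; the real operator works with Kubernetes objects and the 3scale API client.

```go
package main

import "strconv"

// appState is a hypothetical, trimmed-down view of an Application CR.
type appState struct {
	Annotations map[string]string
	StatusID    *int64 // mirrors status.applicationID; nil when unset
}

// createFunc stands in for the 3scale "create application" call and
// returns the ID of the newly created application.
type createFunc func() (int64, error)

// ensureApplication creates the application only when the recorded ID is
// absent from 3scale, then stores the new ID in both the annotation and
// the status so that the next reconcile finds it.
func ensureApplication(cr *appState, remoteIDs []int64, create createFunc) (bool, error) {
	if cr.StatusID != nil {
		for _, id := range remoteIDs {
			if id == *cr.StatusID {
				return false, nil // already present, nothing to do
			}
		}
	}
	newID, err := create()
	if err != nil {
		return false, err
	}
	if cr.Annotations == nil {
		cr.Annotations = map[string]string{}
	}
	cr.Annotations["applicationID"] = strconv.FormatInt(newID, 10)
	cr.StatusID = &newID
	return true, nil
}
```

With the ID written back, a second call against the updated list of remote IDs returns without creating anything, which is the idempotent behaviour the fix is aiming for.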

      *Below is the original issue description from the source Jira: https://issues.redhat.com/browse/THREESCALE-11014*

      Issue description:

      In 3scale 2.14, if the 3scale Operator (tested with version 0.11.11) is uninstalled and then reinstalled, but cannot connect to the Admin Portal because of a self-signed or invalid certificate, any Application CRs enter an error state and the applicationID field is removed from the Application CR. Once the certificate issue is corrected in the operator, the operator ends up creating a duplicate application.

      How to reproduce:

      1. Create a [^developersecret.yaml], [^developeruser.yaml], [^developeraccount.yaml], [^product.yaml] and a [^application.yaml];
      2. Wait for the operator to sync all CRs, and check the applicationID assigned to the application:
      $ oc get application/testapplication -o yaml
      apiVersion: capabilities.3scale.net/v1beta1
      kind: Application
      ...
      status:
        applicationID: 10
        conditions:
        - lastTransitionTime: "2024-05-01T17:08:36Z"
          status: "True"
          type: Ready
        observedGeneration: 2
        providerAccountHost: https://3scale-admin.apps.example.com
        state: live
      
      3. In the Admin Portal, the created application is visible in the product.
      4. Remove and reinstall the 3scale Operator;
      5. Wait for the operator to start and check the application status:
      $ oc get application/testapplication -o yaml
      apiVersion: capabilities.3scale.net/v1beta1
      kind: Application
      ...
      status:
        conditions:
        - lastTransitionTime: "2024-05-01T17:16:30Z"
          message: 'spec.productCRName: Invalid value: v1.LocalObjectReference{Name:"testproduct"}:
            productCR name doesnt have a valid product reference'
          status: "False"
          type: Ready
        observedGeneration: 2
      
      6. Note that the applicationID vanished from the Application CR;
      7. Reinject the Admin Portal certificate into the 3scale Operator to allow the operator to manage the custom resources;
      8. Wait for the operator to reconcile the custom resources and check the Application CR again:
      $ oc get application/testapplication -o yaml
      apiVersion: capabilities.3scale.net/v1beta1
      kind: Application
      ...
      status:
        applicationID: 11
        conditions:
        - lastTransitionTime: "2024-05-01T17:18:11Z"
          status: "True"
          type: Ready
        observedGeneration: 2
        providerAccountHost: https://3scale-admin.apps.example.com
        state: live
      9. Note that the applicationID changed from 10 to 11;
      10. In the Admin Portal, the duplicated application is now visible.

      Workaround:

      If the Custom Resource objects have "insecure_skip_verify: true" under metadata -> annotations, the operator does not remove the applicationID from the Application CR and works correctly.

              Unassigned Unassigned
              mstoklus_rhmi Michal Stokluska
              Michal Stokluska Michal Stokluska