Hybrid Cloud Console / RHCLOUD-23753

Clowder returns incorrect not-ready status, blocking deployments

    • Blocker

      Clowder in stage appears to be misrepresenting the state of a deployment in cloudigrade-stage. Here it reports that the ClowdApp is not ready:

      ❯ oc get ClowdApp/postigrade
      NAME         READY   MANAGED   ENVNAME   AGE
      postigrade   0       1         stage     442d

      But here it says it's ready:

      ❯ oc get ClowdApp/postigrade -o json | jq .status.ready
      true

      This is blocking cloudigrade's release process. The qontract reconcile tekton pipeline appears to be stuck in a loop waiting forever for the ready status:

      [2023-01-11 20:17:21] [INFO] [openshift_base.py:validate_realized_data:868] - ['validating', 'crcs02ue1', 'cloudigrade-stage', 'ClowdApp', 'postigrade']
      [2023-01-11 20:17:21] [INFO] [openshift_base.py:validate_realized_data:928] - ClowdApp has deployments that are not ready (0 ready / 1 total)
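
      The mismatch can be illustrated with a minimal sketch of a readiness check that looks at deployment counters rather than the boolean flag. This is not the actual qontract-reconcile code: the function name is hypothetical, and the status field names `deployments.readyDeployments` and `deployments.managedDeployments` are assumptions about where the READY and MANAGED columns come from.

```python
def clowdapp_not_ready_reason(status):
    """Return a reason string if the counters disagree, else None.

    Hypothetical helper; the field names below are assumptions.
    """
    deployments = status.get("deployments") or {}
    ready = deployments.get("readyDeployments", 0)
    managed = deployments.get("managedDeployments", 0)
    if ready != managed:
        return (
            "ClowdApp has deployments that are not ready "
            f"({ready} ready / {managed} total)"
        )
    return None


# A status shaped like the one in this issue: the boolean flag says ready,
# but the counters still show 0 of 1 deployments ready.
stale_status = {
    "ready": True,
    "deployments": {"readyDeployments": 0, "managedDeployments": 1},
}
print(clowdapp_not_ready_reason(stale_status))
```

      A check written this way never consults `status.ready`, so a pipeline that retries until the counters match would keep waiting for as long as the counters stay stale.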

      psavage@redhat.com confirmed in Slack that this is a bug in Clowder:

      So yeh - we made a fix a few weeks back which only wrote the status of a resource when there was an actual change. The problem with this was the routine that detects if there has been a change didn't know about the deployment stats changes we made.
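
      The failure mode described in that message can be sketched as follows. This is an illustrative model only, not Clowder's implementation (Clowder is written in Go); every function name and the status layout here are hypothetical. It assumes a change detector that compares only the `ready` flag, so an update to the deployment counters alone never triggers a status write.

```python
import copy


def status_changed_buggy(stored, computed):
    # Hypothetical detector that only knows about the "ready" flag.
    return stored.get("ready") != computed.get("ready")


def status_changed_fixed(stored, computed):
    # Compares the whole status, including the deployment counters.
    return stored != computed


def reconcile_status(stored, computed, status_changed):
    """Write the computed status only when the detector reports a change."""
    if status_changed(stored, computed):
        return copy.deepcopy(computed)
    return stored


# Stored status has stale counters; the freshly computed one is up to date.
stored = {"ready": True, "deployments": {"readyDeployments": 0, "managedDeployments": 1}}
computed = {"ready": True, "deployments": {"readyDeployments": 1, "managedDeployments": 1}}

print(reconcile_status(stored, computed, status_changed_buggy)["deployments"])
print(reconcile_status(stored, computed, status_changed_fixed)["deployments"])
```

      With the narrow detector the write is skipped and the stored counters remain at 0 of 1; comparing the full status lets the updated counters through.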

      Slack discussion thread: https://redhat-internal.slack.com/archives/C022YV4E0NA/p1673468424641509

      Pete's initial proposed fix: https://github.com/RedHatInsights/clowder/pull/752

            psavage@redhat.com Peter Savage
            bradsmith_rh Brad Smith