Red Hat Developer Hub Bugs / RHDHBUGS-2036

SonataFlow data-index/jobs-service pods remain in CrashLoopBackOff state for 10-15 minutes at initial deploy


      = Resolved SonataFlow Pod Crash Issue

      In the new release, a timing problem affecting SonataFlow database provisioning during RHDH 1.7 installation with Orchestrator plugins has been addressed. This issue caused SonataFlow pods to repeatedly enter the `CrashLoopBackOff` state, leading to delays and potential confusion for users. With this update, SonataFlow pods no longer enter the `CrashLoopBackOff` state because of the database provisioning delay; they now start promptly, eliminating unnecessary wait times and improving the user experience.
    • Bug Fix
    • Done
    • RHDH Install 3280
    • Critical

      Description of problem:

      During the installation of the Red Hat Developer Hub (RHDH) 1.7 operator with Orchestrator plugins enabled, the sonataflow-data-index and sonataflow-jobs-service pods repeatedly enter a CrashLoopBackOff state for approximately 10 to 15 minutes.

      This issue occurs specifically when the SonataFlowPlatform custom resource is generated as a dependency of the dynamic plugins. The root cause appears to be a timing issue: the SonataFlow database is not provisioned until the primary Backstage pod is running. This leaves the dependent SonataFlow pods unable to connect to their database on startup, and they remain in CrashLoopBackOff until a retry succeeds after the Backstage pod has stabilized.
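
      To illustrate the dependency, here is a minimal sketch of the SonataFlowPlatform persistence block involved, assuming the sonataflow.org/v1alpha08 API; the PostgreSQL secret and service names below are illustrative placeholders, not the exact resources RHDH generates:

          apiVersion: sonataflow.org/v1alpha08
          kind: SonataFlowPlatform
          metadata:
            name: sonataflow-platform
          spec:
            services:
              dataIndex:
                persistence:
                  postgresql:
                    # The database credentials and service are provisioned only once the
                    # primary Backstage pod is running, so they do not exist yet when the
                    # data-index and jobs-service pods first start.
                    secretRef:
                      name: backstage-psql-secret      # illustrative placeholder
                      userKey: POSTGRES_USER
                      passwordKey: POSTGRES_PASSWORD
                    serviceRef:
                      name: backstage-psql             # illustrative placeholder
                      databaseName: sonataflow
              jobService:
                persistence:
                  postgresql:
                    secretRef:
                      name: backstage-psql-secret      # illustrative placeholder
                      userKey: POSTGRES_USER
                      passwordKey: POSTGRES_PASSWORD
                    serviceRef:
                      name: backstage-psql             # illustrative placeholder
                      databaseName: sonataflow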

       

      The user experience is the primary concern here: 10-15 minutes of CrashLoopBackOff gives the perception that something is wrong, and it is long enough that an end user may take corrective action that is not actually required.

       

      Prerequisites (if any, like setup, operators/versions):

      Steps to Reproduce

      • Deploy the RHDH operator 1.7 and create a Backstage resource
      • Enable the Orchestrator plugins with the sonataflow dependency (see the configuration sketch after these steps):
          dependencies:
            - ref: sonataflow
      • Monitor the sonataflow-data-index, sonataflow-jobs-service, and Backstage pods
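
      As a sketch of the second step, the Orchestrator plugin entries are enabled in the RHDH dynamic plugins configuration together with the sonataflow dependency. The package paths and the exact nesting of the dependencies key are assumptions for illustration; only the `dependencies: - ref: sonataflow` fragment comes from the reproduction steps above:

          includes:
            - dynamic-plugins.default.yaml
          plugins:
            # Package paths are illustrative; substitute the Orchestrator plugin
            # entries shipped with RHDH 1.7.
            - package: ./dynamic-plugins/dist/backstage-plugin-orchestrator-backend-dynamic
              disabled: false
              dependencies:
                - ref: sonataflow   # triggers creation of the SonataFlowPlatform CR
            - package: ./dynamic-plugins/dist/backstage-plugin-orchestrator
              disabled: false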

      Actual results:

      SonataFlow pods remain in CrashLoopBackOff for 10-15 minutes, which gives the appearance that something is wrong.

      Expected results:

      SonataFlow pods do not enter CrashLoopBackOff, or at most enter it only briefly before they begin running.

      Reproducibility (Always/Intermittent/Only Once):

      100%

      Build Details:

      Additional info (Such as Logs, Screenshots, etc):
