FlightPath / FLPATH-2644

500 Internal Server Error with 404 Not Found when aborting User Onboarding workflow after "Run again"

      Complete Findings with Screenshots and Technical Details

      Error Reproduction

      Successfully reproduced the 500/404 error using automated Playwright tests, with the following results:

      1. Initial workflow abort: ✅ 200 OK - Successfully aborted
      2. "Run again" workflow abort: ❌ 500 Internal Server Error with 404 Not Found

      Test Execution Details

      API Interception Results

      The test intercepted and validated all abort API calls:

      Attempt 1 (Initial abort): ✅ 200 OK

      • Response: "Workflow instance {id} successfully aborted"

      Attempt 2 (After "Run again"): ❌ 500 Internal Server Error

      Key Technical Insights

      1. Timing Factor: The error occurs on the second attempt (abort issued after a 2-second wait) but not on subsequent attempts with longer waits
      2. State Management Issue: The internal service appears to lose track of workflow instances when using "Run again"
      3. Intermittent Nature: The error is reproducible but does not occur on every run, which suggests a race condition or timing dependency

      API Response Data from Test Execution

      Successful Abort Response (200 OK):

      Request URL: https://backstage-backstage-rhdh-operator.apps.ocp-edge73-0.qe.lab.redhat.com/api/orchestrator/v2/workflows/instances/{id}/abort
      Request Method: DELETE
      Status Code: 200 OK
      Response Body: "Workflow instance {id} successfully aborted"
      

      Failed Abort Response (500 Internal Server Error):

      {
          "error": {
              "name": "Error",
              "message": "HTTP DELETE request to http://user-onboarding.rhdh-operator/management/processes/user-onboarding/instances/df1eacdc-cba3-48f8-adc9-8ac20e65eeb7 failed.\nStatus Code: 404\nStatus Text: Not Found"
          },
          "request": {
              "method": "DELETE",
              "url": "/v2/workflows/instances/df1eacdc-cba3-48f8-adc9-8ac20e65eeb7/abort"
          },
          "response": {
              "statusCode": 500
          }
      }
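
The 500 returned to the browser wraps a 404 from the downstream workflow service, and the downstream status is only available as text inside error.message. A minimal sketch of pulling it out for test assertions follows. The type names, the function name, and the regular expression are assumptions made for illustration; the message layout is inferred from the single captured response above and may differ for other failures.

```typescript
// Only the fields of the orchestrator error body that are used here.
interface AbortErrorBody {
  error: { name: string; message: string };
  request: { method: string; url: string };
  response: { statusCode: number };
}

// Hypothetical result type for the downstream call reported in error.message.
interface UpstreamFailure {
  method: string;
  url: string;
  statusCode: number;
  statusText: string;
}

// Layout assumed from the one captured message:
//   HTTP <METHOD> request to <URL> failed. / Status Code: <N> / Status Text: <TEXT>
const UPSTREAM_FAILURE =
  /^HTTP (\w+) request to (\S+) failed\.\s*Status Code: (\d+)\s*Status Text: (.+)$/;

// Returns the parsed downstream failure, or null if the message has another layout.
function parseUpstreamFailure(body: AbortErrorBody): UpstreamFailure | null {
  const match = UPSTREAM_FAILURE.exec(body.error.message);
  if (match === null) {
    return null;
  }
  return {
    method: match[1],
    url: match[2],
    statusCode: Number(match[3]),
    statusText: match[4].trim(),
  };
}

// The failed response captured in this ticket.
const capturedAbortError: AbortErrorBody = {
  error: {
    name: "Error",
    message:
      "HTTP DELETE request to http://user-onboarding.rhdh-operator/management/processes/user-onboarding/instances/df1eacdc-cba3-48f8-adc9-8ac20e65eeb7 failed.\nStatus Code: 404\nStatus Text: Not Found",
  },
  request: {
    method: "DELETE",
    url: "/v2/workflows/instances/df1eacdc-cba3-48f8-adc9-8ac20e65eeb7/abort",
  },
  response: { statusCode: 500 },
};

// The downstream status (404) differs from the status the browser sees (500).
const upstreamFailure = parseUpstreamFailure(capturedAbortError);
```

A test could assert on upstreamFailure.statusCode to tell this specific failure apart from unrelated 500 responses.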
      

      Network Request Details

      Abort API Call Pattern:

      Error Reproduction Pattern

      Consistent Failure Point:

      • Attempt 1: ✅ 200 OK (immediate abort)
      • Attempt 2: ❌ 500 Internal Server Error (after 2s wait + "Run again")

      This pattern confirms the issue is specifically related to the "Run again" functionality and timing of the abort attempt.

      Test Code Highlights

      The test successfully:

      • Navigates to User Onboarding workflow
      • Fills out workflow form fields
      • Starts workflow execution
      • Aborts running workflow
      • Validates abort success
      • Uses "Run again" button
      • Repeats the entire flow multiple times
      • Intercepts and validates all API responses
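
The interception step can be sketched as a predicate passed to Playwright's page.waitForResponse, which accepts a function that receives each response. This is not the original test code: ResponseLike, AbortAttempt, the helper names, and the URL pattern are assumptions for illustration, and fakeResponse is a stand-in so the helpers can be tried without a browser. The full URL in the example combines the host from the successful request with the instance ID from the failed one.

```typescript
// Structural subset of the response object that Playwright passes to a
// waitForResponse predicate; only the members used here are declared.
interface ResponseLike {
  url(): string;
  status(): number;
  request(): { method(): string };
}

// Matches .../api/orchestrator/v2/workflows/instances/{id}/abort and captures {id}.
const ABORT_PATH =
  /\/api\/orchestrator\/v2\/workflows\/instances\/([^/?#]+)\/abort(?:[?#].*)?$/;

// Predicate suitable for: await page.waitForResponse(isAbortResponse)
function isAbortResponse(response: ResponseLike): boolean {
  return response.request().method() === "DELETE" && ABORT_PATH.test(response.url());
}

// Hypothetical record of one observed abort call.
interface AbortAttempt {
  instanceId: string;
  status: number;
}

// Turns a matching response into a record; returns null for unrelated traffic.
function toAbortAttempt(response: ResponseLike): AbortAttempt | null {
  if (!isAbortResponse(response)) {
    return null;
  }
  const match = ABORT_PATH.exec(response.url());
  return match === null ? null : { instanceId: match[1], status: response.status() };
}

// Stand-in response for trying the helpers without a browser.
function fakeResponse(method: string, url: string, status: number): ResponseLike {
  return { url: () => url, status: () => status, request: () => ({ method: () => method }) };
}

const failedAbort = toAbortAttempt(
  fakeResponse(
    "DELETE",
    "https://backstage-backstage-rhdh-operator.apps.ocp-edge73-0.qe.lab.redhat.com/api/orchestrator/v2/workflows/instances/df1eacdc-cba3-48f8-adc9-8ac20e65eeb7/abort",
    500,
  ),
);
```

In a real test the wait should be started before the abort is confirmed in the modal, so that the response is not missed, and awaited afterwards.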

      Screenshots and Evidence

      The test captured multiple screenshots during execution:

      • Workflow form completion
      • Workflow running state
      • Abort confirmation dialogs
      • Aborted workflow status
      • "Run again" button visibility
      • Error states and confirmations

      Root Cause Analysis

      The 404 error suggests that when using "Run again", the internal service user-onboarding.rhdh-operator cannot locate the workflow instance, possibly due to:

      • Instance ID mismatch between orchestrator and internal service
      • State synchronization issues between services
      • Timing-related race conditions in instance registration
      • Database or cache inconsistencies
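
The first hypothesis can be checked against the captured failure by comparing the instance ID in the orchestrator request path with the one in the downstream URL from the error message. A small sketch follows; instanceIdOf is a hypothetical helper, and both strings are copied from the failed response in this ticket.

```typescript
// Returns the path segment that follows "/instances/" in a URL or path, or null.
function instanceIdOf(urlOrPath: string): string | null {
  const match = /\/instances\/([^/?#]+)/.exec(urlOrPath);
  return match === null ? null : match[1];
}

// Path of the request the browser sent to the orchestrator.
const orchestratorPath =
  "/v2/workflows/instances/df1eacdc-cba3-48f8-adc9-8ac20e65eeb7/abort";

// URL the orchestrator called on the workflow service, taken from error.message.
const upstreamUrl =
  "http://user-onboarding.rhdh-operator/management/processes/user-onboarding/instances/df1eacdc-cba3-48f8-adc9-8ac20e65eeb7";

// True when both sides refer to the same, non-null instance ID.
const orchestratorId = instanceIdOf(orchestratorPath);
const sameInstance = orchestratorId !== null && orchestratorId === instanceIdOf(upstreamUrl);
```

For this one captured request the two IDs are identical, so the orchestrator forwarded the ID it was given. That does not rule out a mismatch introduced earlier, for example if the UI holds an ID that the workflow service has not registered or no longer tracks.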

      Impact Assessment

      • User Experience: Users cannot reliably abort workflows started via "Run again"
      • Workflow Management: Breaks the expected workflow lifecycle
      • Reliability: Creates inconsistent behavior that undermines user trust
      • Support: Generates support tickets and user confusion

       

        1. 0-trace.network (862 kB)
        2. 0-trace.stacks (2 kB)
        3. 0-trace.trace (2.36 MB)
        4. 404_error_encountered_attempt_1.png (152 kB)
        5. abort_confirmation_modal_new_workflow_attempt_1.png (160 kB)
        6. abort_confirmation_modal.png (109 kB)
        7. abort_confirmed_in_modal_new_workflow_attempt_1.png (149 kB)
        8. abort_confirmed_in_modal.png (102 kB)
        9. after_abort_processing.png (82 kB)
        10. after_abort.png (82 kB)
        11. after_clicking_run.png (58 kB)
        12. after_run_again_click_attempt_1.png (86 kB)
        13. before_abort_attempt_1_after_2000ms_wait.png (139 kB)
        14. final_state_before_run_again.png (82 kB)
        15. new_workflow_run_attempt_1.png (66 kB)
        16. next_button_visible_attempt_1.png (86 kB)
        17. results_aborted.png (82 kB)
        18. review_page_after_run_again_attempt_1.png (66 kB)
        19. test.trace (49 kB)
        20. video.webm (1.21 MB)
        21. workflow_status_check.png (91 kB)

              rh-ee-lsoffer Lior Soffer
              gharden1 Gary Harden
              Gary Harden Gary Harden