SRVLOGIC-221 (project: Serverless logic)

Pod instances keep spawning and terminating when deploying the workflow


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 1.32.0
    • 1.32.0
    • serverless-workflow
    • None
    • 2024 Week 07-09 (from Feb 12), 2024 Week 10-12 (from Mar 4), 2024 Week 13-15 (from Mar 25)
    • Important

      When deploying the sonataflow-inference-demo CR, the Kubernetes API server starts spawning multiple pods that enter the Terminating state seconds after they are created. In the end, only one pod remains in the Running state, as expected, but many pods are left in the Terminating state and are not cleaned up. The issue does not depend on the namespace: it happens in both the default namespace and a newly created one.

      Sonataflow-inference-demo repository:

      https://github.com/hbelmiro/sonataflow-inference-pipeline-demo

      Steps to reproduce

      • Deploy the latest SonataFlow operator version (use the code in main, not the latest image)
      • Change the base image referenced in 01-sonataflow-platform.yaml to quay.io/ricardozanini/sonataflow-python-devmode:latest
      • Run the deploy.sh script

      Expected result:

      • One pod is created that eventually reaches the Running/Ready state

      Actual result:

      • Multiple pods are created and enter the Terminating state shortly afterwards, while only one reaches the Running/Ready state.

       

      $ oc get pod
      NAME                        READY   STATUS        RESTARTS   AGE
      pipeline-55bb5b889d-lz22x   0/1     Terminating   0          17s
      pipeline-5c8bd7db86-dvqlw   0/1     Terminating   0          17s
      pipeline-fc9b44cdb-jdwsv    1/1     Running       0          4m17s
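
A rough way to quantify the symptom from the listing above is to tally pods per status and collect the middle segment of each pod name. The sketch below is only illustrative: the function name summarize_pods and the SAMPLE constant are made up for this example, and it assumes the pods are Deployment-managed and therefore named <deployment>-<pod-template-hash>-<suffix>. Three distinct hashes would be consistent with the Deployment's pod template having been changed more than once, although this report does not confirm that as the cause.

```python
from collections import Counter

# Output copied from the `oc get pod` listing in this report.
SAMPLE = """\
NAME                        READY   STATUS        RESTARTS   AGE
pipeline-55bb5b889d-lz22x   0/1     Terminating   0          17s
pipeline-5c8bd7db86-dvqlw   0/1     Terminating   0          17s
pipeline-fc9b44cdb-jdwsv    1/1     Running       0          4m17s
"""


def summarize_pods(table):
    """Return (status counts, pod-template hashes) parsed from `oc get pod` text."""
    statuses = Counter()
    hashes = set()
    for line in table.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed rows
        name, status = fields[0], fields[2]
        statuses[status] += 1
        # Assumption: names follow <deployment>-<pod-template-hash>-<suffix>.
        parts = name.rsplit("-", 2)
        if len(parts) == 3:
            hashes.add(parts[1])
    return statuses, hashes


if __name__ == "__main__":
    statuses, hashes = summarize_pods(SAMPLE)
    print(dict(statuses))
    print(sorted(hashes))
```

In practice the text passed to summarize_pods could be captured from the same command while the workflow is being deployed, so that the number of Terminating pods and distinct hashes can be compared between runs.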

            wmedvede@redhat.com Walter Medvedeo
            wmedvede@redhat.com Walter Medvedeo
            Dominik Hanak Dominik Hanak