  1. Red Hat OpenShift Data Science
  2. RHODS-6273

Fix performance issues in ODH model controller


    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • RHODS_1.20.0_GA
    • Model Serving
    • 5
    • False
    • None
    • False
    • Testable
    • No
    • Yes
    • No
    • Known Issue
    • Proposed
    • No
    • Pending
    • None
    • ML Serving Sprint 1.22, ML Serving Sprint 1.28, ML Serving Sprint 1.29

      The ODH model controller is currently using far too much memory and is getting OOMKilled.

      As part of this ticket, identify what is causing the performance issues in the controller and fix it.

      This may require using tools such as pprof.
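      A minimal sketch of how a heap profile can be captured from a Go process with the standard runtime/pprof package. This is an illustration only, not the controller's actual code: the helper name captureHeapProfile is hypothetical, and a real controller would more likely write the profile to a file or expose it over HTTP.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// captureHeapProfile is a hypothetical helper (not part of the controller).
// It runs a garbage collection first so the profile reflects up-to-date
// allocation statistics, then returns the heap profile in pprof's
// gzip-compressed protobuf format.
func captureHeapProfile() ([]byte, error) {
	runtime.GC()

	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := captureHeapProfile()
	if err != nil {
		fmt.Println("heap profile failed:", err)
		return
	}
	fmt.Printf("captured %d bytes of heap profile\n", len(data))
}
```

      Saved to a file, such a profile can be inspected with `go tool pprof -sample_index=inuse_space <file>`. Comparing snapshots taken at different times is one common way to spot allocations that keep growing.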

        1. heap-1 (30 kB)
        2. heap-2 (45 kB)
        3. heap-3 (53 kB)
        4. heap-4 (54 kB)
        5. image-2022-12-21-14-57-49-563.png (24 kB)
        6. image-2022-12-21-15-31-40-716.png (18 kB)
        7. modelmesh-controller.png (44 kB)
        8. modelmesh-memory-leak.png (19 kB)
        9. odh-model-controller.png (43 kB)
        10. Screenshot from 2023-05-12 16-14-49.png (357 kB)
        11. Screenshot from 2023-05-12 16-14-52.png (353 kB)
        12. Screenshot from 2023-05-12 16-14-55.png (352 kB)

            vajain Vaibhav Jain
            aasthana@redhat.com Anish Asthana
            Tarun Kumar Tarun Kumar
            Votes: 0
            Watchers: 11
