Red Hat OpenShift Data Science: RHODS-4621

UI front end for Serving Models in ODH core



      Requirement 1

      P0: Users must be able to configure a server for the model
      P1: Specify the target platform configuration (e.g. compute resources: CPU, memory, GPU) for served models
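As a rough illustration of Requirement 1, the sketch below translates a UI size selection into a server resource spec. The preset names, resource keys, and the `build_server_config` helper are assumptions for illustration, not the actual RHODS/ODH serving API.

```python
# Hypothetical sketch only: preset names and resource keys are assumptions,
# not the real RHODS serving API.

SIZE_PRESETS = {
    "small": {"cpu": "1", "memory": "4Gi"},
    "medium": {"cpu": "4", "memory": "8Gi"},
    "large": {"cpu": "8", "memory": "16Gi"},
}

def build_server_config(name: str, size: str, gpus: int = 0) -> dict:
    """Translate a UI size selection into a resource spec for a model server."""
    if size not in SIZE_PRESETS:
        raise ValueError(f"unknown size preset: {size!r}")
    resources = dict(SIZE_PRESETS[size])
    if gpus:
        # GPUs are requested as whole devices under an extended resource name.
        resources["nvidia.com/gpu"] = str(gpus)
    return {"name": name, "resources": resources}
```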


      Requirement 2

      P0: Model storage: users must be able to deploy a model stored in an S3 location
      P0: Model frameworks: users must be able to serve models based on a variety of frameworks
      P0: Ability to serve models not developed in RHODS
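A minimal sketch of the input validation Requirement 2 implies: an S3 storage location plus a framework choice. The framework list and request field names here are assumptions, not the actual RHODS deployment API.

```python
# Illustrative only: the framework set and request fields are assumptions.
SUPPORTED_FRAMEWORKS = {"onnx", "tensorflow", "pytorch", "sklearn"}

def build_deploy_request(model_name: str, framework: str, s3_uri: str) -> dict:
    """Validate user input and build a deployment request for a model in S3."""
    if framework not in SUPPORTED_FRAMEWORKS:
        raise ValueError(f"unsupported framework: {framework!r}")
    if not s3_uri.startswith("s3://"):
        raise ValueError("model storage must be an S3 location (s3://...)")
    return {"name": model_name, "framework": framework, "storage_uri": s3_uri}
```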


      Requirement 3

      P0: Ability to view list of deployed models for a project

      • Ability to access endpoint
      • Ability to view monitoring and performance metrics

      P0: The system must indicate the health (up/down status) of endpoints for deployed models
      P1: Support multi-model serving; ability to serve multiple models on one server
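The health indication above could reduce to a small mapping from a probe result to a coarse state for the UI. The state names and HTTP status groupings below are illustrative assumptions, not a defined RHODS behavior.

```python
from typing import Optional

def endpoint_health(status_code: Optional[int]) -> str:
    """Map an HTTP probe result to a coarse health state for the UI.

    Assumption-level sketch: None means the probe could not connect to the
    endpoint at all; the groupings are illustrative.
    """
    if status_code is None:
        return "unreachable"
    if 200 <= status_code < 300:
        return "up"
    if status_code in (502, 503, 504):
        return "down"
    return "degraded"
```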

      Requirement 4

      P0: Users must be able to easily retrieve the endpoint for a served model (to use for inference, either testing or incorporating into an app)

      • P0: Users must be able to secure endpoints so they are not publicly available: Authentication & authorization capabilities
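To illustrate Requirement 4, the sketch below builds an authenticated inference call with Python's standard library. The endpoint URL, bearer-token scheme, and payload shape are placeholders, not the actual RHODS endpoint contract.

```python
import json
import urllib.request

def inference_request(endpoint: str, token: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated POST request against a served model's endpoint.

    Illustrative sketch: the caller would pass the Request to
    urllib.request.urlopen(); the auth scheme is an assumption.
    """
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```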


      Requirement 5

      P0: Ability to view global list of all deployed models (across all projects)

      • Filtering / search capabilities
      • Users view all models deployed within projects they have access to; admins view all models
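The access rule and search behavior for the global list could be sketched as below. The record fields ("project", "name") and the substring-search semantics are assumptions for illustration.

```python
def visible_models(models: list, user_projects: set, is_admin: bool = False,
                   search: str = "") -> list:
    """Return the deployed models a user may see, with optional name search.

    Illustrative sketch: field names and matching rules are assumptions.
    """
    visible = [m for m in models if is_admin or m["project"] in user_projects]
    if search:
        needle = search.lower()
        visible = [m for m in visible if needle in m["name"].lower()]
    return visible
```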


      Requirement 6

      P1: Ability to delete a model


      Requirement 7

      P0: Manually add a new version of a served model and deploy it (replacing the previous version)
      P0: Edit the model server
      P1: Deploy a new version of the model that coexists with the previous one (multiple deployed endpoints) (TODO: review)
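The replace-versus-coexist distinction in Requirement 7 can be shown with a small in-memory sketch; the data structure is an illustrative assumption, not how RHODS tracks deployments.

```python
def deploy_version(deployments: dict, model: str, version: str,
                   replace: bool = True) -> dict:
    """Record a new served version of a model.

    Illustrative sketch: deployments maps model name -> list of live versions.
    replace=True swaps out any previous version (the P0 case); replace=False
    keeps the old version serving alongside the new one (the P1 case).
    """
    live = deployments.setdefault(model, [])
    if replace:
        live.clear()
    live.append(version)
    return deployments
```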


       

      Requirement 8 (Targeted for 1.21)

      P0: Inference performance metrics. Users must be able to access performance metrics for all deployed models

      • P0: Inference performance - latency (avg. time to process 1 input)
      • P0: Target metrics for v1:
        • Avg. response time over a period of time (e.g. last 24 hours, or last week/month to gauge trends over time)
        • Number of requests over defined period of time (including option for all time)
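The two v1 target metrics reduce to one aggregation over a trailing window. The sample format below, a list of (timestamp, latency in seconds) pairs, is an assumption about what a metrics backend would return, used only to illustrate the computation.

```python
from datetime import datetime, timedelta

def window_metrics(samples, now, window=None):
    """Aggregate (timestamp, latency_seconds) samples for the metrics view.

    Returns (avg_latency, request_count) over the trailing window; pass
    window=None for the all-time count. Sample format is an illustrative
    assumption, not the actual RHODS metrics API.
    """
    if window is None:
        recent = list(samples)
    else:
        cutoff = now - window
        recent = [(t, lat) for t, lat in samples if t >= cutoff]
    count = len(recent)
    avg = sum(lat for _, lat in recent) / count if count else None
    return avg, count
```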


       

              lferrnan@redhat.com Lucas Fernandez Aragon
              jkoehler@redhat.com Jacqueline Koehler