Red Hat OpenShift AI Strategic Project / RHOAISTRAT-50

Near Edge deployment within OpenShift AI


    • Type: Feature
    • Resolution: Unresolved
    • Priority: Major
    • Edge
    • RHOAISTRAT-73: OpenShift AI supports the ability to easily deploy any model to any location including edge locations
    • Progress: 49%

      Note: This Feature is currently being refined in https://docs.google.com/document/d/1_8Z4YNn4ua7MLGGp124J-e6_XdnjPZSA-9U2iG-h6ug.

      Feature Overview

Enable MLOps engineers using RHOAI to successfully deploy a model at the near edge, with confidence that it is working properly, that its predictions are correct, and that it is always available.

      Goals

The goal is to provide fully supported model serving capabilities in a distributed topology, specifically with SNO and MicroShift footprints. This means a containerized model, with all the dependencies needed for serving, running in an environment with moderately constrained resources.

      Requirements

      Environment

The target OCP footprints for this capability are:

      • Single Node OpenShift (must-have, based on customer request)
      • MicroShift (nice to have)

      Model serving

      The minimal capabilities for serving the model are:

      1. An operator that fits the desired footprint and installs and configures the components needed at the near-edge node (e.g. ArgoCD for edge, ACM for edge, monitoring, observability, a public API service).
      2. [P0] A supported model runtime based on OpenVINO with PyTorch and TensorFlow.
      3. [P0] Automated builds of the containerized model using OpenShift Pipelines, also supported at the edge node.
      4. Limited management capabilities:
         • [P0] Support for model upgrades: when the desired state of the model changes in git, pull the target version from a local Quay repository and reconcile the desired state against the local k8s API of MicroShift or SNO.
         • Support for model upgrades in environments with limited connectivity (network proxies, disconnected periods, low bandwidth for transferring a container image).
      5. [P0] A public API to retrieve general model information, availability, health, and logs (see the sketch after this list).
      6. [P0] A public API at the core for MLOps engineers to interact with the CI pipelines.
      7. [P0] A public API at the near edge to interact with the management and monitoring capabilities.
      8. [P0] Support for exporting Prometheus metrics for latency and availability of the service (an exporter sketch follows the lessons-learned list in the Background section).
      9. Zero-downtime upgrades (nice to have).
      10. Resource consumption of all the components at the edge should fit ABB resource constraints.
      11. Support for forwarding the metrics collected at the edge node to the core location (nice to have).
      12. Visualization of the metrics in a Prometheus dashboard (nice to have).
      13. Other monitoring metrics (nice to have).
      14. Resource consumption should be as minimal as possible (nice to have).
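      One plausible building block for the [P0] public API items is the TensorFlow-Serving-compatible REST interface that OpenVINO Model Server already exposes. The Python sketch below shows how a client at the near edge could check model availability and request a prediction; the service URL and the model name "defect-detector" are hypothetical placeholders, not part of this feature's agreed design.

        import requests

        # Placeholder values for illustration; the real service location and
        # model name would come from the deployment.
        OVMS_URL = "http://localhost:8080"
        MODEL_NAME = "defect-detector"

        def model_status() -> dict:
            """Ask the server which model versions are loaded and whether they are AVAILABLE."""
            resp = requests.get(f"{OVMS_URL}/v1/models/{MODEL_NAME}", timeout=5)
            resp.raise_for_status()
            return resp.json()

        def predict(instances: list) -> dict:
            """Send a row-format inference request to the :predict endpoint."""
            resp = requests.post(
                f"{OVMS_URL}/v1/models/{MODEL_NAME}:predict",
                json={"instances": instances},
                timeout=5,
            )
            resp.raise_for_status()
            return resp.json()

        if __name__ == "__main__":
            print(model_status())

      A thin management API at the near edge could wrap endpoints like these to add the logs and general-information pieces the requirements call for.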

      Background

      Lessons learned from customers:

      • Deploying models at the near edge is preferred using a GitOps approach.
      • Serving models at the edge is best done with a lightweight model server that runs in an immutable container.
      • Sending the performance metrics back to the core RHOAI instance to be visualized in the dashboard is a differentiator (see the exporter sketch below).
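      As a hedged illustration of the metrics lesson above and of requirement 8, the following sketch uses the prometheus_client library to show how a small sidecar at the edge node could export latency and availability metrics for local scraping, with optional forwarding to the core. The metric names, port, and probed endpoint are assumptions for illustration only.

        import time

        import requests
        from prometheus_client import Gauge, Histogram, start_http_server

        # Illustrative metric names; the real naming scheme is not decided here.
        LATENCY = Histogram(
            "model_probe_latency_seconds",
            "Latency of health probes against the local model server",
        )
        AVAILABLE = Gauge(
            "model_available",
            "1 if the model server answered the last probe, 0 otherwise",
        )

        # Placeholder endpoint, reusing the status URL from the earlier sketch.
        PROBE_URL = "http://localhost:8080/v1/models/defect-detector"

        def probe() -> None:
            """Run one availability probe and record its latency."""
            start = time.monotonic()
            try:
                requests.get(PROBE_URL, timeout=5).raise_for_status()
                AVAILABLE.set(1)
            except requests.RequestException:
                AVAILABLE.set(0)
            finally:
                LATENCY.observe(time.monotonic() - start)

        if __name__ == "__main__":
            # Expose /metrics on port 9100 for a local Prometheus to scrape; that
            # instance could then remote-write the series to the core location.
            start_http_server(9100)
            while True:
                probe()
                time.sleep(30)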

      The user stories for this epic can be found here:

      https://docs.google.com/document/d/16_jl5yk3wDLcmxy3PXtw0sI42Y19T_jbfI5tHn5ZvIY/edit#heading=h.1dc7etmadurz

      Current mockups

      https://docs.google.com/presentation/d/1W9wHeASZsxz4W1UzyN-HfQCqeMhI0dm4IpyxnPBALLo/edit#slide=id.g29607e927c0_0_225

      What is the near edge?

      Typically conventional server hardware or similar—the same that might be deployed in a data center—at the edge of the network. These powerful computers run full-scale server operating systems (typically Linux or Windows) and can be treated in the same way as any other cloud server. If they have access to AI-specific acceleration, it’s likely in the form of GPUs. Some edge servers are sold in ruggedized form factors that are better suited to industrial settings (like a factory floor) than their data center dwelling equivalents.

      The power of edge servers means that they can provide many of the benefits of cloud compute while maintaining the security, privacy, and convenience that comes with keeping data on-site. For some applications, they can provide the best of both worlds—high-capability hardware, low latency, reduced risk of data leakage, and economic use of bandwidth.

      In the context of the Red Hat portfolio, near-edge computing means running workloads in a distributed topology where:

      • Resources are constrained.
      • Some flavor of k8s is available.

      Non-goals

      • Device management
      • Non-OCP environments.
      • The feature should be available from within the OpenShift UI; all the required components for automating, deploying, serving, monitoring, and managing the fleet of edge servers should be integrated with the platform.
      • Multi-tenancy: none of the multi-tenancy features will be available for MicroShift.
      • High availability and upscaling: none of the HA or upscaling features will be available for MicroShift.
      • Multi-cluster management: GitOps on MicroShift will not support managing multiple clusters from a single instance; multi-cluster support severely affects the footprint of any GitOps instance.

       

      This feature is something we want to showcase for Summit 2024. 

       

       

            Landon LaSmith (llasmith@redhat.com)
            Myriam Fentanes (mfentane@redhat.com)
            Votes: 0
            Watchers: 16