Red Hat OpenShift Data Science / RHODS-147

Create notebook server environment



      Data Science users begin model development by creating or importing Jupyter notebooks. A notebook contains code for operations such as data access, data preparation, feature selection, model training, and validation. When starting a new project, Data Science users need to be able to create and configure the notebook server environment that serves as the foundation for all notebooks in that environment.

      Requirements for notebook server environment creation:

      1. P0: The system must let users select the desired notebook image from a list of default notebook images (defined in the epic for Support notebook images). This image serves as the starting point for notebooks in the environment.
      2. P0: The system must let users specify the desired number of OSD-provided NVIDIA-based GPUs, based on what is available at the cluster level (i.e., a list of GPU counts rather than free-form entry). Note: at a minimum, these values should be configurable.
      3. P0: The system must let users select the container size in terms of available CPU and memory resources. The system must support a list of available sizes.
      4. P0: The system must provide clear error messages for all error conditions and enable users to act quickly to recover from and resolve them.
      5. P1: The system must support cloning a GitHub repository into the environment. This makes all files in the specified git repository available in the environment.
      6. P0: The system must support connecting the notebook server to AWS S3 storage so notebooks can access data in S3. The system must capture the connection details (e.g., AWS credentials) so users do not have to re-enter them each time they want to access S3 data from a notebook.
      7. P2: The system must support connecting the notebook server to AWS RDS (Relational Database Service) so notebooks can access relational data.
      8. P2: The system must support connecting the notebook server to AWS Redshift so notebooks can access columnar data.
      9. P1: The system must support connecting the notebook server to any installed service in MODH (e.g., Red Hat Managed Kafka, Starburst) so notebooks can use these services.
      10. P2: The system must automatically populate the endpoints of connected services into environment variables so they can be used in notebooks.
      11. P0: The system must support connecting the notebook server to a GitHub repository so that file changes made in MODH can be pushed to git and changes in git can be pulled into the MODH environment.
      12. P2: The system must let a user create multiple concurrent notebook servers. Users may need to work on several projects at the same time, and running them all on the same server could cause resource issues (e.g., insufficient memory).
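
To make requirements 1 through 4 concrete, here is a minimal Python sketch of how a notebook server request could be checked against fixed lists and how validation failures could be reported as actionable messages. It is an illustration only: the image names, GPU choices, size names, and the `NotebookServerRequest` and `validate_request` identifiers are all hypothetical and do not come from this issue or from any existing RHODS API.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical catalogs. In a real deployment these would be read from
# cluster-level configuration rather than hard-coded.
NOTEBOOK_IMAGES: List[str] = ["minimal-python", "standard-data-science"]
GPU_CHOICES: List[int] = [0, 1, 2]
CONTAINER_SIZES: Dict[str, Dict[str, int]] = {
    "small": {"cpu": 1, "memory_gib": 4},
    "medium": {"cpu": 2, "memory_gib": 8},
    "large": {"cpu": 4, "memory_gib": 16},
}


@dataclass(frozen=True)
class NotebookServerRequest:
    """What a user picks when creating a notebook server (hypothetical shape)."""
    image: str
    gpus: int
    size: str


def validate_request(req: NotebookServerRequest) -> List[str]:
    """Return one actionable message per problem; an empty list means valid."""
    errors: List[str] = []
    if req.image not in NOTEBOOK_IMAGES:
        errors.append(
            f"Unknown notebook image '{req.image}'. "
            f"Choose one of: {', '.join(NOTEBOOK_IMAGES)}."
        )
    if req.gpus not in GPU_CHOICES:
        choices = ", ".join(str(n) for n in GPU_CHOICES)
        errors.append(
            f"GPU count {req.gpus} is not available. Choose one of: {choices}."
        )
    if req.size not in CONTAINER_SIZES:
        errors.append(
            f"Unknown container size '{req.size}'. "
            f"Choose one of: {', '.join(CONTAINER_SIZES)}."
        )
    return errors
```

Collecting every problem before returning, rather than stopping at the first one, lets the UI show all fixes a user needs to make in a single pass.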

       

      Questions/considerations:

      • For #3: start with preconfigured sizes.
      • For #4: we would like to provide some type of UI label or documentation to surface this capability beyond generic environment variables.
      • For #5: the intent is that these are user-specific credentials. Only the user who entered the credentials may access them; there must be no backdoor through which users inadvertently share personal credentials with other users.
      • For #10: need to research what is feasible today (e.g., JL plugins).
      • Question: how would a user change the PV size?
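
As a rough illustration of the requirement to populate connected-service endpoints into environment variables, the Python sketch below derives a variable name from each service name. The `<NAME>_ENDPOINT` naming convention, the `endpoint_env_vars` function, and the example service name and address are assumptions made for this sketch; the issue does not define a naming scheme.

```python
import re
from typing import Dict


def endpoint_env_vars(services: Dict[str, str]) -> Dict[str, str]:
    """Map each connected service to a <NAME>_ENDPOINT environment variable.

    `services` maps a service name to its endpoint. The naming convention
    used here is hypothetical.
    """
    env: Dict[str, str] = {}
    for name, endpoint in services.items():
        # Replace runs of non-alphanumeric characters with a single
        # underscore so the result is a valid environment variable name.
        key = re.sub(r"[^A-Za-z0-9]+", "_", name).strip("_").upper()
        env[f"{key}_ENDPOINT"] = endpoint
    return env


# Example with a made-up service name and address.
example = endpoint_env_vars({"managed-kafka": "kafka.example.com:9092"})
```

Inside a notebook, code would then read the value with `os.environ.get("MANAGED_KAFKA_ENDPOINT")`. A predictable naming rule like this is one way to document the capability in the UI instead of exposing only generic variables.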

              jkoehler@redhat.com Jacqueline Koehler
              jdemoss@redhat.com Jeff DeMoss
              Luca Giorgi Luca Giorgi