Red Hat OpenShift Data Science / RHODS-5001

Admin user can add unsupported toleration to notebook pods; spawning is impossible until toleration is changed


    • Release Notes
      Admin users could add invalid tolerations to notebook pods
      An admin user could add invalid tolerations on the *Cluster settings* page without triggering an error. If an invalid toleration was added, users could not start notebooks. The toleration key is now checked in real time. If an invalid toleration name is entered, an error message states that valid toleration names consist of alphanumeric characters, '-', '_', or '.', and must start and end with an alphanumeric character.
    • Documented as Resolved Issue
    • RHODS 1.17

      Description of problem:

      In the new admin UI for setting notebook tolerations (RHODS-3624), an admin user can add an unsupported toleration (e.g. "TestToleration, AndAnotherKey"), and the setting is saved as is.
      After this setting has been saved, spawning notebooks becomes impossible: the modal shows no progress in the loading bar, and no error message is shown to the user.
      The logs of the `notebook-controller-deployment` show this error:

      1.6617843718307014e+09	DEBUG	events	Warning	{"object": {"kind":"Notebook","namespace":"rhods-notebooks","name":"jupyter-nb-ldap-2dadmin4","uid":"d30182b5-c9b5-4d66-8815-9359b6484a20","apiVersion":"kubeflow.org/v1beta1","resourceVersion":"334720"}, "reason": "FailedCreate", "message": "Reissued from statefulset/jupyter-nb-ldap-2dadmin4: create Pod jupyter-nb-ldap-2dadmin4-0 in StatefulSet jupyter-nb-ldap-2dadmin4 failed error: Pod \"jupyter-nb-ldap-2dadmin4-0\" is invalid: spec.tolerations[0].key: Invalid value: \"TestToleration, AndAnotherKey\": name part must consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyName',  or 'my.name',  or '123-abc', regex used for validation is '([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]')"}

      The important part is:

      Pod \"jupyter-nb-ldap-2dadmin4-0\" is invalid: spec.tolerations[0].key: Invalid value: \"TestToleration, AndAnotherKey\": name part must consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyName', or 'my.name', or '123-abc', regex used for validation is '([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]') 
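
To pull the reason and message out of a controller log line like the one above without reading the raw JSON by eye, something along these lines may help. It is a minimal sketch that assumes the line layout seen in the excerpt (four tab-separated fields followed by a JSON payload); `parse_controller_event` is a hypothetical helper name, and `sample` is a shortened stand-in for the real log line, not actual controller output.

```python
import json


def parse_controller_event(line: str) -> dict:
    """Split a tab-separated log line and decode its trailing JSON payload.

    Assumes the layout: timestamp, level, logger, event type, JSON payload.
    """
    _timestamp, level, _logger, event_type, payload = line.split("\t", 4)
    data = json.loads(payload)
    return {
        "level": level,
        "type": event_type,
        "reason": data.get("reason"),
        "message": data.get("message"),
    }


# Shortened stand-in for the log line in this report (illustrative only).
sample = (
    '1.6617843718307014e+09\tDEBUG\tevents\tWarning\t'
    '{"object": {"kind": "Notebook", "name": "jupyter-nb-ldap-2dadmin4"}, '
    '"reason": "FailedCreate", '
    '"message": "Pod \\"jupyter-nb-ldap-2dadmin4-0\\" is invalid: '
    'spec.tolerations[0].key: Invalid value"}'
)

event = parse_controller_event(sample)
print(event["reason"])
print(event["message"])
```

Filtering on `reason == "FailedCreate"` would surface the same event this report found manually.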

      The only way to fix the issue is to remove the toleration in the admin UI, or to update it to a value that OpenShift allows.
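
Before saving a replacement toleration, a candidate key can be checked against the regex quoted in the error message. The following is a minimal sketch, not the dashboard's actual validation: `is_valid_toleration_key` is a hypothetical name, and the check covers only the "name part" rule from the log. Kubernetes keys may also carry an optional prefix and are subject to length limits, both of which this sketch ignores.

```python
import re

# Regex copied from the error message in this report. fullmatch() is used
# below so the pattern must cover the entire key, not just a substring.
NAME_PART = re.compile(r"([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]")


def is_valid_toleration_key(key: str) -> bool:
    """Return True if the key satisfies the name-part rule from the log."""
    return NAME_PART.fullmatch(key) is not None


# Examples taken from the error message, plus the key that triggered the bug.
for candidate in ["MyName", "my.name", "123-abc", "TestToleration, AndAnotherKey"]:
    print(candidate, is_valid_toleration_key(candidate))
```

The key from this report fails because it contains a comma and a space, neither of which is in the allowed character set.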

      Prerequisites (if any, like setup, operators/versions):

      Latest RHODS 1.16 live build (quay.io/llasmith/odh-operator-container-live:1.16.0-dashboard-kfnbc)

      Steps to Reproduce

      1. Log in as a RHODS admin.
      2. Go to Cluster settings in the RHODS dashboard.
      3. Add a toleration that is unsupported by OpenShift.
      4. Save the toleration.
      5. Try to spawn a notebook.

      Actual results:

      User is able to save the setting with an unsupported toleration

      Spawning is impossible but no error message is shown to the user

      Expected results:

      User should not be allowed to set an unsupported toleration

      A meaningful error message should be shown so that the user can fix the problem without having to look in the pod logs

      Reproducibility (Always/Intermittent/Only Once):

      Always

      Build Details:

      Workaround:

      Additional info:

              aballantyne Andrew Ballantyne
              rhn-support-lgiorgi Luca Giorgi
              Luca Giorgi Luca Giorgi
