  1. Red Hat Enterprise Linux AI
  2. RHELAI-3893

Create Dataset processing utility


      Goal: 

      Currently, all dataset-processing functionality lives in `token_dataset` and `data_process` in the training library. These modules should be split into as many individual units as possible, so that users can pick and choose exactly which data-processing functionality they want to use from the training library.

      Also consider refactoring the class structure and the mode of access (argparse CLI vs. importable classes vs. click CLI) as part of this work.
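
To make the mode-of-access question concrete, here is a minimal sketch of one possible shape: a single-purpose importable class with a thin argparse wrapper on top. Everything in it is an assumption for illustration only. `LengthFilter`, `main`, and the `--max-tokens` flag are hypothetical names and do not exist in `token_dataset` or `data_process`.

```python
import argparse
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class LengthFilter:
    """Hypothetical single-purpose unit: drop samples longer than max_tokens."""

    max_tokens: int

    def __call__(self, samples: Iterable[List[int]]) -> Iterator[List[int]]:
        # Yield only the samples that fit within the token budget.
        for tokens in samples:
            if len(tokens) <= self.max_tokens:
                yield tokens


def main(argv: Optional[List[str]] = None) -> List[List[int]]:
    """Thin argparse wrapper; all processing logic stays in LengthFilter."""
    parser = argparse.ArgumentParser(description="Filter token samples by length")
    parser.add_argument("--max-tokens", type=int, required=True)
    parser.add_argument(
        "samples", nargs="+", help="comma-separated token ids, e.g. 1,2,3"
    )
    args = parser.parse_args(argv)
    # Parse each positional argument into a list of integer token ids.
    parsed = [[int(t) for t in sample.split(",")] for sample in args.samples]
    return list(LengthFilter(args.max_tokens)(parsed))
```

With this split, library users would import `LengthFilter` directly, while the CLI stays a small adapter that could be swapped for click without touching the processing code.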

       

      Acceptance Criteria:

      The training library's data processing has been refactored into multiple individual modules, each with a distinct use and purpose.

              cdoern@redhat.com Charles Doern
              Fynn Schmitt-Ulms
