-
Spike
-
Resolution: Unresolved
Currently, the training code has no way to accept both a training set and a validation set. Ideally, the user could provide one of:
- 2 datasets (one for training, the other for validation)
- 1 dataset dict that contains a predefined train/val split
- 1 dataset and a percentage used to randomly split it into train and val
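As a rough illustration of how the three input forms above could be normalized into one `(train, val)` pair, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration, not the existing training API: the function name `resolve_train_val`, the `val_fraction` and `seed` parameters, and the `"train"`/`"val"` dict keys are all hypothetical.

```python
import random
from collections.abc import Mapping


def resolve_train_val(data, val=None, val_fraction=None, seed=0):
    """Return (train, val) lists from one of three input forms (hypothetical API).

    - data and val both given: two separate datasets
    - data is a dict: predefined split under the assumed keys "train" and "val"
    - data and val_fraction given: random split of a single dataset
    """
    # Form 2: a dict that already contains the split.
    if isinstance(data, Mapping):
        if val is not None or val_fraction is not None:
            raise ValueError("a dataset dict already defines the split")
        return list(data["train"]), list(data["val"])

    # Form 1: two separate datasets.
    if val is not None:
        if val_fraction is not None:
            raise ValueError("pass either a validation set or a fraction, not both")
        return list(data), list(val)

    # Form 3: one dataset plus a fraction to hold out at random.
    if val_fraction is None:
        raise ValueError("no validation data or split fraction was provided")
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    if len(data) < 2:
        raise ValueError("need at least 2 examples to split")

    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    # Keep at least one example on each side of the split.
    n_val = min(max(1, round(len(data) * val_fraction)), len(data) - 1)
    val_items = [data[i] for i in indices[:n_val]]
    train_items = [data[i] for i in indices[n_val:]]
    return train_items, val_items
```

For example, `resolve_train_val(examples, val_fraction=0.1)` would hold out roughly 10% of `examples` for validation, while passing a dict or a second dataset skips the random split entirely.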
In addition, the user should be able to specify how frequently the model is evaluated on the validation dataset.
During the main training loop, the model's validation loss is then computed at that frequency and logged.
This is essential because it lets us verify that the model is not overfitting to the training data and has learned to generalize to unseen data.
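The periodic evaluation described above might look roughly like the following sketch. It is framework-agnostic and hypothetical: `train_with_validation`, `train_step`, `loss_fn`, and `eval_every` are illustrative names, not part of the current training code, and the frequency is assumed to be expressed in training steps.

```python
def mean_validation_loss(loss_fn, val_data):
    """Average the per-example loss over the validation set."""
    losses = [loss_fn(example) for example in val_data]
    return sum(losses) / len(losses)


def train_with_validation(train_batches, val_data, train_step, loss_fn,
                          eval_every, log=print):
    """Run the training loop and log validation loss every `eval_every` steps.

    `train_step(batch)` and `loss_fn(example)` are caller-supplied callables
    (assumptions for this sketch). Returns a list of (step, val_loss) pairs.
    """
    if eval_every < 1:
        raise ValueError("eval_every must be a positive number of steps")
    if len(val_data) == 0:
        raise ValueError("validation set is empty")

    history = []
    for step, batch in enumerate(train_batches, start=1):
        train_step(batch)
        if step % eval_every == 0:
            # In a PyTorch loop, this is where one would typically switch to
            # model.eval() and wrap the computation in torch.no_grad().
            val_loss = mean_validation_loss(loss_fn, val_data)
            log(f"step={step} val_loss={val_loss:.4f}")
            history.append((step, val_loss))
    return history
```

A training loss that keeps falling while the logged validation loss rises is the overfitting signal this spike is meant to surface.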
- blocks: RHELAI-4007 Automated train run benchmarking GitHub action
- To Do