-
Epic
-
Resolution: Done
-
Major
-
LLM FSDP training
-
RHOAI, Training
-
0% To Do, 0% In Progress, 100% Done
Now that multi-node training is working, let's use FSDP and see what benefits we get from offloading some work to the CPU.
For this training session, I will start with the granite-7b model, which ships with RHEL, and if that doesn't work out I may switch to Meta-Llama-3-8B.
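
To get a rough idea of what CPU offloading could buy us before running anything, here is a back-of-the-envelope sketch of where the sharded model state would live. It is an estimate under stated assumptions, not a measurement: roughly 7 billion parameters (inferred from the model name), mixed-precision Adam at about 16 bytes per parameter (2 for weights, 2 for gradients, 12 for optimizer state), full sharding across ranks, and a hypothetical 8-GPU setup. Activations, buffers, and the temporarily gathered layer weights are ignored. The function names are made up for illustration.

```python
GIB = 2**30

# Assumption: mixed-precision Adam keeps ~16 bytes of state per parameter
# (2 B weights + 2 B gradients + 12 B fp32 master weights and moments).
BYTES_PER_PARAM = 16


def per_rank_state_bytes(n_params, world_size, bytes_per_param=BYTES_PER_PARAM):
    """Bytes of sharded model state each rank owns under full sharding."""
    return n_params * bytes_per_param / world_size


def placement(n_params, world_size, cpu_offload):
    """Estimate where a rank's shard sits between training steps.

    With offloading enabled, the shard (parameters, gradients, optimizer
    state) is assumed to sit in host RAM; otherwise it sits in GPU memory.
    Transient, gathered layer weights on the GPU are not counted.
    """
    shard_gib = per_rank_state_bytes(n_params, world_size) / GIB
    if cpu_offload:
        return {"gpu_gib": 0.0, "cpu_gib": shard_gib}
    return {"gpu_gib": shard_gib, "cpu_gib": 0.0}


if __name__ == "__main__":
    n_params = 7_000_000_000  # assumed size for a "7b" model
    world_size = 8            # hypothetical GPU count
    for offload in (False, True):
        est = placement(n_params, world_size, offload)
        print(f"cpu_offload={offload}: "
              f"GPU {est['gpu_gib']:.2f} GiB, CPU {est['cpu_gib']:.2f} GiB per rank")
```

The estimate only shows how much state moves from GPU memory to host RAM. The cost side (extra host-to-device transfers and a CPU-side optimizer step) has to be measured in the actual training run.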