  1. Performance and Scale for AI Platforms
  2. PSAP-1487

Learn more about LLM training resource estimation


    • Type: Task
    • Resolution: Done
    • Priority: Normal
    • Feb 11
    • None
    • None
    • None
    • RHOAI, Training
    • False
    • False
    • None
    • 2
    • PSAP - General-10, PSAP - General-11, PSAP - General-12

      IBM regressive estimator: https://github.com/foundation-model-stack/fm-training-estimator

       

      After reviewing this code and the IBM presentation, we became curious about other methods for estimating training resources in different cases (mixed precision, choice of estimator...).

       

      Main source: https://blog.eleuther.ai/transformer-math/
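
As a rough illustration of the kind of estimate the EleutherAI post describes, the sketch below applies two of its rules of thumb: total training compute of about 6 FLOPs per parameter per token, and a static memory footprint for mixed-precision training with Adam of about 16 bytes per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for the fp32 master weights plus the two Adam moments). The function names and the 7B-parameter example are hypothetical, and activation memory, framework overhead, and parallelism are deliberately left out.

```python
# Back-of-the-envelope estimates based on rules of thumb from the
# EleutherAI "Transformer Math 101" post. All names here are illustrative.


def training_flops(n_params: int, n_tokens: int) -> float:
    """Approximate total training compute: C ~= 6 * P * D."""
    return 6.0 * n_params * n_tokens


def static_memory_bytes(
    n_params: int,
    weight_bytes: int = 2,      # fp16/bf16 weights
    grad_bytes: int = 2,        # fp16/bf16 gradients
    optimizer_bytes: int = 12,  # fp32 master weights + Adam momentum + variance
) -> int:
    """Weights + gradients + optimizer state; activations are NOT included."""
    return n_params * (weight_bytes + grad_bytes + optimizer_bytes)


if __name__ == "__main__":
    n_params = 7_000_000_000      # hypothetical 7B-parameter model
    n_tokens = 1_000_000_000_000  # hypothetical 1T-token training run

    print(f"compute: {training_flops(n_params, n_tokens):.2e} FLOPs")
    print(f"static memory: {static_memory_bytes(n_params) / 1e9:.0f} GB")
```

For the hypothetical 7B model this gives 112 GB of static state before any activations, which is why the per-parameter byte counts matter when comparing precision settings.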

       

      The main open question is how to quantify the effect of CPU offloading on these estimates.
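
One possible starting point, offered only as a first-order sketch, is to treat offloading as moving a fixed number of bytes per parameter from GPU to host memory. The code below assumes mixed-precision Adam (2 bytes per parameter for weights, 2 for gradients, 12 for optimizer state) and lets the optimizer state and, optionally, the gradients live on the CPU. The function and class names are hypothetical. It does not model what makes the question hard: host-device transfer time, pinned staging buffers, activation memory, or the throughput cost of running the optimizer step on the CPU.

```python
from dataclasses import dataclass


@dataclass
class MemorySplit:
    gpu_bytes: int
    cpu_bytes: int


def split_with_offload(
    n_params: int,
    offload_optimizer: bool = True,
    offload_gradients: bool = False,
    weight_bytes: int = 2,      # fp16/bf16 weights always stay on the GPU here
    grad_bytes: int = 2,        # fp16/bf16 gradients
    optimizer_bytes: int = 12,  # fp32 master weights + Adam momentum + variance
) -> MemorySplit:
    """Static memory only (assumption): no activations, buffers, or transfers."""
    gpu = n_params * weight_bytes
    cpu = 0

    # Place each component on the CPU or GPU depending on the offload flags.
    if offload_gradients:
        cpu += n_params * grad_bytes
    else:
        gpu += n_params * grad_bytes

    if offload_optimizer:
        cpu += n_params * optimizer_bytes
    else:
        gpu += n_params * optimizer_bytes

    return MemorySplit(gpu_bytes=gpu, cpu_bytes=cpu)


if __name__ == "__main__":
    split = split_with_offload(7_000_000_000)  # hypothetical 7B model
    print(f"GPU: {split.gpu_bytes / 1e9:.0f} GB, CPU: {split.cpu_bytes / 1e9:.0f} GB")
```

A memory split like this only says whether a configuration fits; estimating how much slower it trains would need measured transfer bandwidth and CPU optimizer throughput, which this sketch does not attempt.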

              rh-ee-aperdomo Alberto Perdomo