AI Platform Core Components / AIPCC-5449

[QE] Torch 2.8 Test Plan Preparation

    • Type: Spike
    • Resolution: Done
    • Priority: Normal
    • Accelerator Enablement
    • AIPCC Accelerators 16

      1. Test Imports
        Use the upstream test folder (https://github.com/pytorch/pytorch/tree/main/test) to extract the packages imported by the tests.
        Import file: torch_imports_only.txt (see the extraction sketch after the NOTE below)
      2. Test probes for the import file torch_imports_only.txt above (see the probe sketch below)
      3. Benchmarking: compare our torch wheels against upstream (see the comparison sketch after the benchmark table)
      4. Compare and contrast the build and install dependencies of upstream and our wheel, and flag differences as possible mismatches (see the dependency sketch below)

      NOTE: These tests have to be run against upstream torch first, to confirm both the performance baseline and the test validity from upstream.
                  Please verify the test run uses ACCELERATORS and not the CPU.
                  Please check the torch_imports_only.txt file for the functions to be tested.
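      A minimal sketch of step 1, assuming a local checkout of the upstream repo at ./pytorch; it walks test/, parses each file with ast, and writes the unique top-level imports to torch_imports_only.txt. The checkout path and output format are assumptions, not something fixed by this ticket.

        # extract_test_imports.py - collect the top-level imports used by the upstream test suite.
        import ast
        import pathlib

        TEST_DIR = pathlib.Path("pytorch/test")           # assumed local checkout path
        OUT_FILE = pathlib.Path("torch_imports_only.txt")

        imports = set()
        for py_file in TEST_DIR.rglob("*.py"):
            try:
                tree = ast.parse(py_file.read_text(encoding="utf-8"), filename=str(py_file))
            except (SyntaxError, UnicodeDecodeError):
                continue  # skip files that do not parse or decode standalone
            for node in ast.walk(tree):
                if isinstance(node, ast.Import):
                    imports.update(alias.name for alias in node.names)
                elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                    imports.add(node.module)

        OUT_FILE.write_text("\n".join(sorted(imports)) + "\n", encoding="utf-8")
        print(f"wrote {len(imports)} unique imports to {OUT_FILE}")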
       
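      A probe sketch for step 2, assuming torch_imports_only.txt holds one module path per line; it aborts if no accelerator is visible to torch (per the NOTE above) and then records which modules import cleanly against the installed wheel.

        # probe_imports.py - try to import every module listed in torch_imports_only.txt.
        import importlib
        import sys

        import torch

        # Per the NOTE above: make sure the run sees an accelerator, not just the CPU.
        if not torch.cuda.is_available():
            sys.exit("No CUDA/ROCm accelerator visible to torch; aborting probe run.")

        with open("torch_imports_only.txt", encoding="utf-8") as fh:
            modules = [line.strip() for line in fh if line.strip() and not line.startswith("#")]

        failures = []
        for name in modules:
            try:
                importlib.import_module(name)
            except Exception as exc:  # record any import-time failure, not only ImportError
                failures.append((name, repr(exc)))

        print(f"probed {len(modules)} modules, {len(failures)} failures")
        for name, err in failures:
            print(f"FAIL {name}: {err}")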

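      For step 4, a hedged sketch that dumps the declared install dependencies of whichever torch wheel is installed in the current environment; run it once in an upstream-torch environment and once with our wheel, then diff the two output files to flag possible mismatches. The output file name is an assumption.

        # dump_torch_deps.py - record torch's declared install dependencies for later diffing.
        from importlib import metadata

        import torch

        dist = metadata.distribution("torch")
        lines = [f"torch=={dist.version} (torch.__version__={torch.__version__})"]
        lines += sorted(dist.requires or [])

        out_name = f"torch-deps-{dist.version}.txt"  # assumed naming convention
        with open(out_name, "w", encoding="utf-8") as fh:
            fh.write("\n".join(lines) + "\n")
        print(f"wrote {len(lines) - 1} declared dependency entries to {out_name}")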
      Category | Script / Command | Purpose | Main Arguments / Options | Notes / Reference
      Test Imports | upstream test folder | Extract imported packages | N/A | Upstream tests
      Data Benchmarks | python samplers_benchmark.py | Benchmark data samplers | N/A | Data README
      Override Benchmarks | python bench.py | Run override benchmarks | N/A | Override README
      Distributed / DDP | torchrun --nproc_per_node=1 benchmark.py | Distributed benchmark (ResNet50) | --model resnet50 --world-size 1 --distributed-backend nccl --master-addr localhost --master-port 12355 | Single-node example
      Distributed / DDP | python benchmark.py | Alternative run | --world-size 1 --master-addr localhost --master-port 12355 | Single-node example
      Framework Overhead | python3 framework_overhead_benchmark.py | Benchmark framework overhead | --op add_op --num-warmup-iters 10 --num-iters 100 --use-throughput-benchmark --save | Framework Overhead README
      Fuser | python3 run_benchmarks.py | Benchmark fused operators | --operators add,sub,mul,div,... --shapes scalar,small,small_2d,... | Fuser README
      GPT Fast | python benchmark.py | Benchmark GPT models | N/A | GPT Fast README
      Inductor Backends | N/A | Benchmark inductor backends | N/A | Inductor Backends
      Inference | ./runner.sh <EXP_NAME> (e.g. exp1) | Benchmark inference performance | N/A | Can be time-consuming
      Instruction Counts | python main.py | Benchmark instruction counts | N/A | Can be time-consuming
      Nested | python nested_bmm_bench.py | Nested batch matrix multiplication benchmarks | N/A | Nested README
      Profiler | python resnet_memory_profiler.py | Profile ResNet memory usage | N/A | Modify script to use CUDA
      Profiler | python3 profiler_bench.py | Profile GPU performance | --with-cuda --use-kineto --profiling-tensor-size 1024 --internal-iter 256 --with-stack --use-script |
      Serialization | .py files | Test serialization | N/A | Just run the files
      Sparse (DLMC) | python3 -m dlmc.matmul_bench | Sparse matrix benchmarks | --path <dataset_path> --dataset magnitude_pruning --operation sparse@dense --with-cuda | Requires dataset
      Transformer | attention_bias_benchmarks.py | Benchmark attention bias | N/A |
      Transformer | better_transformer_vs_mha_functional.py | Compare BetterTransformer vs MHA | N/A |
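      For step 3, a hedged harness sketch that runs one benchmark command from the table under two environments (upstream torch and our wheel) and compares wall-clock time; the virtualenv paths and the chosen command are illustrative assumptions, and real comparisons should prefer each benchmark's own reported metrics where available.

        # compare_benchmarks.py - run one benchmark from the table in two environments and compare timings.
        import json
        import subprocess
        import time

        ENVS = {
            "upstream": "venv-upstream/bin/python",  # assumed virtualenv with the upstream wheel
            "aipcc": "venv-aipcc/bin/python",        # assumed virtualenv with our wheel
        }
        CMD = ["samplers_benchmark.py"]              # example row from the table; swap in any other script

        results = {}
        for label, python in ENVS.items():
            start = time.perf_counter()
            subprocess.run([python, *CMD], check=True)
            results[label] = time.perf_counter() - start

        results["aipcc_vs_upstream_ratio"] = results["aipcc"] / results["upstream"]
        print(json.dumps(results, indent=2))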

              rh-ee-vshaw Vikash Shaw
              rh-ee-alustosa Andre Lustosa Cabral de Paula Motta
              Frank's Team