  AI Platform Core Components
  AIPCC-7997

[QA][PyTorch UT][GPU] inductor.test_compile_subprocess tests are failing because of NotImplementedError

    • Type: Bug
    • Resolution: Done
    • Component: PyTorch
    • Sprint: PyTorch Sprint 21, PyTorch Sprint 22, PyTorch Sprint 23

      The inductor.test_compile_subprocess tests are failing on the main branch with NotImplementedError: the torchvision::roi_align operator is not available for the CUDA backend.
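
      The error message suggests the installed torchvision build lacks CUDA kernels for this op, rather than a problem in the test itself. A quick way to check whether the wheel inside the container registered a CUDA kernel is sketched below (best effort: torchvision.version.cuda may be absent in unusual builds, and the dispatcher query uses a private torch API; nothing here is taken from the attached logs):

          # Sketch: does the installed torchvision carry CUDA kernels for roi_align?
          import torch
          import torchvision

          # torchvision.version.cuda is the CUDA version the wheel was built against;
          # it is None for CPU-only builds.
          print("torchvision", torchvision.__version__,
                "built against CUDA:", torchvision.version.cuda)

          # Private dispatcher query: is a CUDA kernel registered for the custom op?
          print("CUDA kernel for torchvision::roi_align:",
                torch._C._dispatch_has_kernel_for_dispatch_key(
                    "torchvision::roi_align", "CUDA"))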

      Tests Failing:
      test_roi_align_cuda

      Env details:
      PyTorch version: 2.10.0
      Branch: main
      OS: RHEL 9.6
      CPU: Intel
      Python version: 3.12
      Commit id: 6de6685797cabc6256df76803f3a5f772d5275a7 (tag: trunk/6de6685797cabc6256df76803f3a5f772d5275a7, origin/main, origin/HEAD)
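
      (One way to collect the details above from inside the container is sketched below; how they were originally gathered is not recorded in this ticket.)

          # Sketch: print the environment details reported in this ticket.
          import platform
          import torch

          print("PyTorch version:", torch.__version__)
          print("Commit id:", torch.version.git_version)
          print("CUDA (build):", torch.version.cuda)
          print("Python version:", platform.python_version())
          print("OS:", platform.platform())
          print("CPU:", platform.processor())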

      Steps to repro:
      1. Log in to the H200 machine.
      2. Log in to quay.io: podman login quay.io
      3. Pull the base image: podman pull quay.io/aipcc/pytorch:rhel_cuda_build_without_pins
      4. Run the image and specify the GPU to be used: podman run -it <IMAGE_NAME>
      5. Run the PyTorch UT: TEST_CONFIG=cpu python3 test/run_test.py -i inductor.test_compile_subprocess

      Expected result: The UTs should pass.
      Actual result: NotImplementedError: Could not run 'torchvision::roi_align' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). The operator is only available for CPU and other backends, but not for CUDA backend.
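
      For quicker triage than the full UT run, the same dispatch failure should be reproducible with a direct call (a minimal sketch, assuming the container sees a GPU and uses the torchvision wheel under test; the tensor shapes and box values are illustrative only):

          # Sketch: run roi_align on CUDA tensors. On a torchvision build without
          # CUDA kernels this raises the NotImplementedError quoted above; on a
          # healthy build it prints torch.Size([1, 3, 7, 7]).
          import torch
          import torchvision.ops as ops

          feat = torch.rand(1, 3, 32, 32, device="cuda")       # NCHW feature map
          boxes = torch.tensor([[0.0, 0.0, 0.0, 16.0, 16.0]],  # (batch_idx, x1, y1, x2, y2)
                               device="cuda")

          out = ops.roi_align(feat, boxes, output_size=(7, 7), spatial_scale=1.0)
          print(out.shape)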

      Logs are attached below

              rh-ee-ktanmay Kumar Tanmay
              rh-ee-nkangana Nayan Bhushan Kanganahalli Nagabhushana
              Votes: 0
              Watchers: 2
