  1. AI Platform Core Components
  2. AIPCC-7981

[QA][PyTorch UT][CPU][sGPU] test_cpp_extensions_jit tests are failing because of missing cudnn.h header file

    • Type: Bug
    • Resolution: Unresolved
    • PyTorch
    • PyTorch Sprint 21, PyTorch Sprint 22, PyTorch Sprint 23

      The test_cpp_extensions_jit tests fail on the main branch because the cudnn.h header file is missing.

      Tests Failing:
      test_jit_cudnn_extension

      Env details:
      PyTorch version: 2.10.0
      Branch: main
      OS: RHEL 9.6
      CPU: Intel
      Python version: 3.12
      Commit ID: 6de6685797cabc6256df76803f3a5f772d5275a7 (tag: trunk/6de6685797cabc6256df76803f3a5f772d5275a7, origin/main, origin/HEAD)

      Steps to repro:
      1. Log in to the H200 machine.
      2. Log in to quay.io: podman login quay.io
      3. Pull base image: podman pull quay.io/aipcc/pytorch:rhel9_6_pytorch_main_gitd766976_cuda12_8
      4. Run the image and specify the GPU to be used: podman run -it <IMAGE_NAME>
      5. Run the PyTorch UT: TEST_CONFIG=cpu python3 test/run_test.py -i test_cpp_extensions_jit -k test_jit_cudnn_extension
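
      Step 5 can also be driven from Python, which makes it easier to capture the build output for the attached logs. This is only a sketch: the helper names build_repro and run_repro are hypothetical, it uses the current interpreter (sys.executable) instead of python3, and it assumes the working directory is the root of a PyTorch checkout inside the container.

```python
import os
import subprocess
import sys


def build_repro(test_file="test_cpp_extensions_jit",
                test_name="test_jit_cudnn_extension"):
    """Return the command and environment for the failing UT (hypothetical helper)."""
    # Same arguments as step 5; sys.executable stands in for python3.
    cmd = [sys.executable, "test/run_test.py", "-i", test_file, "-k", test_name]
    # Copy the current environment and set TEST_CONFIG=cpu, as in step 5.
    env = dict(os.environ, TEST_CONFIG="cpu")
    return cmd, env


def run_repro(cwd="."):
    """Run the UT and capture its output; cwd is assumed to be the PyTorch source root."""
    cmd, env = build_repro()
    return subprocess.run(cmd, env=env, cwd=cwd, capture_output=True, text=True)
```

      Calling run_repro() returns a CompletedProcess; its returncode, stdout, and stderr fields can be saved alongside the issue.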

      Expected result: The UT builds the extension and passes.
      Actual result: Compilation fails because the cudnn.h header file is missing. This raises a RuntimeError when building the extension and an ImportError when loading the shared object file.
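
      To confirm whether the header is present in the image at all, a quick check like the following may help. It is a minimal sketch, not the lookup PyTorch itself performs: the environment variable names (CUDNN_HOME, CUDNN_PATH, CUDA_HOME, CUDA_PATH) and the two system directories are assumptions about where cudnn.h could be installed, and both function names are hypothetical.

```python
import os
from pathlib import Path


def candidate_include_dirs(env=None):
    """Collect include directories where cudnn.h might live (assumed locations)."""
    env = os.environ if env is None else env
    dirs = []
    # Assumption: these variables, when set, point at CUDA or cuDNN install roots.
    for var in ("CUDNN_HOME", "CUDNN_PATH", "CUDA_HOME", "CUDA_PATH"):
        root = env.get(var)
        if root:
            dirs.append(Path(root) / "include")
    # Assumption: common system-wide locations on a Linux image.
    dirs += [Path("/usr/local/cuda/include"), Path("/usr/include")]
    return dirs


def find_header(name, dirs):
    """Return the directories from dirs that contain a file called name."""
    return [str(d) for d in dirs if (Path(d) / name).is_file()]


if __name__ == "__main__":
    hits = find_header("cudnn.h", candidate_include_dirs())
    if hits:
        print("cudnn.h found in:", ", ".join(hits))
    else:
        print("cudnn.h not found in any candidate directory")
```

      An empty result would be consistent with the compilation failure described above, though the header could still exist in a directory this sketch does not search.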

      Logs are attached below

              rh-ee-nkangana Nayan Bhushan Kanganahalli Nagabhushana