RHEL-6179

Segfault after MPI_Finalize


    • Type: Bug
    • Resolution: Won't Do
    • rhel-8.4.0
    • openmpi
    • Moderate
    • rhel-net-drivers
    • ssg_networking
    • Network Drivers 6

      Steps to reproduce problem:

      """
      $ cat > test.cc <<EOF
      #include <thread>
      #include <mpi.h>

      // Initialize and finalize MPI on a non-main thread.
      void thread_mpi() {
          int mpi_threads_provided;
          MPI_Init_thread(nullptr, nullptr, MPI_THREAD_SINGLE, &mpi_threads_provided);
          MPI_Finalize();
      }

      int main() {
          std::thread thread = std::thread(thread_mpi);
          thread.join();
          return 0;
      }
      EOF

      $ module load mpi/openmpi-x86_64
      $ g++ $(pkg-config ompi-cxx --cflags) $(pkg-config ompi-cxx --libs) test.cc -o test
      """

      The user hit this in a Python environment; the code above is a reduced reproducer.

      Workarounds exist, but the problem should still be fixed in MPI itself.

      The crash happens in /lib64/libpmix.so.2

      The user is not using environment-modules, but when it is in use, one workaround is to link explicitly against pmix, for example:

      """
      # dnf install -y pmix-devel
      $ g++ $(pkg-config ompi-cxx --cflags) $(pkg-config ompi-cxx --libs) $(pkg-config --libs pmix) test.cc -o test
      """

      Because the user runs Python scripts, the suggested workaround was to load a pmix Python module. That appears to have worked, but again, it seems to only hide the problem.
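For a C++ program that cannot change its link line, a comparable idea would be to load the library early and ask the dynamic loader to keep it resident. The sketch below is untested against this bug and is only an illustration of that approach. The helper name `pin_shared_library` is hypothetical, and the soname `libpmix.so.2` is taken from the crash location reported above. `dlopen`, `RTLD_NOW`, `RTLD_GLOBAL` and `RTLD_NODELETE` are standard glibc facilities.

```cpp
#include <dlfcn.h>

// Hypothetical helper, not part of Open MPI or PMIx.
// Loads a shared library and marks it so that a later dlclose() does not
// unmap it (RTLD_NODELETE). Returns true if the library could be loaded.
bool pin_shared_library(const char* soname) {
    // The handle is deliberately never closed: the point is to keep the
    // library mapped for the lifetime of the process.
    void* handle = dlopen(soname, RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE);
    return handle != nullptr;
}
```

A caller would invoke `pin_shared_library("libpmix.so.2")` before `MPI_Init_thread` and check the return value. On glibc older than 2.34 the program must also be linked with `-ldl`. Like the other workarounds, this would only mask the underlying problem if it helps at all.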

      An untested workaround that might help is this pseudo-patch:

      """
      -int pmix_tsd_keys_destruct()
      +__attribute__((destructor)) int pmix_tsd_keys_destruct()
      """

      The change goes in pmix_source/src/threads/thread.c.

      It is not guaranteed to work, because the dependency chain that leads to loading the pmix library is complex.
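The ticket does not spell out why the pseudo-patch targets a function that tears down thread-specific data (TSD) keys. One plausible reading, stated here as an assumption, is that a TSD key destructor is still registered when a thread exits after the library's code is gone. The sketch below does not use PMIx or MPI at all. It only shows the POSIX behaviour such a fix would rely on: a key's destructor runs at thread exit unless the key was deleted first. All names in it are hypothetical.

```cpp
#include <pthread.h>
#include <atomic>

// Hypothetical names throughout; none of this is PMIx code.
static std::atomic<int> g_destructor_calls{0};
static pthread_key_t g_key;
static bool g_delete_key_early = false;

// TSD destructor: called at thread exit for a key whose value is non-null.
static void tsd_destructor(void*) { g_destructor_calls.fetch_add(1); }

static void* worker(void*) {
    // Any non-null value arms the destructor for this thread.
    pthread_setspecific(g_key, &g_destructor_calls);
    if (g_delete_key_early) {
        // After deletion the destructor is no longer called at thread exit.
        pthread_key_delete(g_key);
    }
    return nullptr;
}

// Runs one thread and returns how many times the destructor fired,
// or -1 if the key or the thread could not be created.
int destructor_calls_at_thread_exit(bool delete_key_early) {
    g_destructor_calls = 0;
    g_delete_key_early = delete_key_early;
    if (pthread_key_create(&g_key, tsd_destructor) != 0) return -1;
    pthread_t t;
    if (pthread_create(&t, nullptr, worker, nullptr) != 0) {
        pthread_key_delete(g_key);
        return -1;
    }
    pthread_join(t, nullptr);  // TSD destructors have run by the time join returns
    if (!delete_key_early) pthread_key_delete(g_key);
    return g_destructor_calls.load();
}
```

Calling `destructor_calls_at_thread_exit(false)` leaves the key in place until the thread has exited, while passing `true` deletes it beforehand. If the destructor lived in a library that had already been unloaded, the first case would jump into unmapped code, which is the kind of failure the pseudo-patch appears to be aimed at.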

              network-drivers-bugs@redhat.com network-drivers-bugs group
              rhn-support-pandrade Paulo Andrade
              RH Bugzilla Integration RH Bugzilla Integration
              infiniband-qe infiniband-qe infiniband-qe infiniband-qe