• Bug
    • Resolution: Duplicate
    • Undefined
    • None
    • rhel-7.3
    • glibc
    • Major
    • sst_pt_libraries
    • False

      Description of problem:
      mpicc, built as part of our local OpenMPI installation, segfaults when run if it was compiled with ICC 16.0 (but not ICC 15.0). This appears to be an ABI break between the glibc in RHEL 7.2 and the glibc in RHEL 7.3. When we revert glibc to the version from 7.2, the problem goes away.

      15:22 foraker: #2 elf_dynamic_do_Rela (skip_ifunc=<optimized out>, lazy=<optimized out>, nrelative=<optimized out>, relsize=<optimized out>, reladdr=<optimized out>, map=0x2aaaaab0a548) at do-rel.h:170
      15:22 foraker: #3 _dl_relocate_object (scope=<optimized out>, reloc_mode=<optimized out>, consider_profiling=<optimized out>, consider_profiling@entry=0) at dl-reloc.c:259
      15:22 foraker: #4 0x00002aaaaaaae792 in dl_main (phdr=<optimized out>, phdr@entry=0x400040, phnum=<optimized out>, phnum@entry=9, user_entry=user_entry@entry=0x7fffffffd428, auxv=<optimized out>) at rtld.c:2192
      15:22 foraker: #5 0x00002aaaaaac1e36 in _dl_sysdep_start ( start_argptr=start_argptr@entry=0x7fffffffd4e0, dl_main=dl_main@entry=0x2aaaaaaac820 <dl_main>) at ../elf/dl-sysdep.c:244
      15:22 foraker: #6 0x00002aaaaaaafa31 in _dl_start_final (arg=0x7fffffffd4e0) at rtld.c:318
      15:22 foraker: #7 _dl_start (arg=0x7fffffffd4e0) at rtld.c:544
      15:22 foraker: #8 0x00002aaaaaaac1e8 in _start () from /lib64/ld-linux-x86-64.so.2
      15:22 foraker: #9 0x0000000000000001 in ?? ()
      15:22 foraker: #10 0x00007fffffffd8e7 in ?? ()
      15:22 foraker: #11 0x0000000000000000 in ?? ()

      Version-Release number of selected component (if applicable):
      glibc-2.17-157.el7.x86_64

      How reproducible:

      Steps to Reproduce:
      15:02 foraker: quartz2{foraker1}:module load intel openmpi-intel/1.10
      15:02 foraker: quartz2{foraker1}:mpicc
      15:02 foraker: Segmentation fault
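The shell only prints "Segmentation fault" here. To record which signal actually terminates the process (SIGSEGV, SIGILL, and so on), a small wrapper along these lines could be used. This is a minimal sketch and not part of the original report; the helper name `describe_exit` is hypothetical, and it assumes a POSIX system where Python's `subprocess` reports a signal death as a negative return code.

```python
import signal
import subprocess


def describe_exit(cmd):
    """Run cmd (a list of arguments) and describe how it ended."""
    proc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    if proc.returncode < 0:
        # On POSIX, a return code of -N means the child was killed by signal N.
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exited with status %d" % proc.returncode
```

After `module load intel openmpi-intel/1.10`, calling `describe_exit(["mpicc"])` in the same environment would report the terminating signal by name instead of relying on the shell's message.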

      Additional info:
      Carlos asked me to file a bug:
      15:50 codonell: neb, File a bug please. Include /proc/cpuinfo please.
      15:51 codonell: neb, And include exactly what the crash looks like.
      15:51 codonell: neb, e.g. SIGILL, SIGSEGV...
      15:51 codonell: neb, etc. etc.
      15:51 neb: file a bug?
      15:51 codonell: neb, We've made some changes in rhel-7.3 for Intel Purley hardware so this area has new code.
      15:51 codonell: neb, Yes please.
      15:52 neb: Like what is the proximate cause, I'm still trying to make heads or tails of it.
      15:52 neb: can you give me a hand-wavy explanation of what may be going on?
      15:53 neb: could it be fixed with a recompile?
      15:53 codonell: neb, No ABI should be broken.
      15:53 codonell: neb, So it's not about recompiling.
      15:54 codonell: neb, It's about the hardware you're running on.
      15:54 codonell: neb, The particular line you quote is checking to see if AVX512F is usable, but that should be a quick look into a feature table.
      15:54 codonell: neb, Nothing should ever crash there.
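Since the discussion centers on whether AVX512F is usable on the affected hardware, and /proc/cpuinfo was requested, the following sketch shows one way to check the relevant flag from that file. It is an illustration only: the function names `cpu_flags` and `has_avx512f` are hypothetical, and it reads the kernel's view of the CPU flags rather than glibc's internal feature table, so it may not match what the dynamic loader decides.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (for example, a non-x86 system).
    return set()


def has_avx512f(cpuinfo_text):
    """Report whether the kernel lists the avx512f flag for the first CPU."""
    return "avx512f" in cpu_flags(cpuinfo_text)
```

On the affected node this could be run as `has_avx512f(open("/proc/cpuinfo").read())`, and the result attached alongside the full /proc/cpuinfo contents.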

            rhn-support-codonell Carlos O'Donell
            rhn-gps-woodard Ben Woodard
            Carlos O'Donell
            qe-baseos-tools-bugs@redhat.com