[OMPI users] Library interposing works with 1.6.5 but not with 2.0.1 (Fortran)?
Alef Farah
2017-01-16 17:52:07 UTC
Hi,

I contribute to a tracing library which uses PMPI. It's loaded with
LD_PRELOAD so that it interposes on libmpi and intercepts MPI_ calls.
Since we upgraded from OpenMPI 1.6.5 to OpenMPI 2.0.1, it seems to have
stopped intercepting calls from Fortran applications, although it
continues to work with C applications. For instance, setting a
breakpoint on the MPI calls with gdb in a certain Fortran application,
one gets:

Breakpoint 1, 0x00007ffff794bae0 in PMPI_Init () from
/home/afh/install/openmpi-2.0.1/b/lib/libmpi.so.20.0.1
(gdb) where
#0 0x00007ffff794bae0 in PMPI_Init () from
/home/afh/install/openmpi-2.0.1/b/lib/libmpi.so.20.0.1
#1 0x00007ffff729d638 in pmpi_init__ () from
/home/afh/install/openmpi-2.0.1/b/lib/libmpi_mpifh.so.20
#2 0x0000000000401197 in MAIN__ ()
#3 0x000000000040210f in main ()

That is with 2.0.1. Notice there is no sign of the tracing library,
whereas with 1.6.5 it works as intended:

Breakpoint 1, 0x00007ffff7bd3c34 in MPI_Init ()
from /home/afh/svn/akypuera/b/lib/libaky.so
(gdb) where
#0 0x00007ffff7bd3c34 in MPI_Init ()
from /home/afh/svn/akypuera/b/lib/libaky.so
#1 0x00007ffff763f218 in pmpi_init__ () from /usr/lib/libmpi_f77.so.1
#2 0x00000000004010d7 in MAIN__ ()
#3 0x000000000040204f in main ()

It seems that with OpenMPI 1.6.5 libmpi_f77 is used, whereas with 2.0.1
libmpi_mpifh is used and the calls are not intercepted for some reason.
Any ideas? The only change made to the library's code was matching MPI's
C API changes (added const qualifiers to read-only buffers), so I don't
think that had anything to do with it.
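
To make the difference between the two backtraces concrete, here is a
toy model in C of the two call chains. It is only a sketch: the toy_*
functions are hypothetical stand-ins for the symbols shown by gdb,
nothing here links against OpenMPI, and it only mirrors what the
backtraces show (the Fortran wrapper reaching MPI_Init in 1.6.5 but
PMPI_Init directly in 2.0.1), not how either library is implemented.

```c
#include <stdio.h>
#include <string.h>

/* Toy model only: the toy_* functions stand in for the symbols seen in
 * the gdb backtraces. Nothing here calls into a real MPI library. */

static char trace_log[128]; /* records which layers a call passed through */

/* Stand-in for PMPI_Init inside libmpi. */
int toy_pmpi_init(void)
{
    strcat(trace_log, "libmpi:PMPI_Init;");
    return 0;
}

/* Stand-in for the preloaded tracer's MPI_Init wrapper. */
int toy_tracer_mpi_init(void)
{
    strcat(trace_log, "tracer:MPI_Init;");
    return toy_pmpi_init();
}

/* Fortran wrapper as in the 1.6.5 backtrace: it calls MPI_Init, which
 * resolves to the tracer's wrapper when the tracer is preloaded. */
int toy_fortran_init_165(void)
{
    return toy_tracer_mpi_init();
}

/* Fortran wrapper as in the 2.0.1 backtrace: it calls PMPI_Init
 * directly, so the tracer's MPI_Init is never on the call path. */
int toy_fortran_init_201(void)
{
    return toy_pmpi_init();
}

/* Runs one call chain and returns 1 if the tracer saw the call. */
int tracer_sees_call(int (*fortran_init)(void))
{
    trace_log[0] = '\0';
    fortran_init();
    return strstr(trace_log, "tracer:") != NULL;
}
```

In this model tracer_sees_call(toy_fortran_init_165) returns 1 and
tracer_sees_call(toy_fortran_init_201) returns 0, which matches what I
see: a wrapper that only overrides MPI_Init is bypassed whenever the
caller goes straight to PMPI_Init.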

Links to the application (NAS EP benchmark) and the tracing library can
be found attached, as well as the output of ldd for various
configurations, and config.log for my OpenMPI 2.0.1 build.
