Discussion:
[OMPI users] Multi-threading support for openib
Daniel Cámpora
2013-11-27 14:14:10 UTC
Dear list,

I've gone through several hours of configuring and testing to get a grasp
of the current status for multi-threading support.

I want to run a program that uses MPI_THREAD_MULTIPLE over the openib BTL.
I'm using openmpi-1.6.5 and SLC6 (rhel6), for what it's worth.

When I configure my own openmpi library with only
--enable-mpi-thread-multiple and execute my program with -mca btl openib,
it simply crashes because openib does not support MPI_THREAD_MULTIPLE.

I've only just started testing with --enable-opal-multi-threads, in case it
would help. Configuring with both of the aforementioned options,
./configure --enable-mpi-thread-multiple --enable-opal-multi-threads

results in a crash whenever I execute my program:

$ mpirun -np 2 -mca mca_component_path
/usr/mpi/gcc/openmpi-1.6.5/lib64/openmpi -mca btl openib -mca
btl_openib_warn_default_gid_prefix 0 -mca btl_base_verbose 100 -mca
btl_openib_verbose 100 -machinefile machinefile.labs `pwd`/em_bu_app 2>&1 |
less
--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
[lab14:13672] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file
runtime/orte_init.c at line 79
[lab14:13672] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file orterun.c
at line 694


I have several questions related to this. Does --enable-opal-multi-threads
have any impact on the BTL multi-threading support? (If there is more
documentation on what this option does, I'd be glad to read it.)

Is any additional configure option necessary for
--enable-opal-multi-threads to work?

Cheers, thanks a lot!

Daniel
--
Daniel Hugo Cámpora Pérez
European Organization for Nuclear Research (CERN)
PH LBC, LHCb Online Fellow
e-mail. ***@cern.ch
Ralph Castain
2013-11-27 14:59:19 UTC
The openib BTL does not currently support MPI_THREAD_MULTIPLE; hopefully it will in the 1.9 series.

Jeff Hammond
2013-11-27 20:25:42 UTC
See slide 22 of http://www.open-mpi.org/papers/sc-2013/Open-MPI-SC13-BOF.pdf

Jeff
--
Jeff Hammond
***@gmail.com