Discussion:
[OMPI users] MPI_THREAD_MULTIPLE: Fatal error on MPI_Win_create
Gilles Gouaillardet
2017-02-19 12:13:17 UTC
Permalink
Joseph,

Would you mind trying again with
export OMPI_MCA_osc=^pt2pt
export OMPI_MCA_osc_base_verbose=10

If it still does not work, then please post the output

Cheers,

Gilles
Post by Joseph Schuchart
Hi Howard,
Thanks for your quick reply and your suggestions. I exported both variables as you suggested but neither has any impact. The error message stays the same with both env variables set. Is there any other way to get more information from OpenMPI?
Sorry for not mentioning my OS. I'm running Linux Mint 18.1 with the stock kernel 4.8.0-36.
Joseph
Post by Howard
Hi Joseph
What OS are you using when running the test?
Could you try running with
export OMPI_mca_osc=^pt2pt
and
export OMPI_mca_osc_base_verbose=10
This error message was added to this OMPI release because this part of the code has known problems when used multi-threaded.
Post by Joseph Schuchart
All,
I am seeing a fatal error with OpenMPI 2.0.2 if requesting support for
MPI_THREAD_MULTIPLE and afterwards creating a window using
MPI_Win_create. I am attaching a small reproducer. The output I get is
```
MPI_THREAD_MULTIPLE supported: yes
MPI_THREAD_MULTIPLE supported: yes
MPI_THREAD_MULTIPLE supported: yes
MPI_THREAD_MULTIPLE supported: yes
--------------------------------------------------------------------------
The OSC pt2pt component does not support MPI_THREAD_MULTIPLE in this
release.
Workarounds are to run on a single node, or to use a system with an RDMA
capable network such as Infiniband.
--------------------------------------------------------------------------
[beryl:10705] *** An error occurred in MPI_Win_create
[beryl:10705] *** reported by process [2149974017,2]
[beryl:10705] *** on communicator MPI_COMM_WORLD
[beryl:10705] *** MPI_ERR_WIN: invalid window
[beryl:10705] *** MPI_ERRORS_ARE_FATAL (processes in this communicator
will now abort,
[beryl:10705] ***    and potentially your MPI job)
[beryl:10698] 3 more processes have sent help message help-osc-pt2pt.txt
/ mpi-thread-multiple-not-supported
[beryl:10698] Set MCA parameter "orte_base_help_aggregate" to 0 to see
all help / error messages
[beryl:10698] 3 more processes have sent help message
help-mpi-errors.txt / mpi_errors_are_fatal
```
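For reference, a reproducer along the lines described above might look
roughly like the following sketch (this is not the actual attachment,
just an illustration of the MPI_THREAD_MULTIPLE + MPI_Win_create
pattern; the buffer size and displacement unit are arbitrary):
```
/* Hypothetical reproducer sketch: request MPI_THREAD_MULTIPLE, then create
 * a window over a locally allocated buffer. The MPI_Win_create call is
 * where the reported abort happens. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    printf("MPI_THREAD_MULTIPLE supported: %s\n",
           provided == MPI_THREAD_MULTIPLE ? "yes" : "no");

    static int buf[1024];               /* arbitrary window memory */
    MPI_Win win;
    MPI_Win_create(buf, sizeof(buf), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```
Compiled with mpicc and launched with, e.g., mpirun -n 4 ./a.out, a
program like this would be expected to print the four
"MPI_THREAD_MULTIPLE supported: yes" lines and then hit the abort in
MPI_Win_create shown above, assuming the same underlying issue.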
I am running on a single node (my laptop). Both OpenMPI and the
application were compiled using GCC 5.3.0. Naturally, there is no
support for Infiniband available. Should I signal OpenMPI that I am
indeed running on a single node? If so, how can I do that? Can't this be
detected by OpenMPI automatically? The test succeeds if I only request
MPI_THREAD_SINGLE.
OpenMPI 2.0.2 has been configured using only the
--enable-mpi-thread-multiple and --prefix configure parameters. I am
attaching the output of ompi_info.
Please let me know if you need any additional information.
Cheers,
Joseph
--
Dipl.-Inf. Joseph Schuchart
High Performance Computing Center Stuttgart (HLRS)
Nobelstr. 19
D-70569 Stuttgart
Tel.: +49(0)711-68565890
Fax: +49(0)711-6856832
_______________________________________________
users mailing list
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
Joseph Schuchart
2017-02-19 17:48:46 UTC
Permalink
Gilles,

Sure, this time I see more output (it seems there was a typo in the env
variable earlier):

```
$ echo $OMPI_MCA_osc
^pt2pt
$ echo $OMPI_MCA_osc_base_verbose
10
$ mpirun -n 2 ./a.out
[beryl:12905] mca: base: components_register: registering framework osc
components
[beryl:12904] mca: base: components_register: registering framework osc
components
[beryl:12904] mca: base: components_register: found loaded component rdma
[beryl:12904] mca: base: components_register: component rdma register
function successful
[beryl:12904] mca: base: components_register: found loaded component sm
[beryl:12904] mca: base: components_register: component sm has no
register or open function
[beryl:12904] mca: base: components_open: opening osc components
[beryl:12904] mca: base: components_open: found loaded component rdma
[beryl:12904] mca: base: components_open: found loaded component sm
[beryl:12904] mca: base: components_open: component sm open function
successful
[beryl:12905] mca: base: components_register: found loaded component rdma
[beryl:12905] mca: base: components_register: component rdma register
function successful
[beryl:12905] mca: base: components_register: found loaded component sm
[beryl:12905] mca: base: components_register: component sm has no
register or open function
[beryl:12905] mca: base: components_open: opening osc components
[beryl:12905] mca: base: components_open: found loaded component rdma
[beryl:12905] mca: base: components_open: found loaded component sm
[beryl:12905] mca: base: components_open: component sm open function
successful
[beryl:12904] *** An error occurred in MPI_Win_create
[beryl:12904] *** reported by process [2609840129,0]
[beryl:12904] *** on communicator MPI_COMM_WORLD
[beryl:12904] *** MPI_ERR_WIN: invalid window
[beryl:12904] *** MPI_ERRORS_ARE_FATAL (processes in this communicator
will now abort,
[beryl:12904] *** and potentially your MPI job)
```

HTH. If not, I'd be happy to provide you with anything else that might help.
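One generic way to squeeze a little more information out of the failing
call itself (standard MPI error handling, not something suggested in
this thread) is to attach MPI_ERRORS_RETURN to the communicator before
creating the window and print the resulting error string, e.g.:
```
/* Sketch only: make window-creation errors non-fatal so the error class
 * and message can be printed. Call this between MPI_Init_thread and
 * MPI_Finalize; window creation raises errors on the error handler of
 * the communicator passed to MPI_Win_create, hence MPI_COMM_WORLD here. */
#include <mpi.h>
#include <stdio.h>

static void try_win_create(void)
{
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    static int buf[1024];
    MPI_Win win;
    int err = MPI_Win_create(buf, sizeof(buf), sizeof(int),
                             MPI_INFO_NULL, MPI_COMM_WORLD, &win);
    if (err != MPI_SUCCESS) {
        char msg[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(err, msg, &len);
        fprintf(stderr, "MPI_Win_create failed: %s\n", msg);
    } else {
        MPI_Win_free(&win);
    }
}
```
The error string may not say more than MPI_ERR_WIN, but it avoids the
immediate MPI_ERRORS_ARE_FATAL abort while output is being collected
from all ranks.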

Best
Joseph
Post by Gilles Gouaillardet
Joseph,
Would you mind trying again with
export OMPI_MCA_osc=^pt2pt
export OMPI_MCA_osc_base_verbose=10
If it still does not work, then please post the output
Cheers,
Gilles
--
Dipl.-Inf. Joseph Schuchart
High Performance Computing Center Stuttgart (HLRS)
Nobelstr. 19
D-70569 Stuttgart
Tel.: +49(0)711-68565890
Fax: +49(0)711-6856832
_______________________________________________
users mailing list
https://rfd.newmexicoconsortium.org/mailman/listinfo/users