Discussion:
[OMPI users] [EXTERNAL] Re: Using shmem_int_fadd() in OpenMPI\'s SHMEM
Benjamin Brock
2017-11-21 19:47:47 UTC
What version of Open MPI are you trying to use?
Open MPI 2.1.1-2 as distributed by Arch Linux.
Also, could you describe something about your system.
This is all in shared memory on a MacBook Pro; no networking involved.

The seg fault with the code example above looks like this:

[***@shini kmer_hash]$ g++ minimal.cpp -o minimal `shmemcc --showme:link`
[***@shini kmer_hash]$ !shm
shmemrun -n 2 ./minimal
[shini:08284] *** Process received signal ***
[shini:08284] Signal: Segmentation fault (11)
[shini:08284] Signal code: Address not mapped (1)
[shini:08284] Failing at address: 0x18
[shini:08284] [ 0] /usr/lib/libpthread.so.0(+0x11da0)[0x7f06fb763da0]
[shini:08284] [ 1] /usr/lib/openmpi/openmpi/mca_spml_yoda.so(mca_spml_yoda_get+0x7da)[0x7f06e0eef0aa]
[shini:08284] [ 2] /usr/lib/openmpi/openmpi/mca_atomic_basic.so(atomic_basic_lock+0xb2)[0x7f06e08d90d2]
[shini:08284] [ 3] /usr/lib/openmpi/openmpi/mca_atomic_basic.so(mca_atomic_basic_fadd+0x4a)[0x7f06e08d949a]
[shini:08284] [ 4] /usr/lib/openmpi/liboshmem.so.20(shmem_int_fadd+0x90)[0x7f06fc5a7660]
[shini:08284] [ 5] ./minimal(+0x94f)[0x55a5cde7e94f]
[shini:08284] [ 6] /usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f06fb3baf6a]
[shini:08284] [ 7] ./minimal(+0x80a)[0x55a5cde7e80a]
[shini:08284] *** End of error message ***
--------------------------------------------------------------------------
shmemrun noticed that process rank 1 with PID 0 on node shini exited on
signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Cheers,

Ben
Howard Pritchard
2017-11-22 22:21:18 UTC
Hi Ben,

Even on one box, the yoda component doesn't work any more.

If you want to do OpenSHMEM programming on your MacBook Pro (like I do)
and you don't want to set up a VM to use UCX, then you can use
the Sandia OpenSHMEM (SOS) implementation.

https://github.com/Sandia-OpenSHMEM/SOS

You will need to install the MPICH hydra launcher

http://www.mpich.org/downloads/versions/

as SOS needs that for its oshrun launcher.

I use hydra-3.2 on my mac with SOS.

You will also need to install OFI libfabric:

https://github.com/ofiwg/libfabric

I'd suggest installing the OFI 1.5.1 tarball. OFI is also available via
brew, but it's so old that I doubt it will work with recent versions of SOS.

If you'd like to use UCX, you'll need to install it and Open MPI on a VM
running a Linux distro.

Howard
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
Howard Pritchard
2017-11-22 22:32:22 UTC
Hi Ben,

Actually, I did some checking on the brew install for OFI libfabric.
It looks like if your brew is up to date, it will pick up libfabric 1.5.2.

Howard