Discussion:
[OMPI users] users Digest, Vol 3619, Issue 1
Mahdi, Sam
2016-10-06 20:45:13 UTC
Hi Jason,

No, my main problem is with running mpirun -np <any number>: I am unable
to run any command on multiple processors using mpi4py and Open MPI
together. Whenever I type in a command that uses both, I receive no output.
I was wondering whether it is a compatibility issue.
1. Re: centos 7.2 openmpi from repo, stdout issue (Emre Brookes)
2. openmpi and mpi4py compatibility (Mahdi, Sam)
3. Re: openmpi and mpi4py compatibility (Jason Maldonis)
4. Re: openmpi and mpi4py compatibility (Lisandro Dalcin)
5. Re: MPI + system() call + Matlab MEX crashes (Gilles Gouaillardet)
6. Re: MPI + system() call + Matlab MEX crashes (Bennet Fauber)
7. Using Open MPI with multiple versions of GCC and G++ (Aditya)
8. Re: Using Open MPI with multiple versions of GCC and G++
(Jeff Squyres (jsquyres))
9. Re: [EXTERNAL] Using Open MPI with multiple versions of GCC
and G++ (Simon Hammond)
10. Crash during MPI_Finalize (George Reeke)
----------------------------------------------------------------------
Message: 1
Date: Wed, 05 Oct 2016 14:00:28 -0500
Subject: Re: [OMPI users] centos 7.2 openmpi from repo, stdout issue
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Thank you for the sanity check and recommendations.
I will post my results here when resolved.
We did have some kind of stdout/stderr truncation issue a little while
ago, but I don't remember what version it specifically affected.
I would definitely update to at least Open MPI 1.10.4 (lots of bug fixes
since 1.10.0). Better would be to update to Open MPI 2.0.1 -- that's the
current generation and where all of our work is going these days.
$ cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
$ yum list installed | grep openmpi
(1) When I run
$ mpirun -H myhosts -np myprocs executable
the job runs fine and outputs correctly to stdout
(2) When I run
$ mpirun -H myhosts -np myprocs executable > stdout.log
The stdout.log file prematurely ends (without full output)
... but the mpi executable itself seems to keep running forever until
manually terminated with a "kill".
(3) When I run
$ mpirun -H myhosts -np myprocs executable | cat > stdout.log
the job runs fine and outputs correctly to the stdout.log file
I tried playing with a 'stdbuf' prefix to the command, but this didn't
seem to help.
I would like (2) to work, but have resorted to (3).
I tried digging around in the parameters after seeing
https://github.com/open-mpi/ompi/issues/341
and thinking it might be something similar, but didn't see any poll or
epoll in .conf
I am hesitant to try to compile from scratch and get away from the repo
release cycle.
Is this a known bug?
If so, and if it has been fixed, would you recommend I install the
latest stable rpm of 1.10.4-1 from
https://www.open-mpi.org/software/ompi/v1.10/ ?
Thanks,
Emre
_______________________________________________
users mailing list
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
------------------------------
Message: 2
Date: Wed, 5 Oct 2016 12:29:28 -0700
Subject: [OMPI users] openmpi and mpi4py compatibility
Content-Type: text/plain; charset="utf-8"
Hi everyone,
I had a quick question regarding the compatibility of openmpi and mpi4py. I
have openmpi 1.7.3 and mpi4py 1.3.1. I know these are older versions of
each, but I was having some problems running a program that uses mpi4py and
openmpi, and I wanted to make sure it wasn't a compatibility issue between
the 2 versions of these programs.
Sincerely,
Sam
------------------------------
Message: 3
Date: Wed, 5 Oct 2016 14:51:04 -0500
Subject: Re: [OMPI users] openmpi and mpi4py compatibility
Content-Type: text/plain; charset="utf-8"
Hi Sam,
I am not a developer but I am using mpi4py with openmpi-1.10.2. For that
version, most of the functionality works, but I think there are some issues
with the mpi_spawn commands. Are you using the spawn commands?
I have no experience with the versions you are using, but I thought I'd
chime in just in case you ran into a similar issue as me.
Best,
Jason
Jason Maldonis
Research Assistant of Professor Paul Voyles
Materials Science Grad Student
University of Wisconsin, Madison
1509 University Ave, Rm 202
Madison, WI 53706
------------------------------
Message: 4
Date: Wed, 5 Oct 2016 23:24:33 +0300
Subject: Re: [OMPI users] openmpi and mpi4py compatibility
Content-Type: text/plain; charset="utf-8"
Hi, I'm the author of mpi4py. Could you elaborate on the issues you
experienced? I would start by disabling threaded MPI initialization
(MPI_Init_thread()) in mpi4py. For that, use:
    import mpi4py.rc
    mpi4py.rc.threaded = False
    from mpi4py import MPI
But you have to do it at the VERY BEGINNING of your code; more precisely,
the first two lines must come before any attempt to "from mpi4py
import MPI".
PS: Any chance you can use a newer version of mpi4py, maybe even a git
checkout of the master branch?
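A concrete sketch of that ordering (the file name is hypothetical; assumes mpi4py and an MPI launcher are installed):

```python
# sanity_check.py -- minimal mpi4py smoke test with threaded init disabled.
# The two rc lines MUST run before the first "from mpi4py import MPI".
import mpi4py.rc
mpi4py.rc.threaded = False

from mpi4py import MPI  # MPI initialization happens on this import

comm = MPI.COMM_WORLD
print("rank %d of %d" % (comm.Get_rank(), comm.Get_size()))
```

Running it with, e.g., `mpirun -np 4 python sanity_check.py` should print one line per rank; if nothing appears even with `-np 1`, the problem is in the MPI/Python installation rather than in the application code.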
--
Lisandro Dalcin
============
Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/
4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 0109
Thuwal 23955-6900, Kingdom of Saudi Arabia
http://www.kaust.edu.sa
Office Phone: +966 12 808-0459
------------------------------
Message: 5
Date: Thu, 6 Oct 2016 09:03:55 +0900
Subject: Re: [OMPI users] MPI + system() call + Matlab MEX crashes
Content-Type: text/plain; charset="windows-1252"; Format="flowed"
Juraj,
if I understand correctly, the "master" task calls MPI_Init(), and then
fork&execs matlab.
In some cases (lack of hardware support), fork cannot even work, but
let's assume it is fine for now.
Then, if I read between the lines, matlab calls a mexFunction that
calls MPI_Init().
As far as I am concerned, that cannot work.
The blocker is that a child cannot call MPI_Init() if its parent already
called MPI_Init().
Fortunately, you have some options :-)
1) start matlab from mpirun.
for example, if you want one master, two slaves and matlab, you can do
something like
mpirun -np 1 master : -np 1 matlab : -np 2 slave
2) MPI_Comm_spawn matlab
master can MPI_Comm_spawn() matlab, and then matlab can merge the parent
communicator,
and communicate to master and slaves
3) use the approach suggested by Dmitry
/* this is specific to matlab, and i have no experience with it */
One last point: MPI_Init() can be invoked only once per task
(e.g. if your mexFunction does
MPI_Init(); work(); MPI_Finalize();
then it can be invoked only once per mpirun).
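A minimal sketch of option 2, the spawn approach (the program name "matlab_wrapper" is hypothetical, e.g. a shell script that starts matlab; untested, assumes the MPI installation supports dynamic process management):

```c
/* spawn_master.c -- master spawns one child process and merges the
 * intercommunicator so parent and child share a single communicator. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm intercomm, merged;
    MPI_Init(&argc, &argv);

    /* "matlab_wrapper" would run: matlab -nosplash ... -r "interface" */
    MPI_Comm_spawn("matlab_wrapper", MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &intercomm, MPI_ERRCODES_IGNORE);

    /* high = 0: parent ranks are ordered first in the merged communicator */
    MPI_Intercomm_merge(intercomm, 0, &merged);

    /* ... exchange data with the matlab side over 'merged' ... */

    MPI_Comm_free(&merged);
    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}
```

On the matlab/mex side, the mexFunction would call MPI_Init(), MPI_Comm_get_parent(), and MPI_Intercomm_merge(parent, 1, &merged) to join the same merged communicator.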
Cheers,
Gilles
Hi Juraj,
Although the MPI infrastructure may technically support forking, it is
known that not all system resources can correctly replicate themselves
into a forked process. For example, forking inside an MPI program with an
active CUDA driver will result in a crash.
Why not compile the MATLAB code down into a native library and link it in?
See:
https://www.mathworks.com/matlabcentral/answers/98867-how-do-i-create-a-c-shared-library-from-mex-files-using-the-matlab-compiler?requestedDomain=www.mathworks.com
Kind regards,
- Dmitry Mikushin.
Hello,
I have an application in C++ (main.cpp) that is launched with
multiple processes via mpirun. The master process calls matlab via
system('matlab -nosplash -nodisplay -nojvm -nodesktop -r
"interface"'), which executes a simple script interface.m that calls
a mexFunction (mexsolve.cpp) from which I try to set up
communication with the rest of the processes launched at the
beginning together with the master process. When I run the
application, I get:
1) crash at MPI_Init() in the mexFunction() on cluster machine
with Linux 4.4.0-22-generic
2) error in MPI_Send() shown below on local machine with
Linux 3.10.0-229.el7.x86_64
[archimedes:31962] shmem: mmap: an error occurred while
determining whether or not
mem_pool.archimedes
could be created.
[archimedes:31962] create_and_attach: unable to create shared
memory BTL coordinating structure :: size 134217728
[archimedes:31962] shmem: mmap: an error occurred while
determining whether or not
segment.archimedes.0
could be created.
[archimedes][[58444,1],0][../../../../../opal/mca/btl/tcp/
btl_tcp_endpoint.c:800:mca_btl_tcp_endpoint_complete_connect]
connect() to <MY_IP> failed: Connection refused (111)
mpirun --mca mpi_warn_on_fork 0 --mca btl_openib_want_fork_support
1 -np 2 -npernode 1 ./main
I have openmpi-2.0.1 configured with --prefix=${INSTALLDIR}
--enable-mpi-fortran=all --with-pmi --disable-dlopen
https://github.com/goghino/matlabMpiC
Thanks for any suggestions!
Juraj
------------------------------
Message: 6
Date: Wed, 5 Oct 2016 20:38:25 -0400
Subject: Re: [OMPI users] MPI + system() call + Matlab MEX crashes
Content-Type: text/plain; charset=UTF-8
Matlab may have its own MPI installed; it definitely does if you have
the Parallel Computing Toolbox, and if so, that could be causing
problems. If you can, you might consider compiling your Matlab
application into a standalone executable, then calling that from your own
program. That bypasses the Matlab user interface and may prove more
tractable. See the documentation for mcc, if you have that toolbox:
http://www.mathworks.com/help/compiler/mcc.html
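That workflow might look like the following (file name taken from the thread; untested, requires a MATLAB Compiler license):

```shell
# Compile interface.m into a standalone executable with the MATLAB Compiler.
# mcc -m produces ./interface plus a run_interface.sh launcher script.
mcc -m interface.m

# The C++ master can then invoke the standalone binary instead of matlab:
#   system("./run_interface.sh $MCRROOT");
```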
-- bennet
------------------------------
Message: 7
Date: Thu, 6 Oct 2016 19:26:15 +0530
Subject: [OMPI users] Using Open MPI with multiple versions of GCC and
G++
Content-Type: text/plain; charset="utf-8"
Hello,
I'm a senior year Computer Science student working on Parallel Clustering
Algorithms at BITS Pilani, India. I have a few questions about using mpicc
and mpicxx with multiple versions of gcc / g++.
I am using Ubuntu 12.04 equipped with gcc 4.6.4. The currently installed
mpicc is bound to gcc 4.6.4. I want mpicc to be bound to the gcc-5 that I
have installed on my PC.
Is there a way to do the binding to gcc via a compiler flag or something
of that sort?
PS: Please do reply if you have a solution. I am unable to run a hybrid
code on my PC because of this issue.
Regards,
Aditya.
------------------------------
Message: 8
Date: Thu, 6 Oct 2016 14:12:23 +0000
Subject: Re: [OMPI users] Using Open MPI with multiple versions of GCC
and G++
Content-Type: text/plain; charset="us-ascii"
Especially with C++, the Open MPI team strongly recommends building
Open MPI with the target versions of the compilers that you want to use.
Unexpected things can happen when you start mixing versions of compilers
(particularly across major versions of a compiler). To be clear: compilers
are *supposed* to be compatible across multiple versions (i.e., compile a
library with one version of the compiler, and then use that library with an
application compiled by a different version of the compiler), but a)
there are other issues, such as C++ ABI issues and other run-time
bootstrapping, that can complicate things, and b) bugs in forward and
backward compatibility happen.
The short answer is in this FAQ item:
https://www.open-mpi.org/faq/?category=mpi-apps#override-wrappers-after-v1.0
Substituting the gcc 5 compiler may work just fine.
But the *safer* answer is that you might want to re-build Open MPI with
the specific target compiler.
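In short, the wrapper compilers can be repointed through environment variables; a sketch (compiler names are assumptions, adjust to your system):

```shell
# Tell the Open MPI wrapper compilers to use gcc-5/g++-5 for this shell.
export OMPI_CC=gcc-5
export OMPI_CXX=g++-5

# Verify which underlying compiler the wrappers will now invoke:
mpicc  --showme:command
mpicxx --showme:command
```

The wrappers then drive gcc-5, while the Open MPI library itself remains built with the old compiler, which is exactly the caveat described above.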
--
Jeff Squyres
------------------------------
Message: 9
Date: Thu, 6 Oct 2016 08:15:28 -0600
Subject: Re: [OMPI users] [EXTERNAL] Using Open MPI with multiple
versions of GCC and G++
Content-Type: text/plain; charset="utf-8"
Can you try setting the environment variable OMPI_CXX=<put the path to
gcc-5 here>, then running
mpicxx -v
and seeing what version it says it is running? You may have to be careful
mixing versions that are too far apart.
S.
Si Hammond
Scalable Computer Architectures
Center for Computing Research
Sandia National Laboratories, NM, USA
------------------------------
Message: 10
Date: Thu, 06 Oct 2016 13:53:59 -0400
Subject: [OMPI users] Crash during MPI_Finalize
Content-Type: text/plain; charset="iso-8859-1"
Dear colleagues,
I have a parallel MPI application written in C that works normally in
a serial version, and in the parallel version in the sense that all
numerical output is correct. When it tries to shut down, it gives the
following console output:
-----Begin quoted console text-----
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status,
thus causing the job to be terminated. The first process to do so was:
Process name: [[51524,1],0]
Exit code: 13
-----End quoted console text-----
The Process name given is not the number of any Linux process.
The Exit code given seems to be any number in the range 12 to 17.
The core dumps produced do not have usable backtrace information.
There is no output on stderr (besides my debug messages).
The last message written by rank 0 node on stdout and flushed is lost.
I cannot determine the cause of the problem.
OS RHEL 6.8, compiler gcc 4.4.7 with -g, no optimization
Version of MPI (RedHat package): openmpi-1.10-1.10.2-2.el6.x86_64
-n 3 cnsPn < v8tin/dan
cnsP0 is a master code that reads a control file (specified after the
'<' on the command line). The other executables (cnsPn) only send and
receive messages and do math, no file IO. I get same results with
3 or 4 compute nodes.
Early in startup, another process is started via MPI_Comm_spawn.
I suspect this is relevant to the problem, although simple test
programs using the same setup complete normally. This process,
andmsg, receives status or debug information asynchronously via
messages from the other processes and writes them to stderr.
I have tried many versions of the shutdown code, all with the same
result. Here is one version (debug writes use fwrite() and are not
shown here):
/* --- main application side --- */
/* Everything works OK up to here (stdout and debug output). */
int rc, ival = 0;
/* In next line, NC.dmsgid is rank # of andmsg process and
 * NC.commd is intercommunicator to it. andmsg counts these
 * shutdown messages, one from each app node. */
rc = MPI_Send(&ival, 1, MPI_INT, NC.dmsgid, SHUTDOWN_ANDMSG, NC.commd);
/* This message confirms that andmsg got 4 SHUTDOWN messages.
 * "is_host(NC.node)" returns 1 if this is the rank 0 node. */
if (is_host(NC.node)) {
    MPI_Recv(&ival, 1, MPI_INT, NC.dmsgid, CLOSING_ANDMSG,
             NC.commd, MPI_STATUS_IGNORE);
}
/* Results are similar with or without this barrier. Debug lines
 * written on stderr from all nodes after barrier appear OK. */
rc = MPI_Barrier(NC.commc); /* NC.commc is original world comm */
/* Behavior is same with or without this extra message exchange,
 * which I added to keep andmsg from terminating before the
 * barrier among the other nodes completes. */
if (is_host(NC.node)) {
    rc = MPI_Send(&ival, 1, MPI_INT, NC.dmsgid, SHUTDOWN_ANDMSG, NC.commd);
}
/* Behavior is same with or without this disconnect */
rc = MPI_Comm_disconnect(&NC.commd);
rc = MPI_Finalize();
exit(0);

/* --- andmsg side --- */
if (num2stop <= 0) { /* Countdown of shutdown messages received */
    int rc;
    /* This message confirms to main app that shutdown messages
     * were received from all nodes. */
    rc = MPI_Send(&num2stop, 1, MPI_INT, NC.hostid, CLOSING_ANDMSG, NC.commd);
    /* Receive extra synch message commented above */
    rc = MPI_Recv(&sdmsg, 1, MPI_INT, NC.hostid, MPI_ANY_TAG,
                  NC.commd, MPI_STATUS_IGNORE);
    sleep(1); /* Results are same with or without this sleep */
    /* Results are same with or without this disconnect */
    rc = MPI_Comm_disconnect(&NC.commd);
    rc = MPI_Finalize();
    exit(0);
}
I would much appreciate any suggestions how to debug this.
From the suggestions at the community help web page, here is more
information. The config.log file, bzipped version, is attached.
ompi_info --all bzipped output is attached.
I am not sending information from other nodes or network config; for
test purposes, all processes are running on one node, my laptop
with an i7 processor. I set the "-mca btl_tcp_if_include lo" parameter
earlier when I got an error message about a refused connection
(that my code never asked for in the first place). This got rid
of that error message, but the application still fails and dumps.
Thanks,
George Reeke
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config.log.bz2
Type: application/x-bzip
Size: 17618 bytes
Desc: not available
URL: <https://rfd.newmexicoconsortium.org/mailman/private/users/
attachments/20161006/d899b19c/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ompi_info.output.bz2
Type: application/x-bzip
Size: 22964 bytes
Desc: not available
URL: <https://rfd.newmexicoconsortium.org/mailman/private/users/
attachments/20161006/d899b19c/attachment-0001.bin>
------------------------------
End of users Digest, Vol 3619, Issue 1
**************************************