Hello all,
After correcting the configuration, mpirun (v1.10.2) works fine when all TCP ports are open. I can ssh to all hosts without a password.
That brings me back to my first question: how do I specify the ports for MPI communication?
I opened ports 40000-50000 for outgoing traffic. When I run:
mpirun --mca btl_tcp_port_min_v4 40040 --mca btl_tcp_port_range_v4 10 --mca oob_tcp_static_ipv4_ports 40020 --host <IP1>,<IP2> hostname
it works, but not every time. The same happens when I run: mpirun --mca oob_tcp_static_ipv4_ports 40020 --host <IP1>,<IP2> hostname
Strangely, sometimes I get output and sometimes it just hangs. Did I miss something?
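A minimal sketch of the port arithmetic behind the command above, in case it helps with checking the firewall rules. It assumes that btl_tcp_port_range_v4 is a count of ports starting at btl_tcp_port_min_v4, so a minimum of 40040 with a range of 10 would cover 40040-40049; please confirm this with ompi_info --all on your version. The helper names btl_port_span and span_in_window are hypothetical and not part of Open MPI.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): print the TCP port span implied
# by btl_tcp_port_min_v4 and btl_tcp_port_range_v4.
# Assumption: the range is a count of ports starting at the minimum.
btl_port_span() {
    min=$1
    count=$2
    echo "${min}-$((min + count - 1))"
}

# Hypothetical helper: succeed only if the span min..min+count-1 lies inside
# the firewall window lo..hi.
span_in_window() {
    min=$1; count=$2; lo=$3; hi=$4
    [ "$min" -ge "$lo" ] && [ $((min + count - 1)) -le "$hi" ]
}

# Values taken from the mpirun command line and the opened range 40000-50000.
btl_port_span 40040 10
span_in_window 40040 10 40000 50000 && echo "BTL ports fit the firewall window"
```

Since every TCP connection needs one side to accept it, it may also be worth checking that the same ports are open for incoming traffic on each host, not only for outgoing traffic.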
Best,
Ping
From: users [mailto:users-***@open-mpi.org] On Behalf Of Gilles Gouaillardet
Sent: Friday, 3 June 2016 00:14
To: Open MPI Users
Subject: Re: [OMPI users] Firewall settings for MPI communication
The syntax is
configure --enable-mpirun-prefix-by-default --prefix=<path to OpenMPI> ...
all hosts must be able to ssh to each other without a password.
that means you need to generate a user ssh key pair on all hosts, add your public keys to the list of authorized keys, and ssh to all hosts in order to populate your known_hosts file
(ssh requires you to confirm the host public key the very first time you ssh to a new host)
iirc, that can be automated with ssh-keyscan.
when ssh is fully configured, mpirun should work just fine
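The ssh steps above can be sketched roughly as follows. This only prints the commands so they can be reviewed before being run by hand on each host. The host names node1 and node2, the key path ~/.ssh/id_rsa, the empty passphrase, and the availability of ssh-copy-id are all assumptions, and print_ssh_setup is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: print the passwordless-ssh setup commands for a list of hosts.
# Nothing is executed here; review the output, then run it on each host.
# Assumptions: default key path ~/.ssh/id_rsa, and ssh-copy-id is installed.
print_ssh_setup() {
    # Generate a key pair once, with an empty passphrase.
    echo "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
    for host in "$@"; do
        # Append the public key to the remote authorized_keys file.
        echo "ssh-copy-id $host"
        # Record the host key so the first-connection prompt does not appear.
        echo "ssh-keyscan -H $host >> ~/.ssh/known_hosts"
    done
}

# node1 and node2 are placeholder host names.
print_ssh_setup node1 node2
```

Printing instead of executing is a deliberate choice here: key generation and known_hosts changes are easier to audit when they are reviewed first.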
Cheers,
Gilles
On Friday, June 3, 2016, Ping Wang <***@asc-s.de> wrote:
Hi,
Thank you, Gilles, for your suggestion. I tried: mpirun --prefix <path to Open MPI> --host <public IP> hostname, and it works.
I'm sure both IPs belong to the VM on which mpirun is running, and they are unique.
I also configured Open MPI with --enable-mpirun-prefix-by-default, but I still need to add --prefix <path to Open MPI> to get mpirun to work.
I used: ./configure --enable-mpirun-prefix-by-default ="<path to Open MPI> "
make
make install
Did I miss something, or did I misunderstand how to configure Open MPI?
When I run: ssh <internal/public IP> `which orted`
The output is: Warning: Permanently added '<internal/public IP>' (ECDSA) to the list of known hosts.
/usr/local/bin/orted
Is that all right?
Cheers,
Ping
From: users [mailto:users-***@open-mpi.org] On Behalf Of Gilles Gouaillardet
Sent: Thursday, 2 June 2016 17:06
To: Open MPI Users
Subject: Re: [OMPI users] Firewall settings for MPI communication
are you saying both IPs are the ones of the VM on which mpirun is running ?
orted is only launched on all the machines *except* the one running mpirun.
can you double/triple check the IPs are ok and unique ?
for example, mpirun --host <internal IP> /sbin/ifconfig -a
can you also make sure Open MPI is installed on all your VMs in the same directory ?
also make sure Open MPI has all the dependencies on all the VMs
ssh xxx ldd `which orted`
should show no missing dependency
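A small sketch of how the ldd check could be scripted per host. It relies on ldd marking unresolved libraries with the text "not found". The helper name missing_deps is hypothetical, and the sample ldd lines in the demonstration are made up.

```shell
#!/bin/sh
# Hypothetical helper: read ldd output on stdin and print unresolved libraries.
# Exit status is 0 when nothing is missing, 1 otherwise.
missing_deps() {
    if grep "not found"; then
        return 1
    fi
    return 0
}

# Intended use (needs working ssh, so it is left commented out here):
#   ssh <host> ldd `which orted` | missing_deps || echo "<host>: missing libraries"

# Self-contained demonstration with made-up ldd output.
printf '%s\n' \
    '    libopen-rte.so.12 => not found' \
    '    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000000000)' \
    | missing_deps || echo "missing libraries detected"
```

Note that `which orted` is evaluated on the local machine, as in the command above, so this check assumes Open MPI is installed in the same directory on all VMs.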
generally speaking, I recommend you configure Open MPI with
--enable-mpirun-prefix-by-default
you can also try to replace
mpirun
with
`which mpirun`
or
mpirun --prefix <path to Open MPI>
Cheers,
Gilles
On Thursday, June 2, 2016, Ping Wang <***@asc-s.de> wrote:
Hi,
I've installed Open MPI v1.10.2. Every VM on the cloud has two IPs (internal IP, public IP).
When I run: mpirun --host <internal IP> hostname, the output is the hostname of the VM.
But when I run: mpirun --host <public IP> hostname, the output is
bash: orted: command not found
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
Both IPs belong to the VM where MPI is running. Did I do something wrong in the configuration?
Thanks for any help.
Ping
-----Original Message-----
From: users [mailto:users-***@open-mpi.org] On Behalf Of Jeff Squyres (jsquyres)
Sent: Wednesday, 1 June 2016 15:02
To: Open MPI User's List
Subject: Re: [OMPI users] Firewall settings for MPI communication
In addition, you might want to consider upgrading to Open MPI v1.10.x (v1.6.x is fairly ancient).
Post by Gilles Gouaillardet
which network are your VMs using for communications ?
if this is tcp, then you also have to specify a restricted set of
allowed ports for the tcp btl
that would be something like
mpirun --mca btl_tcp_dynamic_ports 49990-50010 ...
please double check the Open MPI 1.6.5 parameter and syntax with
ompi_info --all (or check the archives, I think I posted the correct
command line a few weeks ago)
Cheers,
Gilles
I'm using Open MPI 1.6.5 to run OpenFOAM in parallel on several VMs on
a cloud. mpirun hangs without any error messages. I think this is a
firewall issue, because when I open all the TCP ports (1-65535) in the
security group of the VMs, mpirun works well. However, I was advised to
open as few ports as possible, so I have to limit MPI to a
range of ports. I opened the port range 49990-50010 for MPI
communication and used the command
mpirun --mca oob_tcp_dynamic_ports 49990-50010 -np 4 --hostfile machines simpleFoam -parallel
But it still hangs. How can I specify a port range that Open MPI will use? I appreciate any help you can provide.
Best,
Ping Wang
------------------------------------------------------
Ping Wang
Automotive Simulation Center Stuttgart e.V.
Nobelstraße 15
D-70569 Stuttgart
Phone: +49 711 699659-14
Fax: +49 711 699659-29
Web: http://www.asc-s.de
Social Media: /asc.stuttgart
------------------------------------------------------
_______________________________________________
users mailing list
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
http://www.open-mpi.org/community/lists/users/2016/06/29340.php
--
Jeff Squyres
***@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Link to this post: http://www.open-mpi.org/community/lists/users/2016/06/29342.php