Discussion:
[OMPI users] Firewall settings for MPI communication
Ping Wang
2016-06-01 11:35:38 UTC
I'm using Open MPI 1.6.5 to run OpenFOAM in parallel on several VMs in a cloud. mpirun hangs without any error message. I think this is a firewall issue, because mpirun works well when I open all TCP ports (1-65535) in the security group of the VMs. However, I was advised to open as few ports as possible, so I have to limit MPI to a range of ports. I opened the port range 49990-50010 for MPI communication and used the command

mpirun --mca oob_tcp_dynamic_ports 49990-50010 -np 4 --hostfile machines simpleFoam -parallel

But it still hangs. How can I specify a port range that Open MPI will use? I appreciate any help you can provide.



Best,

Ping Wang




------------------------------------------------------

Ping Wang

Automotive Simulation Center Stuttgart e.V.

Nobelstraße 15

D-70569 Stuttgart

Phone: +49 711 699659-14

Fax: +49 711 699659-29

E-Mail: ***@asc-s.de

Web: http://www.asc-s.de

Social Media: http://www.facebook.com/asc.stuttgart

------------------------------------------------------
Gilles Gouaillardet
2016-06-01 11:46:41 UTC
Which network are your VMs using for communication?
If it is TCP, then you also have to specify a restricted set of allowed
ports for the TCP BTL.

That would be something like
mpirun --mca btl_tcp_dynamic_ports 49990-50010 ...

Please double-check the Open MPI 1.6.5 parameter name and syntax with
ompi_info --all
(or check the archives; I think I posted the correct command line a few
weeks ago).
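
One way to follow the advice to double-check parameter names is to filter the `ompi_info --all` output for TCP port settings. The following is only a minimal sketch: the helper names `filter_tcp_port_params` and `list_tcp_port_params` are made up for illustration, and the grep pattern is an assumption about how the port-related parameters are spelled in a given release.

```shell
# Keep only lines that mention a TCP BTL or OOB parameter with "port" in its name.
# "|| true" keeps the exit status at 0 when nothing matches.
filter_tcp_port_params() {
    grep -E '(btl|oob)_tcp[A-Za-z0-9_]*port' || true
}

# Hypothetical wrapper: run ompi_info (must be in PATH) and filter its output.
list_tcp_port_params() {
    ompi_info --all | filter_tcp_port_params
}
```

Running `list_tcp_port_params` on each VM should show which port parameters the installed version actually accepts before they are passed to mpirun with --mca.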

Cheers,

Gilles
Jeff Squyres (jsquyres)
2016-06-01 13:02:22 UTC
In addition, you might want to consider upgrading to Open MPI v1.10.x (v1.6.x is fairly ancient).
_______________________________________________
users mailing list
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/06/29340.php
--
Jeff Squyres
***@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Ping Wang
2016-06-02 14:44:26 UTC
Hi,

I've installed Open MPI v1.10.2. Every VM on the cloud has two IPs (internal IP, public IP).
When I run: mpirun --host <internal IP> hostname, the output is the hostname of the VM.
But when I run: mpirun --host <public IP> hostname, the output is

bash: orted: command not found
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:

* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default

* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.

* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.

* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.

* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).

Both IPs belong to the VM on which mpirun is running. Did I do something wrong in the configuration?

Thanks for any help.

Ping
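
The first cause listed in the error output concerns PATH on the remote side. A minimal sketch for checking it is below, assuming the remote login shell is reachable over ssh. The function names, and the `host` and `bindir` arguments, are hypothetical and not part of Open MPI.

```shell
# Succeed if directory $1 is listed in the colon-separated PATH string $2.
path_has_dir() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical check: ask the non-interactive remote shell for its PATH
# (single quotes keep $PATH from being expanded locally) and look for bindir.
# Returns 2 if ssh itself fails.
remote_path_has_dir() {
    host=$1
    bindir=$2
    remote_path=$(ssh "$host" 'echo "$PATH"') || return 2
    path_has_dir "$bindir" "$remote_path"
}
```

For example, `remote_path_has_dir <public IP> /usr/local/bin` would report whether a non-interactive ssh session to that address could find binaries installed under /usr/local/bin.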

_______________________________________________
users mailing list
***@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/06/29342.php
Ralph Castain
2016-06-02 14:58:01 UTC
Possibly. Did you configure with --enable-orterun-prefix-by-default, as the error message suggests?
_______________________________________________
users mailing list
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/06/29349.php
Gilles Gouaillardet
2016-06-02 15:06:24 UTC
Are you saying that both IPs belong to the VM on which mpirun is running?
orted is launched only on the machines other than the one running mpirun.

Can you double/triple check that the IPs are OK and unique?
For example, mpirun --host <internal IP> /sbin/ifconfig -a
Can you also make sure Open MPI is installed in the same directory on all
your VMs?
Also make sure Open MPI has all its dependencies on all the VMs:
ssh xxx ldd `which orted`
should show no missing dependency.

Generally speaking, I recommend you configure Open MPI with
--enable-mpirun-prefix-by-default

you can also try to replace
mpirun
with
`which mpirun`
or
mpirun --prefix <path to Open MPI>
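
The dependency check above can be scripted. This is a sketch only, with hypothetical helper names. One detail worth noting: unquoted backticks in an ssh command line are expanded by the local shell, so the sketch single-quotes the remote command to make `orted` resolve on the remote VM.

```shell
# Print the names of shared libraries that ldd reports as "not found".
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Hypothetical wrapper: resolve orted on the remote side (the single quotes stop
# the local shell from expanding the command substitution) and list missing libs.
remote_missing_libs() {
    ssh "$1" 'ldd "$(command -v orted)"' | missing_libs
}
```

An empty result from `remote_missing_libs <host>` would correspond to "no missing dependency" in the advice above, provided orted was found at all.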

Cheers,

Gilles
Ping Wang
2016-06-02 16:35:18 UTC
Hi,



Thank you, Gilles, for your suggestion. I tried: mpirun --prefix <path to Open MPI> --host <public IP> hostname, and it works.

I'm sure both IPs belong to the VM on which mpirun is running, and they are unique.

I also configured Open MPI with --enable-mpirun-prefix-by-default, but I still need to add --prefix <path to Open MPI> to get mpirun to work.

I used: ./configure --enable-mpirun-prefix-by-default ="<path to Open MPI> "
make
make install

Did I miss something, or did I misunderstand how to configure Open MPI?

When I run: ssh <internal/public IP> `which orted`

the output is: Warning: Permanently added '<internal/public IP>' (ECDSA) to the list of known hosts.
/usr/local/bin/orted

Is that all right?



Cheers,

Ping





Gilles Gouaillardet
2016-06-02 22:13:46 UTC
The syntax is
configure --enable-mpirun-prefix-by-default --prefix=<path to OpenMPI> ...

All hosts must be able to ssh to each other without a password.
That means you need to generate a user ssh key pair on all hosts, add your
public keys to the list of authorized keys, and ssh to all hosts in order
to populate your known hosts
(ssh requires you to confirm host public keys the very first time you ssh to a
new host).
IIRC, that can be automated with ssh-keyscan.

When ssh is fully configured, mpirun should work just fine.
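
The ssh preparation described above might be sketched as follows. The function name `ssh_setup_cmds` is hypothetical, and the sketch only prints the commands for one originating host so they can be reviewed first; the same steps would have to be repeated from every host. It also assumes that accepting host keys via ssh-keyscan without manual verification is acceptable in this environment.

```shell
# Print (do not run) the commands that prepare passwordless ssh from this
# host to every host given as an argument.
ssh_setup_cmds() {
    # Create a key pair with an empty passphrase only if none exists yet.
    echo "[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
    for host in "$@"; do
        # Record the host key up front so ssh does not prompt on first contact.
        echo "ssh-keyscan -H $host >> ~/.ssh/known_hosts"
        # Append the local public key to the remote authorized_keys file.
        echo "ssh-copy-id $host"
    done
}
```

After reviewing the output of, say, `ssh_setup_cmds <IP1> <IP2>`, it could be piped to `sh` to execute it. Since each VM in this thread has both an internal and a public IP, both addresses would probably need to be listed.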

Cheers,

Gilles
Ping Wang
2016-06-07 10:57:55 UTC
Hello all,



After the correct configuration, mpirun (v1.10.2) works fine when all TCP ports are open. I can ssh to all hosts without a password.

That brings me back to my first question: how do I specify the ports for MPI communication?

I opened ports 40000-50000 for outgoing traffic. When I run:

mpirun --mca btl_tcp_port_min_v4 40040 --mca btl_tcp_port_range_v4 10 --mca oob_tcp_static_ipv4_ports 40020 --host <IP1>,<IP2> hostname

it works, but not every time. The same happens when I run mpirun --mca oob_tcp_static_ipv4_ports 40020 --host <IP1>,<IP2> hostname

It is strange that sometimes I get output and sometimes it just hangs. Did I miss something?
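
To sanity-check the numbers in the command above against the firewall rule, a small sketch follows. It assumes, as the parameter names suggest, that btl_tcp_port_range_v4 is a count of ports starting at btl_tcp_port_min_v4; that reading should be confirmed with ompi_info. The function names are hypothetical, and the sketch does not explain the intermittent hang.

```shell
# Print the inclusive port interval covered by a minimum port and a port count.
btl_port_interval() {
    min=$1
    count=$2
    echo "$min-$((min + count - 1))"
}

# Succeed if the interval lo-hi ($1-$2) lies inside the firewall
# interval fw_lo-fw_hi ($3-$4).
interval_allowed() {
    [ "$1" -ge "$3" ] && [ "$2" -le "$4" ]
}
```

Under that assumption, `btl_port_interval 40040 10` gives the interval used by the TCP BTL in the command above, and `interval_allowed 40040 40049 40000 50000` checks it against the opened range 40000-50000. The same check would have to hold for whichever traffic direction the security group actually filters.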



Best,

Ping





Von: users [mailto:users-***@open-mpi.org] Im Auftrag von Gilles Gouaillardet
Gesendet: Freitag, 3. Juni 2016 00:14
An: Open MPI Users
Betreff: Re: [OMPI users] Firewall settings for MPI communication



The syntax is

configure --enable-mpirun-prefix-by-default --prefix=<path to OpenMPI> ...



all hosts must be able to ssh each other passwordless.

that means you need to generate a user ssh key pair on all hosts, add your public keys to the list of authorized keys, and ssh to all hosts in order to populate your known hosts

(ssh requires you confirm host public keys the very first time you ssh to a new host)

iirc, that can be automated with ssh-keyscan.



when ssh is fully configured, mpirun should work just fine



Cheers,



Gilles


On Friday, June 3, 2016, Ping Wang < <mailto:***@asc-s.de> ***@asc-s.de> wrote:

Hi,



thank you Gilles for your suggestion. I tried: mpirun --prefix <path to Open MPI> --host <public IP> hostname, then it works.

I’m sure both IPs are the ones of the VM on which mpirun is running, and they are unique.



I also configured Open MPI with --enable-mpirun-prefix-by-default, but I still need to add --prefix <path to Open MPI> to get mpirun work.

I used: ./configure --enable-mpirun-prefix-by-default ="<path to Open MPI> "
make
make install

Did I miss something or I misunderstood the way to configure Open MPI?



When I run: ssh < internal/public IP > `which orted`

The output is: Warning: Permanently added < internal/public IP > ' (ECDSA) to the list of known hosts.
/usr/local/bin/orted

Is it all right?



Cheers,

Ping





Von: users [mailto: <javascript:_e(%7B%7D,'cvml','users-***@open-mpi.org');> users-***@open-mpi.org] Im Auftrag von Gilles Gouaillardet
Gesendet: Donnerstag, 2. Juni 2016 17:06
An: Open MPI Users
Betreff: Re: [OMPI users] Firewall settings for MPI communication



are you saying both IP are the ones of the VM on which mpirun is running ?

orted is only launched on all the machines *except* the one running mpirun.



can you double/triple check the IPs are ok and unique ?

for example, mpirun --host <internal IP> /sbin/ifconfig -a

can you also make sure Open MPI is installed on all your VMs in the same directory ?

also make sure Open MPI has all the dependencies on all the VMs

ssh xxx ldd `which orted`

should show no missing dependency



generally speaking, I recommend you configure Open MPI with

--enable-mpirun-prefix-by-default



you can also try to replace

mpirun

with

`which mpirun`

or

mpirun --prefix <path to Open MPI>



Cheers,



Gilles

On Thursday, June 2, 2016, Ping Wang < <javascript:_e(%7B%7D,'cvml','***@asc-s.de');> ***@asc-s.de> wrote:

Hi,

I've installed Open MPI v1.10.2. Every VM on the cloud has two IPs (internal IP, public IP).
When I run: mpirun --host <internal IP> hostname, the output is the hostname of the VM.
But when I run: mpirun --host <public IP> hostname, the output is

bash: orted: command not found
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:

* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default

* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.

* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.

* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.

* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).

Both IPs belong to the VM where mpirun is running. Did I do something wrong in the configuration?

Thanks for any help.

Ping

-----Original Message-----
From: users [mailto:users-***@open-mpi.org] On Behalf Of Jeff Squyres (jsquyres)
Sent: Wednesday, June 1, 2016 15:02
To: Open MPI User's List
Subject: Re: [OMPI users] Firewall settings for MPI communication

In addition, you might want to consider upgrading to Open MPI v1.10.x (v1.6.x is fairly ancient).
--
Jeff Squyres
***@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/

_______________________________________________
users mailing list
***@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/06/29342.php
Gilles Gouaillardet
2016-06-07 11:18:13 UTC
Permalink
You might want to specify a wider range of ports.
Depending on how the socket is closed, a given port might or might not be
available right after a job completes (it can linger in the TIME_WAIT state).
IIRC, with default TCP settings, the worst case is a few minutes.
I will double-check that sockets are created with SO_REUSEADDR, since this
might help mitigate this kind of issue.
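
As a sketch of what a wider range could look like, reusing the parameters from the command quoted below: the value 100 is an arbitrary choice for illustration, last_port is a made-up helper, and I am assuming the range counts ports upward from the minimum (please confirm with ompi_info --all). It only prints the firewall range and the mpirun command, it does not launch anything.

```shell
# Hypothetical helper: last port of a range, given its first port and size.
last_port() {
    echo $(( $1 + $2 - 1 ))
}

BTL_MIN=40040    # first port for the tcp btl (value from the thread)
BTL_RANGE=100    # assumption: widened from 10 to 100 ports
OOB_PORT=40020   # static oob port (value from the thread)

# Ports that would have to be open in the security group.
echo "open tcp ports: $OOB_PORT and $BTL_MIN-$(last_port "$BTL_MIN" "$BTL_RANGE")"

# Dry run: print the matching mpirun command.
echo "mpirun --mca btl_tcp_port_min_v4 $BTL_MIN --mca btl_tcp_port_range_v4 $BTL_RANGE --mca oob_tcp_static_ipv4_ports $OOB_PORT --host <IP1>,<IP2> hostname"
```

Keeping the numbers in variables makes it harder for the firewall rule and the mpirun options to drift apart.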

Cheers,

Gilles
Post by Ping Wang
Hello all,
after correcting the configuration, mpirun (v1.10.2) works fine when all TCP
ports are open. I can ssh to all hosts without a password.
That brings me back to my first question: how do I specify the ports for MPI communication?
When I run
mpirun --mca btl_tcp_port_min_v4 40040 --mca btl_tcp_port_range_v4 10 --mca oob_tcp_static_ipv4_ports 40020 --host <IP1>,<IP2> hostname
it works, but not every time. The same happens when I run
mpirun --mca oob_tcp_static_ipv4_ports 40020 --host <IP1>,<IP2> hostname
Strangely, sometimes I get output and sometimes it just hangs.
Did I miss something?
Best,
Ping
*From:* Gilles Gouaillardet
*Sent:* Friday, June 3, 2016 00:14
*To:* Open MPI Users
*Subject:* Re: [OMPI users] Firewall settings for MPI communication
The syntax is
configure --enable-mpirun-prefix-by-default --prefix=<path to OpenMPI> ...
all hosts must be able to ssh each other passwordless.
that means you need to generate a user ssh key pair on all hosts, add your
public keys to the list of authorized keys, and ssh to all hosts in order
to populate your known hosts
(ssh requires you confirm host public keys the very first time you ssh to a new host)
iirc, that can be automated with ssh-keyscan.
when ssh is fully configured, mpirun should work just fine
Cheers,
Gilles
_______________________________________________
users mailing list
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
http://www.open-mpi.org/community/lists/users/2016/06/29349.php