[OMPI users] Enforce TCP with mpirun
Maksym Planeta
2017-08-16 11:57:52 UTC

I work with an InfiniBand cluster, but I want to force Open MPI to use
a specific network interface.

I tried to do this for example using mpirun as follows:

mpirun --map-by node --mca btl self,tcp -np 16 bin/is.C.16

But counters returned by /usr/sbin/perfquery still keep showing that
transmission happens over ibverbs.

There is an ipoib module on the machine, but the counters there (ip -s link)
indicate that the ib0 interface is not used.
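
One way to make the "is ib0 used?" check repeatable is to read the kernel byte counters before and after a run. The following is only a sketch, assuming a Linux node where per-interface counters are exposed under /sys/class/net; the helper name iface_bytes and its optional second argument (an alternative sysfs root) are hypothetical. Note that these counters only cover IP traffic on the interface, so native verbs traffic is not expected to show up here.

```shell
# Hypothetical helper: print rx+tx bytes for a network interface.
# $1 = interface name (e.g. ib0), $2 = optional sysfs root (assumption,
# defaults to /sys/class/net).
iface_bytes() {
    root=${2:-/sys/class/net}
    rx=$(cat "$root/$1/statistics/rx_bytes")
    tx=$(cat "$root/$1/statistics/tx_bytes")
    echo $(( rx + tx ))
}

# Possible usage around a run (left commented out, it needs ib0 and mpirun):
# before=$(iface_bytes ib0)
# mpirun --map-by node --mca btl self,tcp -np 16 bin/is.C.16
# after=$(iface_bytes ib0)
# echo "bytes moved over ib0: $(( after - before ))"
```

A non-zero difference would suggest that IP traffic went over ib0 during the run.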

Could you help me to figure out how to properly tell OpenMPI not to use

I tried with Open MPI 2.1.0 and 1.10.2, but saw no difference in behavior.
Maksym Planeta
Gilles Gouaillardet
2017-08-16 12:34:51 UTC

You can try
mpirun --mca pml ob1 --mca btl tcp,self ...

pml/cm has a higher priority than pml/ob1, so if you have an MTL that
fits your network (such as mtl/mxm),
then pml/ob1 will be ignored, and the list of allowed/excluded BTLs
becomes irrelevant.
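
The same selection can also be expressed through environment variables, since Open MPI reads MCA parameters from variables named OMPI_MCA_<parameter>. This is a minimal sketch, assuming a POSIX shell; the benchmark path is the one from the original post and the mpirun line is left commented out.

```shell
# Pin the PML to ob1 so that the BTL list is actually honoured,
# then restrict the BTLs to tcp and self.
export OMPI_MCA_pml=ob1
export OMPI_MCA_btl=tcp,self

# To see which pml/mtl/btl components this installation provides:
# ompi_info | grep -E "MCA (pml|mtl|btl)"

# The launch itself then needs no --mca flags:
# mpirun -np 16 bin/is.C.16
```

Setting the variables once can be convenient when several benchmarks are run from the same shell.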



Maksym Planeta
2017-08-16 13:54:29 UTC
Dear Gilles,

thank you for the quick response.

pml/cm doesn't work at all

When I use "--mca pml ob1" I still see traffic in /usr/sbin/perfquery, but the program runs a lot slower. E.g. the is.C.64 benchmark runs for 33 seconds, in contrast to less than 1.

I also see many of the following warning messages when I use PML ob1:

[<hostname>:25220] mca_base_component_repository_open: unable to open mca_coll_hcoll: libmxm.so.2: cannot open shared object file: No such file or directory (ignored)

But at least I see traffic on the ib0 interface. So I basically achieved the goal from the original mail.

Nevertheless, I wanted to try to avoid IB completely, so I added "--mca btl_tcp_if_exclude ib0". But now the program cannot run and fails with messages like this:

[<hostname>][[42657,1],23][btl_tcp_endpoint.c:649:mca_btl_tcp_endpoint_recv_connect_ack] received unexpected process identifier [[42657,1],1]

I see the same message when I also use this flag "--mca btl_tcp_if_include lo,ib0"

Here is the full command:

$ mpirun --mca pml ob1 --mca btl self,tcp --mca btl_tcp_if_include lo,ib0 -np 64 bin/is.C.64

I also tried to use "--mca btl_tcp_if_include lo,eth0", but MPI started complaining with this message:

[<hostname>][[40797,1],58][btl_tcp_component.c:706:mca_btl_tcp_component_create_instances] invalid interface "eth0"

I do have this interface:

$ ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether 08:00:38:3c:4e:65 brd ff:ff:ff:ff:ff:ff
inet brd scope global eth0
inet6 fe80::a00:38ff:fe3c:4e65/64 scope link
valid_lft forever preferred_lft forever

Can I somehow tell MPI to use eth0 and not ib0?
Gilles Gouaillardet
2017-08-16 14:05:39 UTC
My bad, I forgot that btl/tcp uses all the interfaces by default (eth0 *and* ib0).

Is eth0 available on all your nodes, or just on the node running mpirun?

You can try to use a subnet instead of an interface name:
mpirun --mca btl_tcp_if_include ...

If you are still facing issues, you can run
mpirun ... --mca btl_base_verbose 100 ...
in order to collect (a lot of) logs.
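
For the subnet suggestion, the value passed to btl_tcp_if_include can be written in CIDR notation. Below is a sketch of how such a value could be derived from the address that `ip` reports; the helper name cidr_network is hypothetical, the addresses are made-up examples (the real ones are not shown in this thread), and a shell with 64-bit arithmetic (bash or dash) is assumed.

```shell
# Hypothetical helper: turn "ADDR/PREFIX" (as printed by `ip -o -4 addr show`)
# into the network in CIDR notation, e.g. 10.1.2.3/20 -> 10.1.0.0/20.
cidr_network() {
    addr=${1%/*}      # part before the slash
    prefix=${1#*/}    # part after the slash
    old_ifs=$IFS
    IFS=.
    set -- $addr      # split the dotted quad into $1..$4
    IFS=$old_ifs
    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    net=$(( ip & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))/$prefix"
}

# Possible usage (left commented out, it needs eth0 and mpirun):
# net=$(cidr_network "$(ip -o -4 addr show dev eth0 | awk '{print $4}')")
# mpirun --mca pml ob1 --mca btl tcp,self --mca btl_tcp_if_include "$net" -np 64 bin/is.C.64
```

Selecting by subnet rather than by name may help when the interface is named differently on some nodes, which is what the question about eth0 being available on all nodes is probing.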



Maksym Planeta
2017-08-17 15:59:34 UTC
Thanks, it worked out.