Discussion:
[OMPI users] OpenFabrics warning
Andrei Berceanu
2018-11-12 13:05:44 UTC
Hi all,

Running a CUDA+MPI application on a node with 2 K80 GPUs, I get the
following warnings:

--------------------------------------------------------------------------
WARNING: There is at least non-excluded one OpenFabrics device found,
but there are no active ports detected (or Open MPI was unable to use
them). This is most certainly not what you wanted. Check your
cables, subnet manager configuration, etc. The openib BTL will be
ignored for this job.

Local host: gpu01
--------------------------------------------------------------------------
[gpu01:107262] 1 more process has sent help message help-mpi-btl-openib.txt / no active ports found
[gpu01:107262] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

Any idea of what is going on and how I can fix this?
I am using OpenMPI 3.1.2.

Best,
Andrei
Michael Di Domenico
2018-11-12 14:28:54 UTC
looks like openmpi found something like an infiniband card in the
compute node you're using, but it is not active/usable

as for a fix, it depends.

if you have an IB card, should it be active? if so, you'd have to
check the connections to see why it's disabled

if not, you can tell openmpi to disregard the IB ports, which will
clear the warning, but that might mean you're using a slower
interface for message passing
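
To see whether the card actually has a live port before deciding, one option is to look at the port state reported by ibv_devinfo (part of the libibverbs utilities, which may not be installed on every node). A minimal sketch, assuming ibv_devinfo prints a "state: PORT_ACTIVE (4)" line for each live port; the helper name count_active_ports is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: counts lines mentioning PORT_ACTIVE in ibv_devinfo
# output supplied on stdin. Assumes one "state:" line per port.
count_active_ports() {
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c 'PORT_ACTIVE' || true
}

if command -v ibv_devinfo >/dev/null 2>&1; then
    active=$(ibv_devinfo 2>/dev/null | count_active_ports)
    echo "active InfiniBand ports: ${active}"
else
    echo "ibv_devinfo not found; skipping port check"
fi
```

A count of 0 on a node that is not cabled to a fabric would be consistent with the "no active ports" warning.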
Andrei Berceanu
2018-11-12 15:49:29 UTC
The node has an IB card, but it is a stand-alone node, disconnected from
the rest of the cluster.
I am using OMPI to communicate internally between the GPUs of this node
(and not between nodes).
So how can I disable the IB?
Gilles Gouaillardet
2018-11-12 16:31:33 UTC
Andrei,

you can run

mpirun --mca btl ^openib ...

to "disable" InfiniBand. The leading ^ tells Open MPI to exclude the listed BTL and keep the others available.


Cheers,

Gilles
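
The same setting can also be supplied through the environment, since Open MPI reads MCA parameters from variables named OMPI_MCA_<param>. A minimal sketch; ./my_app and the process count of 2 are placeholders, not taken from this thread:

```shell
#!/bin/sh
# Exclude the openib BTL for every mpirun started from this shell.
# Intended to be equivalent to passing "--mca btl ^openib" on the command line.
export OMPI_MCA_btl='^openib'

# ./my_app is a placeholder application name; launch only if both exist.
if command -v mpirun >/dev/null 2>&1 && [ -x ./my_app ]; then
    mpirun -np 2 ./my_app
fi
```

To make the setting permanent for one user, the line "btl = ^openib" can instead go in $HOME/.openmpi/mca-params.conf.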
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
Andrei Berceanu
2018-11-12 17:04:56 UTC
Problem solved, thank you!

Best,
Andrei
