If I might interject here before lots of time is wasted. Spectrum MPI is an IBM -product- and is not free. What you are likely running into is that their license manager is blocking you from running, albeit without a really nice error message. I'm sure that's something they are working on.
If you really want to use Spectrum MPI, I suggest you contact them about purchasing it.
Post by Gilles Gouaillardet
Gabriele,
can you run
mpirun --mca btl_base_verbose 100 -np 2 ...
so we can figure out why neither sm nor vader is used?
Cheers,
Gilles
findActiveDevices Error
We found no active IB device ports
findActiveDevices Error
We found no active IB device ports
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
Process 1 ([[12380,1],0]) is on host: openpower
Process 2 ([[12380,1],1]) is on host: openpower
BTLs attempted: self
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
--------------------------------------------------------------------------
MPI_INIT has failed because at least one MPI process is unreachable
from another. This *usually* means that an underlying communication
plugin -- such as a BTL or an MTL -- has either not loaded or not
allowed itself to be used. Your MPI job will now abort.
You may wish to try to narrow down the problem;
* Check the output of ompi_info to see which BTL/MTL plugins are
available.
* Run your application with MPI_THREAD_SINGLE.
* Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose,
if using MTL-based communications) to see exactly which
communication plugins were considered and/or discarded.
--------------------------------------------------------------------------
[openpower:88867] 1 more process has sent help message help-mca-bml-r2.txt / unreachable proc
[openpower:88867] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[openpower:88867] 1 more process has sent help message help-mpi-runtime.txt / mpi_init:startup:pml-add-procs-fail
Hi Gilles,
findActiveDevices Error
We found no active IB device ports
Hello world from rank 0 out of 1 processors
So it seems to work, apart from the error message.
Gabriele,
so it seems pml/pami assumes there is an infiniband card available (!)
i guess IBM folks will comment on that shortly.
meanwhile, you do not need pami since you are running on a single node
mpirun --mca pml ^pami ...
should do the trick
(if it does not work, can you run the following and post the logs?)
mpirun --mca pml ^pami --mca pml_base_verbose 100 ...
Cheers,
Gilles
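The workaround above can be sketched as a small shell helper. This is only a minimal sketch and not part of the original thread: the function name `mpirun_without_pml`, the program name `./a.out`, and the `-np 2` process count are assumptions for illustration, while the `--mca pml ^pami` and `--mca pml_base_verbose 100` options are the ones quoted in the message. The helper only prints the command line, so it can be reviewed before anything is launched.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print an mpirun command line
# that excludes one PML component and turns on PML selection logging.
mpirun_without_pml() {
    excluded="$1"   # component to exclude, e.g. pami
    shift
    # "^name" asks the MCA framework to use any component except "name".
    printf 'mpirun --mca pml ^%s --mca pml_base_verbose 100' "$excluded"
    # Append the remaining arguments (process count, program, ...) unchanged.
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Print the command for a two-process run of ./a.out (assumed program name).
mpirun_without_pml pami -np 2 ./a.out
```

Once the printed line looks right, it can be pasted into the shell and the verbose output posted back to the list.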
Hi John,
Infiniband is not used, there is a single node on this machine.
2017-05-19 8:50 GMT+02:00 John Hearns via users
Gabriele, please run 'ibv_devinfo'.
It looks to me like you may have the physical interface cards in these systems, but you do not have the correct drivers or libraries loaded.
I have had similar messages when using Infiniband on x86 systems which did not have libibverbs installed.
On 19 May 2017 at 08:41, Gabriele Fatigati
registering framework pml components
found loaded component pami
[openpower:88536] mca: base: components_register: component pami register function successful
opening pml components
[openpower:88536] mca: base: components_open: found loaded component pami
component pami open function successful
[openpower:88536] select: initializing pml component pami
findActiveDevices Error
We found no active IB device ports
[openpower:88536] select: init returned failure for component pami
[openpower:88536] PML pami cannot be selected
--------------------------------------------------------------------------
No components were able to be opened in the pml framework.
This typically means that either no components of this type were
installed, or none of the installed componnets can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.
Host: openpower
Framework: pml
--------------------------------------------------------------------------
2017-05-19 7:03 GMT+02:00 Gilles Gouaillardet
Gabriele,
pml/pami is here, at least according to ompi_info
can you update your mpirun command like this
mpirun --mca pml_base_verbose 100 ..
and post the output ?
Cheers,
Gilles
Hi Gilles, attached the requested info
2017-05-18 15:04 GMT+02:00 Gilles Gouaillardet
Gabriele,
can you run
ompi_info --all | grep pml
also, make sure there is nothing in your environment pointing to another Open MPI install
for example
ldd a.out
should only point to IBM libraries
Cheers,
Gilles
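The `ldd` check above could be automated roughly as follows. This is a minimal sketch under assumptions, not something from the original thread: the function name `foreign_mpi_libs`, the `libmpi` name pattern, and the install prefix passed as the argument are all illustrative and need adjusting to the actual installation.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read `ldd` output on stdin and
# print every MPI library line that does not resolve under the given prefix.
foreign_mpi_libs() {
    prefix="$1"   # expected install prefix (an assumption; set it for your site)
    # Keep only lines naming an MPI library, then drop those resolved under
    # the expected prefix. `|| true` keeps the exit status 0 when nothing is left.
    grep 'libmpi' | grep -v -F -- "=> $prefix/" || true
}

# Usage (assumed program name and prefix):
#   ldd ./a.out | foreign_mpi_libs /opt/ibm/spectrum_mpi
```

Empty output suggests that every matching library resolves under the expected prefix; any printed line is a candidate for a second MPI install leaking in through the environment.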
On Thursday, May 18, 2017, Gabriele Fatigati
Dear OpenMPI users and developers,
I'm using IBM Spectrum MPI 10.1.0 based on OpenMPI, so I hope there are some MPI experts who can help me to solve the problem.
When I run a simple Hello World MPI program, I get the following error:
A requested component was not found, or was unable to be opened. This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded). Note that
Open MPI stopped checking at the first component that it did not find.
Host: openpower
Framework: pml
Component: pami
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
mca_pml_base_open() failed
--> Returned "Not found" (-13) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
My sysadmin used official IBM Spectrum packages to install MPI, so it's quite strange that there are some components missing (pami). Any help? Thanks
-- Ing. Gabriele Fatigati
HPC specialist
SuperComputing Applications and Innovation Department
Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
www.cineca.it Tel: +39 051 6171722
g.fatigati [AT] cineca.it
_______________________________________________
users mailing list
https://rfd.newmexicoconsortium.org/mailman/listinfo/users