Discussion:
[OMPI users] Using openmpi on multiple computers
Mahdi, Sam
2017-02-14 05:51:59 UTC
Hello everyone,

I am attempting to run a program on multiple computers; I am currently
testing it on two. I have already set up an SSH key between my computer and
the computer I want to connect to. I have also created a hosts file on my
computer with the IP of the computer I want to connect to. When I attempt
to connect through the mpirun -hosts command or mpirun -f hosts, I receive
this error:
[proxy:0:***@localhost.localdomain] HYDU_sock_connect
(./utils/sock/sock.c:174): unable to connect from "localhost.localdomain"
to "localhost.localdomain" (Connection refused)
[proxy:0:***@localhost.localdomain] main (./pm/pmiserv/pmip.c:189): unable to
connect to server localhost.localdomain at port 57437 (check for firewalls!)

It appears that "localhost.localdomain" is attempting to connect to another
"localhost.localdomain". How do I make my computer connect from its own IP
to the other computer at its IP?

I have tried creating a hosts file with the IP of the other computer and
then running mpiexec -f hosts -n 4 ./app1, but got the same error output.

I have also tried typing the IP in directly with the command mpirun -np 3
--host a,b,c hostname, where I put the IP of the other computer in place of
a. I still got the same error.
Gilles Gouaillardet
2017-02-14 05:57:56 UTC
Hi,


this error message does not come from Open MPI, but from MPICH or one of
its derivatives (such as Intel MPI).
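
One way to confirm which launcher is actually being picked up is to look at its version banner. The sketch below is only illustrative: `classify_launcher` is a hypothetical helper, not part of any MPI distribution, and the strings it matches are assumptions about what the launchers typically print (Open MPI's mpirun normally mentions "Open MPI", MPICH's Hydra launcher normally prints "HYDRA build details").

```shell
#!/bin/sh
# Hypothetical helper: guess the MPI implementation from a launcher's
# version banner, read on stdin. The matched strings are assumptions
# about typical banners and may differ between releases.
classify_launcher() {
  banner=$(cat)
  case "$banner" in
    *"Open MPI"*)     echo "openmpi" ;;
    *HYDRA*|*Hydra*)  echo "mpich-hydra" ;;
    *"Intel(R) MPI"*) echo "intel-mpi" ;;
    *)                echo "unknown" ;;
  esac
}

# Usage on each machine (left commented so the sketch runs without MPI):
#   command -v mpirun                          # which mpirun PATH resolves to
#   mpirun --version 2>&1 | classify_launcher
```

If this reports anything other than openmpi, the mpirun found first in PATH is probably not the Open MPI one. Note also that -f hosts is the Hydra way of passing a host file; Open MPI's mpirun takes --hostfile instead.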


you might want to check your environment (PATH, LD_LIBRARY_PATH), and
make sure Open MPI is the MPI being used, including when commands are run
over SSH.

for example, you can run:


ldd a.out

ssh node ldd a.out


there should be no reference to a non-Open MPI MPI library
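
The ldd check above could be scripted roughly as follows. This is a minimal sketch, not an official tool: `uses_openmpi` is a hypothetical helper, and it assumes a dynamically linked Open MPI application pulls in Open MPI's support library libopen-pal, which is how Open MPI builds are usually linked.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and report whether it
# references libopen-pal. Treating libopen-pal as the marker for Open MPI
# is an assumption about how the application was linked.
uses_openmpi() {
  if grep -q 'libopen-pal'; then
    echo "openmpi"
  else
    echo "not-openmpi"
  fi
}

# Usage (left commented so the sketch runs without MPI installed):
#   ldd a.out | uses_openmpi
#   ssh node ldd a.out | uses_openmpi
```

A statically linked binary would show up as not-openmpi here even if it was built with Open MPI, so a negative result is a hint to look closer rather than proof.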


Cheers,


Gilles