Post by Marlborough, Rick
Gilles;
The abort occurs somewhere between 30 and 60 seconds. Is
there some configuration setting that could influence this?
Rick
Behalf Of *Gilles Gouaillardet*
*Sent:* Tuesday, October 04, 2016 8:39 AM
*To:* Open MPI Users
*Subject:* Re: [OMPI users] problems with client server scenario using
MPI_Comm_connect
Rick,
How long does it take before the test fails?
There was a bug that caused a failure if no connection was received within
2 (3?) seconds, but I think it was fixed in v2.0.1.
That being said, you might want to try a nightly snapshot of the v2.0.x branch
Cheers,
Gilles
Gilles;
Here is the client side code. The start command is "mpirun
-n 1 client 10", where 10 is used to size a buffer.
int numtasks, rank, dest, source, rc, count, tag = 1;
int bufsize = 0;
int num_procs = 0;
MPI_Init(&argc, &argv);
if(argc > 1)
{
    bufsize = atoi(argv[1]);
}
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm server;
if(1)
{
    char port_name[MPI_MAX_PORT_NAME + 1];
    std::ifstream file("./portfile");
    file.getline(port_name, MPI_MAX_PORT_NAME);
    file.close();
    // Lookup_name does not work.
    // MPI_Lookup_name("test_service", MPI_INFO_NULL, port_name);
    std::cout << "Established port name is " << port_name << std::endl;
    MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &server);
    MPI_Comm_remote_size(server, &num_procs);
    std::cout << "Number of running processes is " << num_procs << std::endl;
    MPI_Finalize();
    exit(0);
}
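One thing worth checking in the client above: the port string read back from ./portfile must match what the server published exactly, and a stray trailing newline or blank picked up while reading the file can make MPI_Comm_connect hang until it times out. Here is a small sketch of defensive trimming; trim_port_name is a made-up helper name, not part of the code above:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not in the original post): strip leading and
// trailing whitespace from a port string read from ./portfile.  A port
// name that picks up a stray newline or blank will not match what the
// server published, and the connect can then stall.
std::string trim_port_name(const std::string& raw)
{
    const std::string ws = " \t\r\n";
    std::string::size_type first = raw.find_first_not_of(ws);
    if (first == std::string::npos)
        return "";  // empty or all-whitespace line
    std::string::size_type last = raw.find_last_not_of(ws);
    return raw.substr(first, last - first + 1);
}
```

In the client above, the line produced by file.getline could be passed through such a helper before being handed to MPI_Comm_connect.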
Here is the server code. This is started on a different machine. The
command line is "mpirun -n 1 sendrec 10", where 10 is used to size a buffer.
int numtasks, rank, dest, source, rc, count, tag = 1;
int bufsize = 0;
int mpi_error;
MPI_Init(&argc, &argv);
if(argc > 1)
{
    bufsize = atoi(argv[1]);
}
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm remote_clients;
MPI_Info pub_global;
std::cout << "This process rank is " << rank << std::endl;
std::cout << "Number of current processes is " << numtasks << std::endl;
char port_name[MPI_MAX_PORT_NAME];
mpi_error = MPI_Open_port(MPI_INFO_NULL, port_name);
MPI_Info_create(&pub_global);
MPI_Info_set(pub_global, "ompi_global_scope", "true");
mpi_error = MPI_Publish_name("test_service", pub_global, port_name);
if(mpi_error)
{
    ...
}
std::cout << "Established port name is " << port_name << std::endl;
std::ofstream file("./portfile", std::ofstream::trunc);
file << port_name;
file.close();
MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &remote_clients);
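Since the client runs on a different machine and reads ./portfile on its own schedule, another thing worth ruling out (an illustration, not part of the code above) is the client observing a partially written file. Writing to a temporary name and renaming it into place makes the publish step atomic on POSIX filesystems; publish_port_file is a made-up name for this sketch:

```cpp
#include <cassert>
#include <cstdio>    // std::rename
#include <fstream>
#include <string>

// Hypothetical helper (not in the original post): write the port name
// to a temporary file, then rename it into place.  rename() is atomic
// on POSIX filesystems, so a reader sees either the old file, no file,
// or the complete new contents -- never a half-written port string.
bool publish_port_file(const std::string& port_name, const std::string& path)
{
    std::string tmp = path + ".tmp";
    std::ofstream file(tmp.c_str(), std::ofstream::trunc);
    file << port_name;
    file.close();
    if (!file)
        return false;  // the write itself failed
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}
```

Note that this atomicity guarantee applies to local POSIX filesystems; over NFS or another shared filesystem the visibility of the rename to the client's machine can still lag.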
The server error looks like this…
The client error looks like this…
Thanks
Rick
Gouaillardet
*Sent:* Tuesday, October 04, 2016 7:13 AM
*To:* Open MPI Users
*Subject:* Re: [OMPI users] problems with client server scenario using
MPI_Comm_connect
Rick,
I do not think ompi_server is required here.
Can you please post trimmed versions of your client and server, and your
two mpirun command lines?
You also need to make sure all ranks pass the same root parameter when
invoking MPI_Comm_accept and MPI_Comm_connect.
Cheers,
Gilles
Folks;
I have been trying to get a test case up and running using
a client server scenario with a server waiting on MPI_Comm_accept and the
client trying to connect via MPI_Comm_connect. The port value is written to
a file. The client opens the file and reads the port value. I run the
server, followed by the client. They both appear to sit there for a time,
but eventually they both time out and abort. They are running on
separate machines. All other communication between these 2 machines
appears to be OK. Is there some intermediate service that needs to be run?
I am using Open MPI v2.0.1 on 64-bit Red Hat Linux 6.5 on a 1-gigabit
network.
Thanks
Rick