Discussion:
[OMPI users] Persistent message hangs when sending and receiving large data
Quang Ha
2017-07-01 20:22:53 UTC
Hi MPI-users,

I am having some trouble with persistent communication requests. I am
trying to implement some form of persistent communication, but the
following code keeps hanging. I guess I must have introduced a deadlock,
but I can't really wrap my head around it...

MPI_Request r[2];
MPI_Request s[2];
int num_send = 1000;
[...]
MPI_Send_init(&Arr[1][1], num_send, MPI_DOUBLE, 1, A, MPI_COMM_WORLD, &s[0]);
MPI_Recv_init(&Arr[1][0], num_send, MPI_DOUBLE, 0, A, MPI_COMM_WORLD, &r[0]);

MPI_Send_init(&Arr[2][1], num_send, MPI_DOUBLE, 0, B, MPI_COMM_WORLD, &s[1]);
MPI_Recv_init(&Arr[2][0], num_send, MPI_DOUBLE, 1, B, MPI_COMM_WORLD, &r[1]);
[...]
MPI_Startall(2, r);
MPI_Waitall(2, r, MPI_STATUSES_IGNORE);


This works fine as long as num_send is reasonably small. Once it reaches
something like 10,000 or 50,000, the code just hangs there without making
any progress.

Is this behaviour expected? Could I have some explanation for it, please?

Many thanks,
Quang
George Bosilca
2017-07-03 14:26:07 UTC
Quang,

You do start the persistent send requests as well, right? Assuming this is
the case, that means you will start about 100K requests in one go. One of
the problems might be the perception that the process is stuck when in
fact it is simply progressing extremely slowly. Indeed, MPI_Startall is
linear in the number of requests, and depending on the order in which you
start them you might see behavior you do not expect, but that is normal
once you understand the linear nature of the operation.

If you start the 50K sends first, you will end up with 50K unexpected
messages. If, on the other hand, you start the receives first, all the
initial matching attempts will fail, and then when the sends arrive you
will potentially have to do another linear search to identify the correct
matching receive.
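
To get a feel for the matching cost described above, here is a toy model in
plain C. It is only a sketch under stated assumptions: it is not Open MPI's
actual matching code, the function name count_match_comparisons is made up
for illustration, and it assumes every posted receive has a distinct tag and
that the posted-receive queue is scanned from the front for each arriving
message. It counts tag comparisons when messages arrive in the same order
the receives were posted versus the reverse order.

```c
#include <stdlib.h>

/* Toy model (hypothetical, not Open MPI internals): n receives are posted
 * with tags 0..n-1, in that order, on a singly linked queue. Each arriving
 * message scans the queue from the head until its tag matches, and the
 * matched receive is then unlinked. Returns the total number of tag
 * comparisons, or -1 if allocation fails. */
long long count_match_comparisons(int n, int reverse_arrival)
{
    if (n <= 0)
        return 0;

    /* next[i] is the queue entry after receive i, or -1 at the tail.
     * Entry i carries tag i, so no separate tag array is needed. */
    int *next = malloc((size_t)n * sizeof *next);
    if (next == NULL)
        return -1;
    for (int i = 0; i < n; i++)
        next[i] = (i + 1 < n) ? i + 1 : -1;

    int head = 0;
    long long comparisons = 0;

    for (int m = 0; m < n; m++) {
        /* Tag of the m-th arriving message. */
        int tag = reverse_arrival ? n - 1 - m : m;

        int prev = -1;
        for (int cur = head; cur != -1; prev = cur, cur = next[cur]) {
            comparisons++;
            if (cur == tag) {
                /* Match found: unlink the receive from the queue. */
                if (prev == -1)
                    head = next[cur];
                else
                    next[prev] = next[cur];
                break;
            }
        }
    }

    free(next);
    return comparisons;
}
```

In this model, calling count_match_comparisons(n, 0) costs n comparisons in
total, because every message matches at the head of the queue, while
count_match_comparisons(n, 1) costs n(n+1)/2, because the first message has
to walk the whole queue. For n = 50,000 that is 50,000 versus roughly 1.25
billion comparisons, which is how a run can look stuck while it is really
just progressing slowly.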

Can you provide a reproducer for this issue?

George.