Joseph Schuchart
2017-11-02 03:49:49 UTC
All,
I came across what I consider another issue regarding progress in Open
MPI: consider one process (P1) polling locally on a regular window (W1)
for a local value to change (using MPI_Win_lock+MPI_Get+MPI_Win_unlock)
while a second process (P2) tries to read from a memory location in a
dynamic window (W2) on process P1 (using MPI_Rget+MPI_Wait; other
combinations are affected as well). P2 will later update the memory location
waited on by P1. However, the read on the dynamic window stalls as the
(local) read on W1 on P1 does not trigger progress on the dynamic window
W2, causing the application to deadlock.
It is my understanding that process P1 should guarantee progress on any
communication it is involved in, regardless of the window or window
type, and thus the communication should succeed. Is this assumption
correct? Or is P1 required to access W2 as well to ensure progress? I
can trigger progress on W2 on P1 by adding a call to MPI_Iprobe, but that
seems like a hack to me. Also, if both W1 and W2 are regular (allocated)
windows, the communication succeeds.
I am attaching a small reproducer, tested with Open MPI release 3.0.0 on
a single GNU/Linux node (Linux Mint 18.2, gcc 5.4.1, Linux
4.10.0-38-generic).
Many thanks in advance!
Joseph
--
Dipl.-Inf. Joseph Schuchart
High Performance Computing Center Stuttgart (HLRS)
Nobelstr. 19
D-70569 Stuttgart
Tel.: +49(0)711-68565890
Fax: +49(0)711-6856832
E-Mail: ***@hlrs.de