Florian Lindner
2018-04-13 13:34:34 UTC
Hello,
I have this piece of code:
PtrRequest MPICommunication::aSend(double *itemsToSend, int size, int rankReceiver)
{
  rankReceiver = rankReceiver - _rankOffset;
  int comsize = -1;
  MPI_Comm_size(communicator(rankReceiver), &comsize);
  TRACE(size, rank(rankReceiver), comsize);
  MPI_Request request;
  MPI_Isend(itemsToSend,
            size,
            MPI_DOUBLE,
            rank(rankReceiver),
            0,
            communicator(rankReceiver),
            &request);
  return PtrRequest(new MPIRequest(request));
}
While there are quite a few calls you don't know, it's basically a wrapper around MPI_Isend.
The communicator returned by communicator(rankReceiver) is an inter-communicator!
The TRACE call prints:
[1,1]<stdout>:(1) 14:30:04 [com::MPICommunication]:104 in aSend: Entering aSend
[1,1]<stdout>: Argument 0: size == 50
[1,1]<stdout>: Argument 1: rank(rankReceiver) == 1
[1,1]<stdout>: Argument 2: comsize == 2
[1,0]<stdout>:(0) 14:30:04 [com::MPICommunication]:104 in aSend: Entering aSend
[1,0]<stdout>: Argument 0: size == 48
[1,0]<stdout>: Argument 1: rank(rankReceiver) == 0
[1,0]<stdout>: Argument 2: comsize == 2
So, on rank 1 we send to rank = 1 on a communicator with size = 2.
Still, rank 1 crashes with:
[neon:80361] *** An error occurred in MPI_Isend
[neon:80361] *** reported by process [1052966913,1]
[neon:80361] *** on communicator
[neon:80361] *** MPI_ERR_RANK: invalid rank
[neon:80361] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[neon:80361] *** and potentially your MPI job)
Your colleagues from MPICH print:
[1] Fatal error in PMPI_Isend: Invalid rank, error stack:
[1] PMPI_Isend(149): MPI_Isend(buf=0x560ddeb02100, count=49, MPI_DOUBLE, dest=1, tag=0, comm=0x84000005, request=0x7ffd528989c0) failed
[1] PMPI_Isend(97).: Invalid rank has value 1 but must be nonnegative and less than 1
[0] Fatal error in PMPI_Isend: Invalid rank, error stack:
[0] PMPI_Isend(149): MPI_Isend(buf=0x564b74c9edd8, count=1, MPI_DOUBLE, dest=1, tag=0, comm=0x84000006, request=0x7ffe5848d9f0) failed
[0] PMPI_Isend(97).: Invalid rank has value 1 but must be nonnegative and less than 1
With MPICH, MPI_Comm_size also returns 2.
Do you have any idea where to look to find out what is going wrong here, especially given that the communicator is an inter-communicator?
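One thing that may be worth checking: on an inter-communicator, MPI_Comm_size reports the size of the local group, whereas the destination rank passed to MPI_Isend refers to the remote group, whose size is reported by MPI_Comm_remote_size. The MPICH message ("must be nonnegative and less than 1") would be consistent with a remote group of size 1. Below is a minimal sketch of that rule in plain C++, without any MPI calls. The helper names destinationGroupSize and isValidDestination are hypothetical, and the three inputs are assumed to come from MPI_Comm_test_inter, MPI_Comm_size and MPI_Comm_remote_size.

```cpp
// Sketch only: models which group size bounds the destination rank of a send.
// Assumed inputs (not queried here):
//   isInter    - flag as reported by MPI_Comm_test_inter
//   localSize  - value as reported by MPI_Comm_size
//   remoteSize - value as reported by MPI_Comm_remote_size
//                (only meaningful when isInter is true)

// On an intra-communicator the destination is a rank in the local group;
// on an inter-communicator it is a rank in the remote group.
int destinationGroupSize(bool isInter, int localSize, int remoteSize)
{
  return isInter ? remoteSize : localSize;
}

// True if dest is an ordinary rank in the group a send would address.
// Special values such as MPI_PROC_NULL are deliberately not handled here.
bool isValidDestination(int dest, bool isInter, int localSize, int remoteSize)
{
  return dest >= 0 && dest < destinationGroupSize(isInter, localSize, remoteSize);
}
```

With the values from the trace above (local size 2, destination 1) and an assumed remote group of size 1, isValidDestination(1, true, 2, 1) is false, which would match the MPI_ERR_RANK.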
Thanks and best regards,
Florian