Discussion:
[OMPI users] Invalid rank despite com size large enough
Florian Lindner
2018-04-13 13:34:34 UTC
Hello,

I have this piece of code

PtrRequest MPICommunication::aSend(double *itemsToSend, int size, int rankReceiver)
{
  rankReceiver = rankReceiver - _rankOffset;
  int comsize = -1;
  MPI_Comm_size(communicator(rankReceiver), &comsize);
  TRACE(size, rank(rankReceiver), comsize);

  MPI_Request request;
  MPI_Isend(itemsToSend,
            size,
            MPI_DOUBLE,
            rank(rankReceiver),
            0,
            communicator(rankReceiver),
            &request);

  return PtrRequest(new MPIRequest(request));
}

While there are quite a few calls you don't know, it's basically a wrapper around MPI_Isend.

The communicator returned by communicator(rankReceiver) is an inter-communicator!

The TRACE call prints:

[1,1]<stdout>:(1) 14:30:04 [com::MPICommunication]:104 in aSend: Entering aSend
[1,1]<stdout>: Argument 0: size == 50
[1,1]<stdout>: Argument 1: rank(rankReceiver) == 1
[1,1]<stdout>: Argument 2: comsize == 2
[1,0]<stdout>:(0) 14:30:04 [com::MPICommunication]:104 in aSend: Entering aSend
[1,0]<stdout>: Argument 0: size == 48
[1,0]<stdout>: Argument 1: rank(rankReceiver) == 0
[1,0]<stdout>: Argument 2: comsize == 2

So, on rank 1 we send to rank = 1 on a communicator with size = 2.

Still, rank 1 crashes with:

[neon:80361] *** An error occurred in MPI_Isend
[neon:80361] *** reported by process [1052966913,1]
[neon:80361] *** on communicator
[neon:80361] *** MPI_ERR_RANK: invalid rank
[neon:80361] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[neon:80361] *** and potentially your MPI job)

Your colleagues from MPICH print:

[1] Fatal error in PMPI_Isend: Invalid rank, error stack:
[1] PMPI_Isend(149): MPI_Isend(buf=0x560ddeb02100, count=49, MPI_DOUBLE, dest=1, tag=0, comm=0x84000005, request=0x7ffd528989c0) failed
[1] PMPI_Isend(97).: Invalid rank has value 1 but must be nonnegative and less than 1

[0] Fatal error in PMPI_Isend: Invalid rank, error stack:
[0] PMPI_Isend(149): MPI_Isend(buf=0x564b74c9edd8, count=1, MPI_DOUBLE, dest=1, tag=0, comm=0x84000006, request=0x7ffe5848d9f0) failed
[0] PMPI_Isend(97).: Invalid rank has value 1 but must be nonnegative and less than 1

but MPI_Comm_size also returns 2.

Do you have any idea where to look to find out what is going wrong here, especially given that the communicator is an inter-communicator?

Best, and thanks,
Florian
Nathan Hjelm
2018-04-13 13:41:20 UTC
Try using MPI_Comm_remotr_size. Since this is an inter-communicator, that will give you the number of ranks you can address with send/recv.
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
Nathan Hjelm
2018-04-13 13:41:53 UTC
Err. MPI_Comm_remote_size.
Florian Lindner
2018-04-13 14:05:21 UTC
Post by Nathan Hjelm
Err. MPI_Comm_remote_size.
Ah, thanks! I thought that MPI_Comm_size returned the number of remote ranks. MPI_Comm_remote_size returns 1, so now at least the error message is consistent.
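
For anyone finding this thread later, the rule can be sketched as a small check that does not link against MPI. On an inter-communicator, MPI_Comm_size reports the size of the local group, while the destination of a send is a rank in the remote group, whose size comes from MPI_Comm_remote_size. The names CommInfo, destinationCount, isValidDestination and checkedDestination are hypothetical helpers, not part of MPI or of the code above; the caller is assumed to fill the struct from MPI_Comm_test_inter, MPI_Comm_size and MPI_Comm_remote_size. The sketch also ignores MPI_PROC_NULL, which MPI accepts as a destination.

```cpp
#include <cassert>

// Hypothetical container for values queried from a communicator.
struct CommInfo {
    bool isInter;    // flag reported by MPI_Comm_test_inter
    int localSize;   // MPI_Comm_size: size of the local group
    int remoteSize;  // MPI_Comm_remote_size: size of the remote group
                     // (only meaningful for inter-communicators)
};

// Number of ranks that can be named as a send destination.
constexpr int destinationCount(const CommInfo &c) {
    return c.isInter ? c.remoteSize : c.localSize;
}

// True if dest is a usable destination rank (MPI_PROC_NULL not handled).
constexpr bool isValidDestination(const CommInfo &c, int dest) {
    return dest >= 0 && dest < destinationCount(c);
}

// Runtime guard a wrapper could call just before MPI_Isend.
inline int checkedDestination(const CommInfo &c, int dest) {
    assert(isValidDestination(c, dest) && "destination rank outside the addressable group");
    return dest;
}

// The situation from this thread: inter-communicator, MPI_Comm_size == 2,
// MPI_Comm_remote_size == 1, destination rank 1.
constexpr CommInfo threadCase{true, 2, 1};
static_assert(destinationCount(threadCase) == 1, "only one remote rank is addressable");
static_assert(!isValidDestination(threadCase, 1), "dest=1 is rejected, matching MPI_ERR_RANK");
static_assert(isValidDestination(threadCase, 0), "dest=0 would be accepted");
```

Tracing the value of destinationCount instead of the plain MPI_Comm_size result would have shown 1 rather than 2, which matches MPICH's "must be nonnegative and less than 1".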

Best,
Florian