Diego Avesani
2018-08-03 17:23:47 UTC
Dear all,
I am experiencing a strange error.
In my code I use three communicators:
MPI_COMM_WORLD
MPI_MASTERS_COMM
LOCAL_COMM
which have some processes in common.
When I run my code as
mpirun -np 4 --oversubscribe ./MPIHyperStrem
I have no problem, while when I run it as
mpirun -np 4 --oversubscribe ./MPIHyperStrem
sometimes it crashes and sometimes it does not.
It seems that it is all linked to
CALL MPI_REDUCE(QTS(tstep,:), QTS(tstep,:), nNode, MPI_DOUBLE_PRECISION, &
                MPI_SUM, 0, MPI_LOCAL_COMM, iErr)
which works when I run it locally.
What do you think? Can you suggest some debugging tests?
Is this a problem related to the local communicators?
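For what it is worth, the call above passes QTS(tstep,:) as both the send and the receive buffer, and the MPI standard forbids aliasing those two arguments in MPI_REDUCE; the sanctioned way to reduce into the same array is MPI_IN_PLACE on the root rank. Below is only a sketch of that variant, reusing the names from the snippet; myLocalRank is an assumed variable holding this process's rank in MPI_LOCAL_COMM:

```fortran
! Sketch: in-place reduction on the root of MPI_LOCAL_COMM.
! myLocalRank is assumed to come from MPI_COMM_RANK(MPI_LOCAL_COMM, ...).
IF (myLocalRank == 0) THEN
   ! Root: MPI_IN_PLACE as sendbuf, result lands in QTS(tstep,:)
   CALL MPI_REDUCE(MPI_IN_PLACE, QTS(tstep,:), nNode, &
                   MPI_DOUBLE_PRECISION, MPI_SUM, 0, MPI_LOCAL_COMM, iErr)
ELSE
   ! Non-root ranks: recvbuf is ignored, so only sendbuf matters here
   CALL MPI_REDUCE(QTS(tstep,:), QTS(tstep,:), nNode, &
                   MPI_DOUBLE_PRECISION, MPI_SUM, 0, MPI_LOCAL_COMM, iErr)
END IF
```

Buffer aliasing of this kind can appear to work on some runs and crash on others, which would be consistent with the intermittent behaviour described above.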
Thanks
Diego