Discussion:
[OMPI users] MPI_Sendrecv datatype memory bug?
Yann Jobic
2016-11-24 14:38:24 UTC
Hi all,

I'm going crazy over a possible bug in my code. I'm using a derived MPI
datatype in an MPI_Sendrecv call.
The problem is that the memory footprint of my code keeps growing as time
goes on.
The problem does not show up with a basic datatype such as MPI_DOUBLE.
I don't have this problem with Open MPI 1.8.4, but it is present in 1.10.1
and 2.0.1.

The key parts of the code are as follows (I'm using a 1D array with an
indexing macro so that it behaves as a 3D array):

Definition of the datatype:

/* Ny blocks of one MPI_DOUBLE each, with a stride of Nx elements */
MPI_Type_vector( Ny, 1, Nx, MPI_DOUBLE, &mpi.MPI_COL );
MPI_Type_commit( &mpi.MPI_COL );
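
That type describes one "column" of the array. Written as an explicit copy it
would look like the sketch below (assuming my indexing macro _(i,j,k) expands
to i + Nx*(j + Ny*k); pack_column is just an illustrative name):

/* same elements as mpi.MPI_COL: one double per j, Ny times, stride Nx */
static void pack_column(const double *field, double *buf,
                        int Nx, int Ny, int i, int k)
{
    for (int j = 0; j < Ny; j++)
        buf[j] = field[i + Nx * (j + Ny * k)];
}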

And the sendrecv part:

MPI_Sendrecv( &(thebigone[_(1,0,k)]),    1, mpi.MPI_COL, mpi.left,  3,
              &(thebigone[_(Nx-1,0,k)]), 1, mpi.MPI_COL, mpi.right, 3,
              mpi.com, &mpi.stat );

Is it coming from my code?

I have isolated the communications in a small test code (about 500 lines)
and can provide it to reproduce the problem.
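
In outline, the isolated code does something like the sketch below (simplified,
with made-up array sizes and a plain ring of neighbours; the real test case is
about 500 lines):

#include <mpi.h>
#include <stdlib.h>

/* assumed layout of the 3D indexing macro */
#define IDX(i, j, k) ((i) + Nx * ((j) + Ny * (k)))

int main(int argc, char **argv)
{
    const int Nx = 64, Ny = 64, Nz = 64, nsteps = 100000;
    int rank, size, left, right;
    MPI_Datatype col;
    MPI_Status stat;
    double *a;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    left  = (rank - 1 + size) % size;
    right = (rank + 1) % size;

    a = calloc((size_t)Nx * Ny * Nz, sizeof(double));

    /* one "column": Ny doubles, blocklength 1, stride Nx elements */
    MPI_Type_vector(Ny, 1, Nx, MPI_DOUBLE, &col);
    MPI_Type_commit(&col);

    for (int step = 0; step < nsteps; step++) {
        for (int k = 0; k < Nz; k++) {
            /* exchange one column per k-plane with the left/right neighbours */
            MPI_Sendrecv(&a[IDX(1, 0, k)],      1, col, left,  3,
                         &a[IDX(Nx - 1, 0, k)], 1, col, right, 3,
                         MPI_COMM_WORLD, &stat);
        }
    }

    MPI_Type_free(&col);
    free(a);
    MPI_Finalize();
    return 0;
}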

Thanks,

Yann


Gilles Gouaillardet
2016-11-25 10:51:48 UTC
Yann,

Please post the test case that demonstrates the issue.

What is the minimal configuration required to reproduce it
(e.g. number of nodes and tasks per node)?

If more than one node is involved, which interconnect are you using?
Out of curiosity, what happens if you run
mpirun --mca mpi_leave_pinned 0 ...
or
mpirun --mca btl tcp,self --mca pml ob1 ...
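
To compare the runs, a Linux-specific sketch like the one below (the helper
name is just an example) can report the resident set size from inside the
iteration loop:

#include <stdio.h>
#include <unistd.h>

/* current resident set size in MiB, read from /proc/self/statm
 * (second field = resident pages) */
static double rss_mib(void)
{
    long size_pages = 0, resident_pages = 0;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f == NULL)
        return -1.0;
    if (fscanf(f, "%ld %ld", &size_pages, &resident_pages) != 2)
        resident_pages = 0;
    fclose(f);
    return resident_pages * (double)sysconf(_SC_PAGESIZE) / (1024.0 * 1024.0);
}

Printing that value per rank every few hundred iterations makes any growth
easy to spot and to compare between the two invocations above.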


Cheers,

Gilles