Discussion:
[OMPI users] Invalid results with OpenMPI on Ubuntu Artful because of --enable-heterogeneous
Gilles Gouaillardet
2017-11-13 15:46:16 UTC
Xavier,

Thanks for the report, I will have a look at it.

Is the bug triggered by MPI_ANY_SOURCE?
For example, does it work if you call MPI_Irecv(..., myrank, ...) instead?


Unless Ubuntu wants out-of-the-box support for heterogeneous nodes
(for example x86_64 and ppc64),
there is little to no point in configuring Open MPI with the
--enable-heterogeneous option.


Cheers,

Gilles
Post by Xavier Besseron
Dear all,
I want to share with you the following issue with the OpenMPI shipped with the
latest Ubuntu Artful. It is OpenMPI 2.1.1 compiled with the
--enable-heterogeneous option.
Looking at this issue https://github.com/open-mpi/ompi/issues/171, it
appears that this option is broken and should not be used.
This option has been used in Debian/Ubuntu since 2010
(http://changelogs.ubuntu.com/changelogs/pool/universe/o/openmpi/openmpi_2.1.1-6/changelog)
and is still used today. Apparently, nobody has complained so far.
However, now I complain :-)
I've found a simple example for which this option causes invalid results in
OpenMPI.
int A = 666, B = 42;
MPI_Irecv(&A, 1, MPI_INT, MPI_ANY_SOURCE, tag, comm, &req);
MPI_Send(&B, 1, MPI_INT, my_rank, tag, comm);
MPI_Wait(&req, &status);
/* After that, when OpenMPI is configured with --enable-heterogeneous, A != B */
This happens with just a single process. The full example is attached
(to be run with "mpirun -n 1 ./bug_openmpi_artful").
I extracted and simplified the code from the Zoltan library with which I
initially noticed the issue.
I find it annoying that Ubuntu distributes a broken OpenMPI.
I've also tested OpenMPI 2.1.1, 2.1.2 and 3.0.0, and building with
--enable-heterogeneous triggers the bug every time.
The purpose of this message is:
- To share this small example with you in case you want to debug it.
- To ask about the status of issue https://github.com/open-mpi/ompi/issues/171.
Is this option still considered broken?
If yes, I encourage you to remove it or mark it as deprecated to avoid this
kind of mistake in the future.
- To get feedback from the OpenMPI developers on the use of this option, which
might convince the Debian/Ubuntu maintainer to remove this flag.
I have opened an Ubuntu bug report for it:
https://bugs.launchpad.net/ubuntu/+source/openmpi/+bug/1731938
Thanks!
Xavier
--
Dr Xavier BESSERON
Research associate
FSTC, University of Luxembourg
Campus Belval, Office MNO E04 0415-040
Phone: +352 46 66 44 5418
http://luxdem.uni.lu/
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
Gilles Gouaillardet
2017-11-13 18:56:07 UTC
Xavier,

I confirm there is a bug when using MPI_ANY_SOURCE with an Open MPI
configured with --enable-heterogeneous.

I made https://github.com/open-mpi/ompi/pull/4501 in order to fix
that, and will merge and backport it once reviewed.


Cheers,

Gilles

Xavier Besseron
2017-11-16 11:51:03 UTC
Thanks for looking at it!

Apparently, someone requested support for heterogeneous machines a long time
ago:
https://bugs.launchpad.net/ubuntu/+source/openmpi/+bug/419074


Xavier


