Discussion: [OMPI users] False positives and even failure with OpenMPI and memchecker
Yvan Fournier
2016-11-05 19:22:41 UTC
Hello,

Yes, as I had hinted in my message, I observed the bug only intermittently.

Glad to see it could be fixed so quickly (it affects 2.0 too). I had observed it
for some time, but only recently took the time to make a proper simplified case
and investigate. Guess I should have submitted the issue sooner...

Best regards,

Yvan Fournier
Message: 5
Date: Sat, 5 Nov 2016 22:08:32 +0900
Subject: Re: [OMPI users] False positives and even failure with Open MPI and memchecker
That really looks like a bug.
If you rewrite your program with
  MPI_Sendrecv(&l, 1, MPI_INT, rank_next, tag, &l_prev, 1, MPI_INT,
               rank_prev, tag, MPI_COMM_WORLD, &status);
or even
  MPI_Irecv(&l_prev, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &req);
  MPI_Send(&l, 1, MPI_INT, rank_next, tag, MPI_COMM_WORLD);
  MPI_Wait(&req, &status);
then there is no more Valgrind warning.
IIRC, Open MPI marks the receive buffer as invalid memory so it can check
that only the MPI subroutine updates it. It looks like a step is missing
in the case of MPI_Recv().
Cheers,
Gilles
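(Open MPI's memchecker component is built on Valgrind's client requests from
<valgrind/memcheck.h>. The sketch below is purely illustrative and is not Open
MPI code; it only shows the kind of calls involved in marking a posted receive
buffer off-limits and then marking it defined again once the data has arrived.)

  /* Illustrative sketch only, not Open MPI code: the Valgrind client
   * requests a memchecker of this kind is built on. */
  #include <string.h>
  #include <valgrind/memcheck.h>

  static void mark_recv_posted(void *buf, size_t len)
  {
      /* While the receive is pending, any access by the application
         is reported by Memcheck as an error. */
      VALGRIND_MAKE_MEM_NOACCESS(buf, len);
  }

  static void mark_recv_complete(void *buf, const void *wire_data, size_t len)
  {
      /* Once the data has arrived, make the buffer addressable and
         defined again, then copy the received bytes into it. */
      VALGRIND_MAKE_MEM_DEFINED(buf, len);
      memcpy(buf, wire_data, len);
  }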
On Sat, Nov 5, 2016 at 9:48 PM, Gilles Gouaillardet
Hi,
Note your printf line is missing.
If you printf l_prev, then the Valgrind error occurs in all variants.
At first glance, it looks like a false positive, and I will investigate it.
Cheers,
Gilles
Hello,
I have observed what seem to be false positives running under Valgrind
when Open MPI is built with --enable-memchecker
(at least with versions 1.10.4 and 2.0.1).
Attached is a simple test case (extracted from a larger code) that sends one
int to rank r+1, and receives from rank r-1
(using MPI_COMM_NULL to handle ranks below 0 or above comm size).
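(The attached vg_mpi.c is not reproduced in this archive. The sketch below only
illustrates the kind of exchange described; the use of MPI_PROC_NULL for
out-of-range neighbours, the buffer layout and the call order are guesses, not
the actual test case.)

  /* Rough sketch of the described exchange, not the actual vg_mpi.c:
   * each rank sends one int to rank+1 and receives one int from rank-1.
   * The real test case also has stack/heap buffer variants (VARIANT_1,
   * VARIANT_2); only a stack form is shown here. */
  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char *argv[])
  {
      int rank, size, tag = 1;
      int l, l_prev = 0;
      MPI_Status status;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      l = rank;
      int rank_prev = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
      int rank_next = (rank + 1 < size) ? rank + 1 : MPI_PROC_NULL;

      MPI_Recv(&l_prev, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &status);
      MPI_Send(&l, 1, MPI_INT, rank_next, tag, MPI_COMM_WORLD);

      if (rank_prev != MPI_PROC_NULL)
          printf("rank %d received %d from rank %d\n", rank, l_prev, rank_prev);

      MPI_Finalize();
      return 0;
  }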
~/opt/openmpi-2.0/bin/mpicc -DVARIANT_1 vg_mpi.c
~/opt/openmpi-2.0/bin/mpiexec -output-filename vg_log -n 2 valgrind ./a.out
==8382== Invalid read of size 4
==8382==    at 0x400A00: main (in /home/yvan/test/a.out)
==8382==  Address 0xffefffe70 is on thread 1's stack
==8382==  in frame #0, created by main (???:)
~/opt/openmpi-2.0/bin/mpicc -DVARIANT_2 vg_mpi.c
~/opt/openmpi-2.0/bin/mpiexec -output-filename vg_log -n 2 valgrind ./a.out
==8322== Invalid read of size 4
==8322==    at 0x400A6C: main (in /home/yvan/test/a.out)
==8322==  Address 0xcb6f9a0 is 0 bytes inside a block of size 4 alloc'd
==8322==    at 0x4C29BBE: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8322==    by 0x400998: main (in /home/yvan/test/a.out)
I get no error for the default variant (no -DVARIANT_...) with either Open
MPI 2.0.1 or 1.10.4,
but I do get an error similar to that of variant 1 in the parent code from
which the example was extracted.
Running under Valgrind's gdb server on the parent code of variant 1,
it even seems the value received on rank 1 is uninitialized, and Valgrind
then complains with the given message.
The code fails to work as intended when run under Valgrind with Open MPI
built with --enable-memchecker,
while it works fine when run with the same build but not under Valgrind,
or when run under Valgrind with Open MPI built without memchecker.
I'm running under Arch Linux (whose packaged Open MPI 1.10.4 is built
with memchecker enabled,
rendering it unusable under Valgrind).
Did anybody else encounter this type of issue, or does my code contain
an obvious mistake that I am missing?
I initially thought of possible alignment issues, but saw nothing in the
standard that requires that,
and the "malloc"-based variant exhibits the same behavior, while I assume
64-bit alignment for allocated arrays is the default.
Best regards,
  Yvan Fournier
------------------------------
Message: 6
Date: Sat, 5 Nov 2016 23:12:54 +0900
Subject: Re: [OMPI users] False positives and even failure with Open MPI and memchecker
So it seems we took some shortcuts in pml/ob1.
The attached patch (for the v1.10 branch) should fix this issue.
Cheers,
Gilles
-------------- next part --------------
diff --git a/ompi/mca/pml/ob1/pml_ob1_irecv.c b/ompi/mca/pml/ob1/pml_ob1_irecv.c
index 56826a2..97a6a38 100644
--- a/ompi/mca/pml/ob1/pml_ob1_irecv.c
+++ b/ompi/mca/pml/ob1/pml_ob1_irecv.c
@@ -30,6 +30,7 @@
 #include "pml_ob1_recvfrag.h"
 #include "ompi/peruse/peruse-internal.h"
 #include "ompi/message/message.h"
+#include "ompi/memchecker.h"
 
 mca_pml_ob1_recv_request_t *mca_pml_ob1_recvreq = NULL;
 
@@ -128,6 +129,17 @@ int mca_pml_ob1_recv(void *addr,
 
     rc = recvreq->req_recv.req_base.req_ompi.req_status.MPI_ERROR;
 
+    if (recvreq->req_recv.req_base.req_pml_complete) {
+        /* make buffer defined when the request is completed,
+           and before releasing the objects. */
+        MEMCHECKER(
+            memchecker_call(&opal_memchecker_base_mem_defined,
+                            recvreq->req_recv.req_base.req_addr,
+                            recvreq->req_recv.req_base.req_count,
+                            recvreq->req_recv.req_base.req_datatype);
+        );
+    }
+
 #if OMPI_ENABLE_THREAD_MULTIPLE
     MCA_PML_OB1_RECV_REQUEST_RETURN(recvreq);
 #else
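(A side note on using the fix, not part of the patch: once the receive buffer
is marked defined again on completion, an application can double-check this
from its own code with a Memcheck client request, as in the hypothetical
helper below; the request is a no-op when the program is not running under
Valgrind.)

  #include <stdio.h>
  #include <valgrind/memcheck.h>

  /* Hypothetical helper: ask Memcheck whether a just-received buffer is
     now fully addressable and defined. */
  static void check_recv_buffer(const void *buf, size_t len)
  {
      unsigned long first_bad = VALGRIND_CHECK_MEM_IS_DEFINED(buf, len);
      if (first_bad != 0)
          fprintf(stderr, "receive buffer still undefined starting at %p\n",
                  (void *) first_bad);
  }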
------------------------------
y***@free.fr
2018-01-07 00:43:16 UTC
Hello,

I obtain false positives with Open MPI 3.0.0 when memchecker is enabled.

This is similar to an issue I reported that was fixed in Nov. 2016, but it affects MPI_Isend/MPI_Irecv instead of MPI_Send/MPI_Recv.
I had not done much additional testing of my application with memchecker since then, so I may have missed remaining issues at the time.

In the attached test (which has 2 optional variants relating to whether the send and receive buffers are allocated on the stack or on the heap, but which exhibit the same basic issue), I get the following (running "mpicc -g vg_ompi_isend_irecv.c && mpiexec -n 2 valgrind ./a.out"):

==19651== Memcheck, a memory error detector
==19651== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==19651== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==19651== Command: ./a.out
==19651==
==19650== Thread 3:
==19650== Syscall param epoll_pwait(sigmask) points to unaddressable byte(s)
==19650== at 0x5470596: epoll_pwait (in /usr/lib/libc-2.26.so)
==19650== by 0x5A5A9FA: epoll_dispatch (epoll.c:407)
==19650== by 0x5A5EA9A: opal_libevent2022_event_base_loop (event.c:1630)
==19650== by 0x94C96ED: progress_engine (in /home/yvan/opt/openmpi-3.0/lib/openmpi/mca_pmix_pmix2x.so)
==19650== by 0x5163089: start_thread (in /usr/lib/libpthread-2.26.so)
==19650== by 0x547042E: clone (in /usr/lib/libc-2.26.so)
==19650== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==19650==
==19651== Thread 3:
==19651== Syscall param epoll_pwait(sigmask) points to unaddressable byte(s)
==19651== at 0x5470596: epoll_pwait (in /usr/lib/libc-2.26.so)
==19651== by 0x5A5A9FA: epoll_dispatch (epoll.c:407)
==19651== by 0x5A5EA9A: opal_libevent2022_event_base_loop (event.c:1630)
==19651== by 0x94C96ED: progress_engine (in /home/yvan/opt/openmpi-3.0/lib/openmpi/mca_pmix_pmix2x.so)
==19651== by 0x5163089: start_thread (in /usr/lib/libpthread-2.26.so)
==19651== by 0x547042E: clone (in /usr/lib/libc-2.26.so)
==19651== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==19651==
==19650== Thread 1:
==19650== Invalid read of size 2
==19650== at 0x4C33BA0: memmove (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19650== by 0x5A27C85: opal_convertor_pack (in /home/yvan/opt/openmpi-3.0/lib/libopen-pal.so.40.0.0)
==19650== by 0xD177EF1: mca_btl_vader_sendi (in /home/yvan/opt/openmpi-3.0/lib/openmpi/mca_btl_vader.so)
==19650== by 0xE1A7F31: mca_pml_ob1_send_inline.constprop.4 (in /home/yvan/opt/openmpi-3.0/lib/openmpi/mca_pml_ob1.so)
==19650== by 0xE1A8711: mca_pml_ob1_isend (in /home/yvan/opt/openmpi-3.0/lib/openmpi/mca_pml_ob1.so)
==19650== by 0x4EB4C83: PMPI_Isend (in /home/yvan/opt/openmpi-3.0/lib/libmpi.so.40.0.0)
==19650== by 0x108B24: main (vg_ompi_isend_irecv.c:63)
==19650== Address 0x1ffefffcc4 is on thread 1's stack
==19650== in frame #6, created by main (vg_ompi_isend_irecv.c:7)

The first 2 warnings seem to relate to initialization, so they are not a big issue, but the last one occurs whenever I use MPI_Isend, so it is a more important issue.

Using a version built without --enable-memchecker, I also have the two initialization warnings, but not the warning from MPI_Isend...
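(The attached vg_ompi_isend_irecv.c is not included in this archive. The sketch
below only illustrates the kind of non-blocking exchange described; the
neighbour handling and buffer layout are guesses, not the actual test case.)

  /* Rough sketch of the described non-blocking exchange, not the actual
   * vg_ompi_isend_irecv.c: each rank posts an MPI_Irecv from rank-1 and
   * an MPI_Isend to rank+1, then waits on both requests. */
  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char *argv[])
  {
      int rank, size, tag = 1;
      int l, l_prev = 0;
      MPI_Request req[2];

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      l = rank;
      int rank_prev = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
      int rank_next = (rank + 1 < size) ? rank + 1 : MPI_PROC_NULL;

      MPI_Irecv(&l_prev, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &req[0]);
      MPI_Isend(&l,      1, MPI_INT, rank_next, tag, MPI_COMM_WORLD, &req[1]);
      MPI_Waitall(2, req, MPI_STATUSES_IGNORE);

      if (rank_prev != MPI_PROC_NULL)
          printf("rank %d received %d from rank %d\n", rank, l_prev, rank_prev);

      MPI_Finalize();
      return 0;
  }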

Best regards,

Yvan Fournier
George Bosilca
2018-01-07 00:56:56 UTC
Hi Yvan,

You mention a test. Can you make it available on the mailing list, in a
GitHub issue, or privately?

Thanks,
George.
y***@free.fr
2018-01-07 00:52:04 UTC
Hello,

Sorry, I forgot to attach the test case to my previous message... :(

Best regards,

Yvan Fournier

y***@free.fr
2018-01-07 01:26:04 UTC
Hello,

Answering myself here: checking the revision history, commits
3b8b8c52c519f64cb3ff147db49fcac7cbd0e7d7 or 66c9485e77f7da9a212ae67c88a21f95f13e6652 (in master) seem to relate to this, so I checked using the latest downloadable 3.0.x nightly release, and I no longer reproduce the issue...

Sorry for the (too-late) report...

Yvan

