Discussion:
[OMPI users] opal_pmix_base_select failed for master and 4.0.0
Siegmar Gross
2018-10-02 07:00:38 UTC
Hi,

Yesterday I installed openmpi-v4.0.x-201809290241-a7e275c and
openmpi-master-201805080348-b39bbfb on my "SUSE Linux Enterprise Server
12.3 (x86_64)" with Sun C 5.15, gcc 6.4.0, Intel icc 18.0.3, and Portland
Group pgcc 18.4-0. Unfortunately, I get the following error for all seven
installed versions (Sun C couldn't build master, as I mentioned in another
email).


loki hello_1 118 mpiexec -np 4 --host loki:2,nfs2:2 hello_1_mpi
[loki:11423] [[45859,0],0] ORTE_ERROR_LOG: Not found in file
../../../../../openmpi-v4.0.x-201809290241-a7e275c/orte/mca/ess/hnp/ess_hnp_module.c
at line 321
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_pmix_base_select failed
--> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
loki hello_1 119



I would be grateful if somebody could fix the problem. Do you need anything
else? Thank you very much in advance for any help.


Kind regards

Siegmar
Ralph H Castain
2018-10-02 13:36:49 UTC
Looks like PMIx failed to build - can you send the config.log?
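One quick way to check whether any PMIx component was installed at all is to list the plugin files in the install tree. The sketch below is only an illustration: the helper name installed_components is hypothetical, and it assumes that components were built as loadable plugins named mca_<framework>_<component>.so under lib/openmpi or lib64/openmpi (the layout visible elsewhere in this thread for mca_rtc_hwloc.so). A build that links components statically into the libraries would show nothing here even when it is healthy.

```python
from pathlib import Path


def installed_components(prefix, framework):
    """List component names found as mca_<framework>_<name>.so under prefix.

    Assumption: components are installed as plugin files in
    <prefix>/lib/openmpi or <prefix>/lib64/openmpi.
    """
    marker = f"mca_{framework}_"
    names = set()
    for libdir in ("lib", "lib64"):
        plugin_dir = Path(prefix) / libdir / "openmpi"
        if not plugin_dir.is_dir():
            continue
        for plugin in plugin_dir.glob(f"{marker}*.so"):
            # "mca_rtc_hwloc.so" -> stem "mca_rtc_hwloc" -> component "hwloc"
            names.add(plugin.stem[len(marker):])
    return sorted(names)
```

Calling installed_components(prefix, "pmix") on the install prefix and getting an empty list would be consistent with the "opal_pmix_base_select failed" message, since there would be no PMIx component to select.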
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
Ralph H Castain
2018-10-02 17:48:06 UTC
So the problem is here when configuring the internal PMIx code:

configure:3383: === HWLOC
configure:36189: checking for hwloc in
configure:36201: result: Could not find internal/lib or internal/lib64
configure:36203: error: Can not continue

Can you confirm that HWLOC built? I believe we require it, but perhaps something is different about this environment.
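For anyone who wants to locate this kind of failure in their own config.log, here is a minimal sketch. It assumes the log uses the "configure:<line>: <message>" form shown above; the function name configure_errors and the context parameter are made up for this example.

```python
import re

# Matches lines such as "configure:36203: error: Can not continue"
CONFIGURE_LINE = re.compile(r"^configure:(\d+): (.*)$")


def configure_errors(log_text, context=2):
    """Return (line_number, message, preceding_messages) for each error entry.

    preceding_messages holds up to `context` configure messages that were
    logged immediately before the error, which usually name the failed check.
    """
    entries = []
    for raw in log_text.splitlines():
        match = CONFIGURE_LINE.match(raw.strip())
        if match:
            entries.append((int(match.group(1)), match.group(2)))

    found = []
    for i, (lineno, message) in enumerate(entries):
        if message.startswith("error:"):
            before = [text for _, text in entries[max(0, i - context):i]]
            found.append((lineno, message, before))
    return found
```

Run on the four lines quoted above, this reports the error at configure line 36203 together with the two messages before it, including the "Could not find internal/lib or internal/lib64" result.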
Ralph H Castain
2018-10-02 21:25:14 UTC
Hi Siegmar

I honestly have no idea - for some reason, the PMIx component isn’t seeing the internal hwloc code in your environment.

Jeff, Brice - any ideas?
Post by Siegmar Gross
Hi Ralph,
how can I confirm that HWLOC built? Some hwloc files are available
in the build directory.
loki openmpi-master-201809290304-73075b8-Linux.x86_64.64_gcc 111 find . -name '*hwloc*'
./opal/mca/btl/usnic/.deps/btl_usnic_hwloc.Plo
./opal/mca/hwloc
./opal/mca/hwloc/external/.deps/hwloc_external_component.Plo
./opal/mca/hwloc/base/hwloc_base_frame.lo
./opal/mca/hwloc/base/.deps/hwloc_base_dt.Plo
./opal/mca/hwloc/base/.deps/hwloc_base_maffinity.Plo
./opal/mca/hwloc/base/.deps/hwloc_base_frame.Plo
./opal/mca/hwloc/base/.deps/hwloc_base_util.Plo
./opal/mca/hwloc/base/hwloc_base_dt.lo
./opal/mca/hwloc/base/hwloc_base_util.lo
./opal/mca/hwloc/base/hwloc_base_maffinity.lo
./opal/mca/hwloc/base/.libs/hwloc_base_util.o
./opal/mca/hwloc/base/.libs/hwloc_base_dt.o
./opal/mca/hwloc/base/.libs/hwloc_base_maffinity.o
./opal/mca/hwloc/base/.libs/hwloc_base_frame.o
./opal/mca/hwloc/.libs/libmca_hwloc.la
./opal/mca/hwloc/.libs/libmca_hwloc.a
./opal/mca/hwloc/libmca_hwloc.la
./opal/mca/hwloc/hwloc201
./opal/mca/hwloc/hwloc201/.deps/hwloc201_component.Plo
./opal/mca/hwloc/hwloc201/hwloc201_component.lo
./opal/mca/hwloc/hwloc201/hwloc
./opal/mca/hwloc/hwloc201/hwloc/include/hwloc
./opal/mca/hwloc/hwloc201/hwloc/hwloc
./opal/mca/hwloc/hwloc201/hwloc/hwloc/libhwloc_embedded.la
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_pci_la-topology-pci.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_gl_la-topology-gl.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_cuda_la-topology-cuda.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_xml_libxml_la-topology-xml-libxml.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_opencl_la-topology-opencl.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_nvml_la-topology-nvml.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.libs/libhwloc_embedded.la
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.libs/libhwloc_embedded.a
./opal/mca/hwloc/hwloc201/.libs/hwloc201_component.o
./opal/mca/hwloc/hwloc201/.libs/libmca_hwloc_hwloc201.la
./opal/mca/hwloc/hwloc201/.libs/libmca_hwloc_hwloc201.a
./opal/mca/hwloc/hwloc201/libmca_hwloc_hwloc201.la
./orte/mca/rtc/hwloc
./orte/mca/rtc/hwloc/rtc_hwloc.lo
./orte/mca/rtc/hwloc/.deps/rtc_hwloc.Plo
./orte/mca/rtc/hwloc/.deps/rtc_hwloc_component.Plo
./orte/mca/rtc/hwloc/mca_rtc_hwloc.la
./orte/mca/rtc/hwloc/.libs/mca_rtc_hwloc.so
./orte/mca/rtc/hwloc/.libs/mca_rtc_hwloc.la
./orte/mca/rtc/hwloc/.libs/rtc_hwloc.o
./orte/mca/rtc/hwloc/.libs/rtc_hwloc_component.o
./orte/mca/rtc/hwloc/.libs/mca_rtc_hwloc.soT
./orte/mca/rtc/hwloc/.libs/mca_rtc_hwloc.lai
./orte/mca/rtc/hwloc/rtc_hwloc_component.lo
loki openmpi-master-201809290304-73075b8-Linux.x86_64.64_gcc 112
And some files are available in the install directory.
loki openmpi-master_64_gcc 116 find . -name '*hwloc*'
./share/openmpi/help-orte-rtc-hwloc.txt
./share/openmpi/help-opal-hwloc-base.txt
./lib64/openmpi/mca_rtc_hwloc.so
./lib64/openmpi/mca_rtc_hwloc.la
loki openmpi-master_64_gcc 117
I don't see any missing libraries, so the only available
hwloc library should work.
loki openmpi 126 ldd -v mca_rtc_hwloc.so
linux-vdso.so.1 (0x00007ffd2df5b000)
libopen-rte.so.0 => /usr/local/openmpi-master_64_gcc/lib64/libopen-rte.so.0 (0x00007f082b7fb000)
libopen-pal.so.0 => /usr/local/openmpi-master_64_gcc/lib64/libopen-pal.so.0 (0x00007f082b493000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f082b28f000)
libudev.so.1 => /usr/lib64/libudev.so.1 (0x00007f082b06e000)
libpciaccess.so.0 => /usr/lib64/libpciaccess.so.0 (0x00007f082ae64000)
librt.so.1 => /lib64/librt.so.1 (0x00007f082ac5c000)
libm.so.6 => /lib64/libm.so.6 (0x00007f082a95f000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007f082a75c000)
libz.so.1 => /lib64/libz.so.1 (0x00007f082a546000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f082a329000)
libc.so.6 => /lib64/libc.so.6 (0x00007f0829f84000)
libgcc_s.so.1 => /usr/local/gcc-8.2.0/lib64/libgcc_s.so.1 (0x00007f0829d6c000)
/lib64/ld-linux-x86-64.so.2 (0x00007f082bd24000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f0829b46000)
libcap.so.2 => /lib64/libcap.so.2 (0x00007f0829941000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f082972a000)
libpcre.so.1 => /usr/lib64/libpcre.so.1 (0x00007f08294bb000)
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libz.so.1 (ZLIB_1.2.0) => /lib64/libz.so.1
libpthread.so.0 (GLIBC_2.3.2) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
librt.so.1 (GLIBC_2.2.5) => /lib64/librt.so.1
libgcc_s.so.1 (GCC_3.0) => /usr/local/gcc-8.2.0/lib64/libgcc_s.so.1
libgcc_s.so.1 (GCC_3.3.1) => /usr/local/gcc-8.2.0/lib64/libgcc_s.so.1
libdl.so.2 (GLIBC_2.2.5) => /lib64/libdl.so.2
libutil.so.1 (GLIBC_2.2.5) => /lib64/libutil.so.1
libudev.so.1 (LIBUDEV_183) => /usr/lib64/libudev.so.1
libm.so.6 (GLIBC_2.2.5) => /lib64/libm.so.6
libpthread.so.0 (GLIBC_2.3.4) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.3.2) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.6) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
librt.so.1 (GLIBC_2.2.5) => /lib64/librt.so.1
ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.9) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.16) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libpthread.so.0 (GLIBC_2.3.2) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_PRIVATE) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libdl.so.2 (GLIBC_2.2.5) => /lib64/libdl.so.2
ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
loki openmpi 127
Hopefully that helps to find the problem. I will answer your emails
tomorrow if you need anything else.
Best regards
Siegmar
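Scanning a long ldd listing by eye is error-prone, so the check for unavailable libraries can also be automated. The following is a rough sketch, assuming the usual ldd convention of printing "<name> => not found" for a dependency that cannot be resolved; the function name missing_libraries is hypothetical.

```python
def missing_libraries(ldd_output):
    """Return the names that ldd reports as 'not found', in first-seen order."""
    missing = []
    for line in ldd_output.splitlines():
        # Lines look like "libm.so.6 => /lib64/libm.so.6 (0x...)" or,
        # for an unresolved dependency, "libfoo.so.1 => not found".
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.startswith("not found") and name not in missing:
            missing.append(name)
    return missing
```

An empty result for the output above would agree with the observation that every dependency of mca_rtc_hwloc.so resolves. It says nothing about whether the PMIx component itself was built.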
Jeff Squyres (jsquyres) via users
2018-10-02 21:50:39 UTC
(Ralph sent me Siegmar's pmix config.log, which Siegmar sent to him off-list)

It looks like Siegmar passed --with-hwloc=internal.

Open MPI's configure understood this and did the appropriate things.
PMIx's configure didn't.

I think we need to add an adjustment into the PMIx configure.m4 in OMPI...
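To illustrate the mismatch, here is a small sketch of how a check that treats the option value as an install prefix ends up looking for internal/lib and internal/lib64. This is not the actual PMIx configure logic and not the eventual fix. Both function names are hypothetical, and treating "internal" and "external" as reserved keywords is an assumption made for this example.

```python
from pathlib import Path

# Assumption for this sketch: these values are keywords, not directories.
RESERVED = {"internal", "external"}


def hwloc_libdir_naive(with_hwloc):
    """Treat the option value as an install prefix and look for its lib dir."""
    for sub in ("lib", "lib64"):
        candidate = Path(with_hwloc) / sub
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError(
        f"Could not find {with_hwloc}/lib or {with_hwloc}/lib64"
    )


def hwloc_libdir_keyword_aware(with_hwloc):
    """Skip the prefix lookup for reserved keywords.

    Returns None when the value is a keyword, meaning there is no
    user-supplied prefix to search; otherwise falls back to the naive lookup.
    """
    if with_hwloc in RESERVED:
        return None
    return hwloc_libdir_naive(with_hwloc)
```

With the value "internal", the naive version fails with the same message seen in the config.log, while the keyword-aware version leaves the decision to whatever logic handles the embedded hwloc.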
--
Jeff Squyres
***@cisco.com
Ralph H Castain
2018-10-03 18:14:20 UTC
Jeff and I talked and believe the patch in https://github.com/open-mpi/ompi/pull/5836 should fix the problem.
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
librt.so.1 (GLIBC_2.2.5) => /lib64/librt.so.1
ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.9) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.16) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libpthread.so.0 (GLIBC_2.3.2) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_PRIVATE) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libdl.so.2 (GLIBC_2.2.5) => /lib64/libdl.so.2
ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
loki openmpi 127
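The long ldd listing above can be reduced to the one question that matters here, namely whether any dependency is unresolved. A minimal sketch, assuming a glibc-style ldd that reports missing libraries with the text "not found"; the helper name unresolved_libs is made up for this illustration:

```shell
#!/bin/sh
# Read `ldd` output on stdin and print only the unresolved dependencies.
# Returns 0 if at least one library is reported as "not found".
unresolved_libs() {
    grep 'not found'
}

# Usage against the plugin examined above (run inside lib64/openmpi).
if [ -f mca_rtc_hwloc.so ]; then
    ldd mca_rtc_hwloc.so | unresolved_libs || echo "all dependencies resolved"
fi
```

An empty result only shows that the plugin can be loaded; it says nothing about whether the PMIx component itself was built.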
Hopefully that helps to find the problem. I will answer your emails
tomorrow if you need anything else.
Best regards
Siegmar
Post by Ralph H Castain
configure:3383: === HWLOC
configure:36189: checking for hwloc in
configure:36201: result: Could not find internal/lib or internal/lib64
configure:36203: error: Can not continue
Can you confirm that HWLOC built? I believe we require it, but perhaps something is different about this environment.
Post by Ralph H Castain
Looks like PMIx failed to build - can you send the config.log?
Post by Siegmar Gross
Hi,
Yesterday I installed openmpi-v4.0.x-201809290241-a7e275c and
openmpi-master-201805080348-b39bbfb on my "SUSE Linux Enterprise Server
12.3 (x86_64)" with Sun C 5.15, gcc 6.4.0, Intel icc 18.0.3, and Portland
Group pgcc 18.4-0. Unfortunately, I get the following error for all seven
installed versions (Sun C couldn't build master, as I mentioned in another
email).
loki hello_1 118 mpiexec -np 4 --host loki:2,nfs2:2 hello_1_mpi
[loki:11423] [[45859,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v4.0.x-201809290241-a7e275c/orte/mca/ess/hnp/ess_hnp_module.c at line 321
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
opal_pmix_base_select failed
--> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
loki hello_1 119
I would be grateful if somebody could fix the problem. Do you need anything
else? Thank you very much in advance for any help.
Kind regards
Siegmar
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
--
Jeff Squyres
Siegmar Gross
2018-10-05 09:04:22 UTC
Permalink
Hi Ralph, hi Jeff,
Post by Ralph H Castain
Jeff and I talked and believe the patch in https://github.com/open-mpi/ompi/pull/5836 should fix the problem.
Today I installed openmpi-master-201810050304-5f1c940 and
openmpi-v4.0.x-201810050241-c079666. Unfortunately, I still get the
same error for all seven versions that I was able to build.

loki hello_1 114 mpicc --showme
gcc -I/usr/local/openmpi-master_64_gcc/include -fexceptions -pthread -std=c11
-m64 -Wl,-rpath -Wl,/usr/local/openmpi-master_64_gcc/lib64
-Wl,--enable-new-dtags -L/usr/local/openmpi-master_64_gcc/lib64 -lmpi

loki hello_1 115 ompi_info | grep "Open MPI repo revision"
Open MPI repo revision: v2.x-dev-6262-g5f1c940

loki hello_1 116 mpicc hello_1_mpi.c

loki hello_1 117 mpiexec -np 2 a.out
[loki:25575] [[64603,0],0] ORTE_ERROR_LOG: Not found in file
../../../../../openmpi-master-201810050304-5f1c940/orte/mca/ess/hnp/ess_hnp_module.c
at line 320
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_pmix_base_select failed
--> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
loki hello_1 118


I don't know whether your suggested patch has already been applied or
whether the error message still comes from a version without that patch.
Do you need anything else?


Best regards

Siegmar
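One way to narrow this down is to check whether the installed tree contains a PMIx component at all. A minimal sketch, assuming components are installed as loadable plugins (as mca_rtc_hwloc.so is in the listing in this thread) and that PMIx plugins follow the same mca_<framework>_<name> naming; the helper name list_pmix_components and the mca_pmix* pattern are assumptions, not something confirmed in the thread:

```shell
#!/bin/sh
# List installed PMIx MCA plugin files under an Open MPI component directory.
# Prints nothing and returns 1 when none are found.
list_pmix_components() {
    dir="$1"
    found=$(find "$dir" -name 'mca_pmix*' 2>/dev/null)
    [ -n "$found" ] || return 1
    printf '%s\n' "$found"
}

# Install path taken from the mpicc --showme output above; adjust as needed.
list_pmix_components /usr/local/openmpi-master_64_gcc/lib64/openmpi \
    || echo "no mca_pmix* files found"
```

If nothing is listed, opal_pmix_base_select has no component to choose from, which would match the "Not found (-13)" return value. A build with statically linked components would also list nothing, so the result is only a hint.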
Post by Ralph H Castain
Post by Jeff Squyres (jsquyres) via users
(Ralph sent me Siegmar's pmix config.log, which Siegmar sent to him off-list)
It looks like Siegmar passed --with-hwloc=internal.
Open MPI's configure understood this and did the appropriate things.
PMIX's configure didn't.
I think we need to add an adjustment into the PMIx configure.m4 in OMPI...
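The config.log message "Could not find internal/lib or internal/lib64" is consistent with the keyword internal being treated as an installation prefix. The following is a purely illustrative sketch of such a prefix-style check; it is not the actual PMIx configure code, and the function name find_hwloc_libdir is hypothetical:

```shell
#!/bin/sh
# Hypothetical illustration only: NOT the real PMIx configure logic.
# Treats its argument as an install prefix and looks for lib or lib64 below it.
find_hwloc_libdir() {
    prefix="$1"
    if [ -d "$prefix/lib" ]; then
        echo "$prefix/lib"
    elif [ -d "$prefix/lib64" ]; then
        echo "$prefix/lib64"
    else
        echo "Could not find $prefix/lib or $prefix/lib64" >&2
        return 1
    fi
}

# "internal" is a keyword, not a directory, so a prefix-style check rejects it.
find_hwloc_libdir internal || echo "prefix-style check rejected: internal"
```

A check of this kind only succeeds when it is given a real directory, which is why a keyword that Open MPI's own configure accepts can still fail in an embedded configure step that does not special-case it.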
Ralph H Castain
2018-10-05 09:45:28 UTC
Permalink
Please send Jeff and me the opal/mca/pmix/pmix4x/pmix/config.log again - we'll need to see why it isn't building. The patch is definitely not in the v4.0 branch, but it should be in master.
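Before sending the log, the fatal configure lines can be pulled out directly. A minimal sketch, assuming errors are logged in the "configure:<line>: error:" form seen in the excerpt quoted earlier in this thread; the helper name show_configure_errors is made up for this example:

```shell
#!/bin/sh
# Print configure error lines (with line numbers) from a config.log.
# Returns 2 if the file is missing and 1 if it has no matching lines.
show_configure_errors() {
    log="$1"
    [ -f "$log" ] || { echo "no such file: $log" >&2; return 2; }
    grep -n '^configure:[0-9]*: error:' "$log"
}

# Path as named in the message above, relative to the build directory.
show_configure_errors opal/mca/pmix/pmix4x/pmix/config.log \
    || echo "no configure errors found, or config.log missing"
```

The lines just before each match usually name the check that failed, so the full config.log is still needed for a proper diagnosis.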
--------------------------------------------------------------------------
loki hello_1 119
I would be grateful, if somebody can fix the problem. Do you need anything
else? Thank you very much for any help in advance.
Kind regards
Siegmar
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
--
Jeff Squyres
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users <https://lists.open-mpi.org/mailman/listinfo/users>
Jeff Squyres (jsquyres) via users
2018-10-05 14:33:55 UTC
Permalink
Oops! We had a typo in yesterday's fix -- fixed:

https://github.com/open-mpi/ompi/pull/5847

Ralph also put double extra super protection to make triple sure that this error can't happen again in:

https://github.com/open-mpi/ompi/pull/5846

Both of these should be in tonight's nightly snapshot.

Thank you!
Please send Jeff and me the opal/mca/pmix/pmix4x/pmix/config.log again - we’ll need to see why it isn’t building. The patch definitely is not in the v4.0 branch, but it should have been in master.
Post by Siegmar Gross
Hi Ralph, hi Jeff,
Post by Ralph H Castain
Jeff and I talked and believe the patch in https://github.com/open-mpi/ompi/pull/5836 should fix the problem.
Today I installed openmpi-master-201810050304-5f1c940 and
openmpi-v4.0.x-201810050241-c079666. Unfortunately, I still get the
same error for all seven versions that I was able to build.
loki hello_1 114 mpicc --showme
gcc -I/usr/local/openmpi-master_64_gcc/include -fexceptions -pthread -std=c11 -m64 -Wl,-rpath -Wl,/usr/local/openmpi-master_64_gcc/lib64 -Wl,--enable-new-dtags -L/usr/local/openmpi-master_64_gcc/lib64 -lmpi
loki hello_1 115 ompi_info | grep "Open MPI repo revision"
Open MPI repo revision: v2.x-dev-6262-g5f1c940
loki hello_1 116 mpicc hello_1_mpi.c
loki hello_1 117 mpiexec -np 2 a.out
[loki:25575] [[64603,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-master-201810050304-5f1c940/orte/mca/ess/hnp/ess_hnp_module.c at line 320
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
opal_pmix_base_select failed
--> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
loki hello_1 118
I don't know whether you have already applied your suggested patch or whether
the error message still comes from a version without that patch. Do you need
anything else?
Best regards
Siegmar
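One way to tell which sources an installation was built from is to compare the commit hash at the end of the snapshot name with the "Open MPI repo revision" line shown above. The following is a minimal sketch; the function name is hypothetical, and it assumes that the revision string ends in "-g<hash>" (as "v2.x-dev-6262-g5f1c940" does) and that the snapshot name ends in "-<hash>". It only identifies the snapshot; it does not show whether a given patch is contained in it.

```sh
# Hypothetical helper, not part of Open MPI.
# Succeeds when the short commit hash at the end of a snapshot name
# appears as "-g<hash>" at the end of the reported repo revision.
revision_matches_snapshot() {
    snapshot=$1
    revision=$2
    hash=${snapshot##*-}          # text after the last "-", e.g. 5f1c940
    case "$revision" in
        *"-g$hash") return 0 ;;
        *)          return 1 ;;
    esac
}

# Possible use (requires ompi_info in PATH):
#   rev=$(ompi_info | sed -n 's/.*Open MPI repo revision: *//p')
#   revision_matches_snapshot openmpi-master-201810050304-5f1c940 "$rev" \
#       && echo "installation matches the snapshot"
```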
Post by Ralph H Castain
Post by Jeff Squyres (jsquyres) via users
(Ralph sent me Siegmar's pmix config.log, which Siegmar sent to him off-list)
It looks like Siegmar passed --with-hwloc=internal.
Open MPI's configure understood this and did the appropriate things.
PMIx's configure didn't.
I think we need to add an adjustment into the PMIx configure.m4 in OMPI...
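The mismatch described here can be illustrated with a small sketch. It is not the actual patch from the pull requests in this thread; the function name is hypothetical. It assumes that "internal" and "external" are keywords for --with-hwloc and that any other value is an installation prefix containing lib/ or lib64/. A configure script that skips the keyword branch would look for "internal/lib" and "internal/lib64", which matches the error in the PMIx config.log.

```sh
# Hypothetical helper, not taken from the Open MPI or PMIx sources.
# Classifies the value passed to --with-hwloc.
classify_hwloc_arg() {
    case "$1" in
        internal|external)
            # Keywords select a copy of hwloc; no directory lookup.
            echo "keyword"
            ;;
        *)
            # Anything else is interpreted as an installation prefix.
            if [ -d "$1/lib" ] || [ -d "$1/lib64" ]; then
                echo "prefix"
            else
                echo "invalid"
            fi
            ;;
    esac
}
```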
Post by Ralph H Castain
Hi Siegmar
I honestly have no idea - for some reason, the PMIx component isn’t seeing the internal hwloc code in your environment.
Jeff, Brice - any ideas?
Hi Ralph,
how can I confirm that HWLOC built? Some hwloc files are available
in the build directory.
loki openmpi-master-201809290304-73075b8-Linux.x86_64.64_gcc 111 find . -name '*hwloc*'
./opal/mca/btl/usnic/.deps/btl_usnic_hwloc.Plo
./opal/mca/hwloc
./opal/mca/hwloc/external/.deps/hwloc_external_component.Plo
./opal/mca/hwloc/base/hwloc_base_frame.lo
./opal/mca/hwloc/base/.deps/hwloc_base_dt.Plo
./opal/mca/hwloc/base/.deps/hwloc_base_maffinity.Plo
./opal/mca/hwloc/base/.deps/hwloc_base_frame.Plo
./opal/mca/hwloc/base/.deps/hwloc_base_util.Plo
./opal/mca/hwloc/base/hwloc_base_dt.lo
./opal/mca/hwloc/base/hwloc_base_util.lo
./opal/mca/hwloc/base/hwloc_base_maffinity.lo
./opal/mca/hwloc/base/.libs/hwloc_base_util.o
./opal/mca/hwloc/base/.libs/hwloc_base_dt.o
./opal/mca/hwloc/base/.libs/hwloc_base_maffinity.o
./opal/mca/hwloc/base/.libs/hwloc_base_frame.o
./opal/mca/hwloc/.libs/libmca_hwloc.la
./opal/mca/hwloc/.libs/libmca_hwloc.a
./opal/mca/hwloc/libmca_hwloc.la
./opal/mca/hwloc/hwloc201
./opal/mca/hwloc/hwloc201/.deps/hwloc201_component.Plo
./opal/mca/hwloc/hwloc201/hwloc201_component.lo
./opal/mca/hwloc/hwloc201/hwloc
./opal/mca/hwloc/hwloc201/hwloc/include/hwloc
./opal/mca/hwloc/hwloc201/hwloc/hwloc
./opal/mca/hwloc/hwloc201/hwloc/hwloc/libhwloc_embedded.la
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_pci_la-topology-pci.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_gl_la-topology-gl.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_cuda_la-topology-cuda.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_xml_libxml_la-topology-xml-libxml.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_opencl_la-topology-opencl.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_nvml_la-topology-nvml.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.libs/libhwloc_embedded.la
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.libs/libhwloc_embedded.a
./opal/mca/hwloc/hwloc201/.libs/hwloc201_component.o
./opal/mca/hwloc/hwloc201/.libs/libmca_hwloc_hwloc201.la
./opal/mca/hwloc/hwloc201/.libs/libmca_hwloc_hwloc201.a
./opal/mca/hwloc/hwloc201/libmca_hwloc_hwloc201.la
./orte/mca/rtc/hwloc
./orte/mca/rtc/hwloc/rtc_hwloc.lo
./orte/mca/rtc/hwloc/.deps/rtc_hwloc.Plo
./orte/mca/rtc/hwloc/.deps/rtc_hwloc_component.Plo
./orte/mca/rtc/hwloc/mca_rtc_hwloc.la
./orte/mca/rtc/hwloc/.libs/mca_rtc_hwloc.so
./orte/mca/rtc/hwloc/.libs/mca_rtc_hwloc.la
./orte/mca/rtc/hwloc/.libs/rtc_hwloc.o
./orte/mca/rtc/hwloc/.libs/rtc_hwloc_component.o
./orte/mca/rtc/hwloc/.libs/mca_rtc_hwloc.soT
./orte/mca/rtc/hwloc/.libs/mca_rtc_hwloc.lai
./orte/mca/rtc/hwloc/rtc_hwloc_component.lo
loki openmpi-master-201809290304-73075b8-Linux.x86_64.64_gcc 112
And some files are available in the install directory.
loki openmpi-master_64_gcc 116 find . -name '*hwloc*'
./share/openmpi/help-orte-rtc-hwloc.txt
./share/openmpi/help-opal-hwloc-base.txt
./lib64/openmpi/mca_rtc_hwloc.so
./lib64/openmpi/mca_rtc_hwloc.la
loki openmpi-master_64_gcc 117
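The same kind of search can be pointed at the PMIx framework, since the runtime error says that no PMIx component could be selected. The sketch below is only an illustration; the function name is hypothetical. It assumes that components are installed as plugins named mca_<framework>_<name>.so under <prefix>/lib64/openmpi or <prefix>/lib/openmpi, as mca_rtc_hwloc.so is in the listing above. Components that are linked into the main libraries instead of being built as plugins will not show up this way.

```sh
# Hypothetical helper, not part of Open MPI.
# Lists plugin components of one MCA framework under an install prefix.
list_mca_components() {
    prefix=$1
    framework=$2
    for dir in "$prefix/lib64/openmpi" "$prefix/lib/openmpi"; do
        [ -d "$dir" ] || continue
        # Print each plugin name without directory and ".so" suffix.
        find "$dir" -name "mca_${framework}_*.so" -exec basename {} .so \;
    done | sort
}

# Possible use:
#   list_mca_components /usr/local/openmpi-master_64_gcc pmix
# Under the assumptions above, empty output would be consistent with
# "opal_pmix_base_select failed".
```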
I don't see any missing libraries, so the only available
hwloc library should work.
loki openmpi 126 ldd -v mca_rtc_hwloc.so
linux-vdso.so.1 (0x00007ffd2df5b000)
libopen-rte.so.0 => /usr/local/openmpi-master_64_gcc/lib64/libopen-rte.so.0 (0x00007f082b7fb000)
libopen-pal.so.0 => /usr/local/openmpi-master_64_gcc/lib64/libopen-pal.so.0 (0x00007f082b493000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f082b28f000)
libudev.so.1 => /usr/lib64/libudev.so.1 (0x00007f082b06e000)
libpciaccess.so.0 => /usr/lib64/libpciaccess.so.0 (0x00007f082ae64000)
librt.so.1 => /lib64/librt.so.1 (0x00007f082ac5c000)
libm.so.6 => /lib64/libm.so.6 (0x00007f082a95f000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007f082a75c000)
libz.so.1 => /lib64/libz.so.1 (0x00007f082a546000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f082a329000)
libc.so.6 => /lib64/libc.so.6 (0x00007f0829f84000)
libgcc_s.so.1 => /usr/local/gcc-8.2.0/lib64/libgcc_s.so.1 (0x00007f0829d6c000)
/lib64/ld-linux-x86-64.so.2 (0x00007f082bd24000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f0829b46000)
libcap.so.2 => /lib64/libcap.so.2 (0x00007f0829941000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f082972a000)
libpcre.so.1 => /usr/lib64/libpcre.so.1 (0x00007f08294bb000)
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libz.so.1 (ZLIB_1.2.0) => /lib64/libz.so.1
libpthread.so.0 (GLIBC_2.3.2) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
librt.so.1 (GLIBC_2.2.5) => /lib64/librt.so.1
libgcc_s.so.1 (GCC_3.0) => /usr/local/gcc-8.2.0/lib64/libgcc_s.so.1
libgcc_s.so.1 (GCC_3.3.1) => /usr/local/gcc-8.2.0/lib64/libgcc_s.so.1
libdl.so.2 (GLIBC_2.2.5) => /lib64/libdl.so.2
libutil.so.1 (GLIBC_2.2.5) => /lib64/libutil.so.1
libudev.so.1 (LIBUDEV_183) => /usr/lib64/libudev.so.1
libm.so.6 (GLIBC_2.2.5) => /lib64/libm.so.6
libpthread.so.0 (GLIBC_2.3.4) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.3.2) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.6) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
librt.so.1 (GLIBC_2.2.5) => /lib64/librt.so.1
ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.9) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.16) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libpthread.so.0 (GLIBC_2.3.2) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_PRIVATE) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libdl.so.2 (GLIBC_2.2.5) => /lib64/libdl.so.2
ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
loki openmpi 127
Hopefully that helps to find the problem. I will answer your emails
tomorrow if you need anything else.
Best regards
Siegmar
Post by Ralph H Castain
configure:3383: === HWLOC
configure:36189: checking for hwloc in
configure:36201: result: Could not find internal/lib or internal/lib64
configure:36203: error: Can not continue
Can you confirm that HWLOC built? I believe we require it, but perhaps something is different about this environment.
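Lines like the ones quoted above can be pulled out of a config.log without reading the whole file. The following is a minimal sketch; the function name is hypothetical. It assumes that configure writes failures as "configure:<line>: error: ..." as in the excerpt, and it prints each such line together with the line before it, which here is the "result:" line that explains the failure.

```sh
# Hypothetical helper, not part of Open MPI.
# Prints every configure error line and the line that precedes it.
configure_errors() {
    awk '/^configure:[0-9]+: error:/ { if (prev != "") print prev; print }
         { prev = $0 }' "$1"
}

# Possible use, with the path mentioned in this thread:
#   configure_errors opal/mca/pmix/pmix4x/pmix/config.log
```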
Post by Ralph H Castain
Looks like PMIx failed to build - can you send the config.log?
Post by Siegmar Gross
Hi,
Yesterday I installed openmpi-v4.0.x-201809290241-a7e275c and
openmpi-master-201805080348-b39bbfb on my "SUSE Linux Enterprise Server
12.3 (x86_64)" with Sun C 5.15, gcc 6.4.0, Intel icc 18.0.3, and Portland
Group pgcc 18.4-0. Unfortunately, I get the following error for all seven
installed versions (Sun C couldn't build master, as I mentioned in another
email).
loki hello_1 118 mpiexec -np 4 --host loki:2,nfs2:2 hello_1_mpi
[loki:11423] [[45859,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v4.0.x-201809290241-a7e275c/orte/mca/ess/hnp/ess_hnp_module.c at line 321
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
opal_pmix_base_select failed
--> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
loki hello_1 119
I would be grateful if somebody could fix the problem. Do you need anything
else? Thank you very much in advance for any help.
Kind regards
Siegmar
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
--
Jeff Squyres
--
Jeff Squyres
***@cisco.com
Siegmar Gross
2018-10-06 09:12:02 UTC
Permalink
Hi Jeff, hi Ralph,

Great, it works again! Thank you very much for your help. I'll be really happy
if the undefined references for Sun C are resolved and there are no new
problems for that compiler :-)). Do you know when the PMIx patch will be
integrated into version 4.0.0?


Best regards

Siegmar
Ralph H Castain
2018-10-12 13:19:46 UTC
Permalink
Hi Siegmar

The patch was merged into the v4.0.0 branch on Oct 10th, so it should be available in the nightly tarballs from that date onward.
Post by Siegmar Gross
Hi Jeff, hi Ralph,
Great, it works again! Thank you very much for your help. I would be really
happy if the undefined references for Sun C were resolved and there were no
new problems for that compiler :-)). Do you know when the pmix patch will be
integrated into version 4.0.0?
Best regards
Siegmar
Post by Jeff Squyres (jsquyres) via users
https://github.com/open-mpi/ompi/pull/5847
https://github.com/open-mpi/ompi/pull/5846
Both of these should be in tonight's nightly snapshot.
Thank you!
Please send Jeff and me the opal/mca/pmix/pmix4x/pmix/config.log again - we’ll need to see why it isn’t building. The patch definitely is not in the v4.0 branch, but it should have been in master.
Post by Siegmar Gross
Hi Ralph, hi Jeff,
Post by Ralph H Castain
Jeff and I talked and believe the patch in https://github.com/open-mpi/ompi/pull/5836 should fix the problem.
Today I've installed openmpi-master-201810050304-5f1c940 and
openmpi-v4.0.x-201810050241-c079666. Unfortunately, I still get the
same error for all seven versions that I was able to build.
loki hello_1 114 mpicc --showme
gcc -I/usr/local/openmpi-master_64_gcc/include -fexceptions -pthread -std=c11 -m64 -Wl,-rpath -Wl,/usr/local/openmpi-master_64_gcc/lib64 -Wl,--enable-new-dtags -L/usr/local/openmpi-master_64_gcc/lib64 -lmpi
loki hello_1 115 ompi_info | grep "Open MPI repo revision"
Open MPI repo revision: v2.x-dev-6262-g5f1c940
loki hello_1 116 mpicc hello_1_mpi.c
loki hello_1 117 mpiexec -np 2 a.out
[loki:25575] [[64603,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-master-201810050304-5f1c940/orte/mca/ess/hnp/ess_hnp_module.c at line 320
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
opal_pmix_base_select failed
--> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
loki hello_1 118
I don't know whether you have already applied your suggested patch or whether
the error message is still from a version without that patch. Do you need
anything else?
Best regards
Siegmar
Post by Ralph H Castain
Post by Jeff Squyres (jsquyres) via users
(Ralph sent me Siegmar's pmix config.log, which Siegmar sent to him off-list)
It looks like Siegmar passed --with-hwloc=internal.
Open MPI's configure understood this and did the appropriate things.
PMIX's configure didn't.
I think we need to add an adjustment into the PMIx configure.m4 in OMPI...
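A minimal sketch of the mismatch described here, written in Python rather than the actual m4. It assumes that one configure check treats every --with-hwloc value as an install prefix, so the keyword "internal" ends up being probed as internal/lib and internal/lib64, which is what the config.log excerpt quoted further down in this thread reports. The function names and the keyword set are hypothetical and chosen only for illustration; this is not the code from either configure script.

```python
import os

# Values that select a bundled or system copy instead of naming a directory.
# Illustrative set only; "internal" is the value used in this thread.
KEYWORDS = {"internal", "external"}


def resolve_hwloc_naive(value):
    """Treat every value as an install prefix (the behaviour that fails)."""
    for subdir in ("lib", "lib64"):
        path = os.path.join(value, subdir)
        if os.path.isdir(path):
            return ("path", path)
    raise FileNotFoundError(
        "Could not find {0}/lib or {0}/lib64".format(value)
    )


def resolve_hwloc_keyword_aware(value):
    """Handle keywords first, then fall back to the directory check."""
    if value in KEYWORDS:
        return ("keyword", value)
    return resolve_hwloc_naive(value)
```

With this sketch, resolve_hwloc_naive("internal") fails with the same wording as the log, while resolve_hwloc_keyword_aware("internal") accepts the keyword and never touches the filesystem.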
Post by Ralph H Castain
Hi Siegmar
I honestly have no idea - for some reason, the PMIx component isn’t seeing the internal hwloc code in your environment.
Jeff, Brice - any ideas?
Hi Ralph,
how can I confirm that HWLOC built? Some hwloc files are available
in the built directory.
loki openmpi-master-201809290304-73075b8-Linux.x86_64.64_gcc 111 find . -name '*hwloc*'
./opal/mca/btl/usnic/.deps/btl_usnic_hwloc.Plo
./opal/mca/hwloc
./opal/mca/hwloc/external/.deps/hwloc_external_component.Plo
./opal/mca/hwloc/base/hwloc_base_frame.lo
./opal/mca/hwloc/base/.deps/hwloc_base_dt.Plo
./opal/mca/hwloc/base/.deps/hwloc_base_maffinity.Plo
./opal/mca/hwloc/base/.deps/hwloc_base_frame.Plo
./opal/mca/hwloc/base/.deps/hwloc_base_util.Plo
./opal/mca/hwloc/base/hwloc_base_dt.lo
./opal/mca/hwloc/base/hwloc_base_util.lo
./opal/mca/hwloc/base/hwloc_base_maffinity.lo
./opal/mca/hwloc/base/.libs/hwloc_base_util.o
./opal/mca/hwloc/base/.libs/hwloc_base_dt.o
./opal/mca/hwloc/base/.libs/hwloc_base_maffinity.o
./opal/mca/hwloc/base/.libs/hwloc_base_frame.o
./opal/mca/hwloc/.libs/libmca_hwloc.la
./opal/mca/hwloc/.libs/libmca_hwloc.a
./opal/mca/hwloc/libmca_hwloc.la
./opal/mca/hwloc/hwloc201
./opal/mca/hwloc/hwloc201/.deps/hwloc201_component.Plo
./opal/mca/hwloc/hwloc201/hwloc201_component.lo
./opal/mca/hwloc/hwloc201/hwloc
./opal/mca/hwloc/hwloc201/hwloc/include/hwloc
./opal/mca/hwloc/hwloc201/hwloc/hwloc
./opal/mca/hwloc/hwloc201/hwloc/hwloc/libhwloc_embedded.la
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_pci_la-topology-pci.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_gl_la-topology-gl.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_cuda_la-topology-cuda.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_xml_libxml_la-topology-xml-libxml.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_opencl_la-topology-opencl.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.deps/hwloc_nvml_la-topology-nvml.Plo
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.libs/libhwloc_embedded.la
./opal/mca/hwloc/hwloc201/hwloc/hwloc/.libs/libhwloc_embedded.a
./opal/mca/hwloc/hwloc201/.libs/hwloc201_component.o
./opal/mca/hwloc/hwloc201/.libs/libmca_hwloc_hwloc201.la
./opal/mca/hwloc/hwloc201/.libs/libmca_hwloc_hwloc201.a
./opal/mca/hwloc/hwloc201/libmca_hwloc_hwloc201.la
./orte/mca/rtc/hwloc
./orte/mca/rtc/hwloc/rtc_hwloc.lo
./orte/mca/rtc/hwloc/.deps/rtc_hwloc.Plo
./orte/mca/rtc/hwloc/.deps/rtc_hwloc_component.Plo
./orte/mca/rtc/hwloc/mca_rtc_hwloc.la
./orte/mca/rtc/hwloc/.libs/mca_rtc_hwloc.so
./orte/mca/rtc/hwloc/.libs/mca_rtc_hwloc.la
./orte/mca/rtc/hwloc/.libs/rtc_hwloc.o
./orte/mca/rtc/hwloc/.libs/rtc_hwloc_component.o
./orte/mca/rtc/hwloc/.libs/mca_rtc_hwloc.soT
./orte/mca/rtc/hwloc/.libs/mca_rtc_hwloc.lai
./orte/mca/rtc/hwloc/rtc_hwloc_component.lo
loki openmpi-master-201809290304-73075b8-Linux.x86_64.64_gcc 112
And some files are available in the install directory.
loki openmpi-master_64_gcc 116 find . -name '*hwloc*'
./share/openmpi/help-orte-rtc-hwloc.txt
./share/openmpi/help-opal-hwloc-base.txt
./lib64/openmpi/mca_rtc_hwloc.so
./lib64/openmpi/mca_rtc_hwloc.la
loki openmpi-master_64_gcc 117
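A rough way to see which frameworks have installed plugin components, sketched in Python. It assumes that dynamically built components follow the mca_<framework>_<component>.so naming visible in the listing above (mca_rtc_hwloc.so) and that they live in one directory such as lib64/openmpi. The helper name is hypothetical. Components that are linked statically into the main libraries will not appear, so a missing framework here is a hint, not proof.

```python
import os
from collections import defaultdict


def components_by_framework(component_dir):
    """Group mca_<framework>_<component>.so files by framework name."""
    found = defaultdict(list)
    for name in sorted(os.listdir(component_dir)):
        if not (name.startswith("mca_") and name.endswith(".so")):
            continue
        # Strip the "mca_" prefix and ".so" suffix, then split once.
        stem = name[len("mca_"):-len(".so")]
        framework, sep, component = stem.partition("_")
        if sep:
            found[framework].append(component)
    return dict(found)
```

Calling components_by_framework on the install's component directory and checking whether the result has a "pmix" key would show whether any pmix plugin was installed at all.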
I don't see any unresolved libraries, so the only available
hwloc library should work.
loki openmpi 126 ldd -v mca_rtc_hwloc.so
linux-vdso.so.1 (0x00007ffd2df5b000)
libopen-rte.so.0 => /usr/local/openmpi-master_64_gcc/lib64/libopen-rte.so.0 (0x00007f082b7fb000)
libopen-pal.so.0 => /usr/local/openmpi-master_64_gcc/lib64/libopen-pal.so.0 (0x00007f082b493000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f082b28f000)
libudev.so.1 => /usr/lib64/libudev.so.1 (0x00007f082b06e000)
libpciaccess.so.0 => /usr/lib64/libpciaccess.so.0 (0x00007f082ae64000)
librt.so.1 => /lib64/librt.so.1 (0x00007f082ac5c000)
libm.so.6 => /lib64/libm.so.6 (0x00007f082a95f000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007f082a75c000)
libz.so.1 => /lib64/libz.so.1 (0x00007f082a546000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f082a329000)
libc.so.6 => /lib64/libc.so.6 (0x00007f0829f84000)
libgcc_s.so.1 => /usr/local/gcc-8.2.0/lib64/libgcc_s.so.1 (0x00007f0829d6c000)
/lib64/ld-linux-x86-64.so.2 (0x00007f082bd24000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f0829b46000)
libcap.so.2 => /lib64/libcap.so.2 (0x00007f0829941000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f082972a000)
libpcre.so.1 => /usr/lib64/libpcre.so.1 (0x00007f08294bb000)
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libz.so.1 (ZLIB_1.2.0) => /lib64/libz.so.1
libpthread.so.0 (GLIBC_2.3.2) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
librt.so.1 (GLIBC_2.2.5) => /lib64/librt.so.1
libgcc_s.so.1 (GCC_3.0) => /usr/local/gcc-8.2.0/lib64/libgcc_s.so.1
libgcc_s.so.1 (GCC_3.3.1) => /usr/local/gcc-8.2.0/lib64/libgcc_s.so.1
libdl.so.2 (GLIBC_2.2.5) => /lib64/libdl.so.2
libutil.so.1 (GLIBC_2.2.5) => /lib64/libutil.so.1
libudev.so.1 (LIBUDEV_183) => /usr/lib64/libudev.so.1
libm.so.6 (GLIBC_2.2.5) => /lib64/libm.so.6
libpthread.so.0 (GLIBC_2.3.4) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.3.2) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.6) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
librt.so.1 (GLIBC_2.2.5) => /lib64/librt.so.1
ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.9) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.16) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libpthread.so.0 (GLIBC_2.3.2) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_PRIVATE) => /lib64/libpthread.so.0
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libdl.so.2 (GLIBC_2.2.5) => /lib64/libdl.so.2
ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.7) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.8) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
libpthread.so.0 (GLIBC_2.2.5) => /lib64/libpthread.so.0
libc.so.6 (GLIBC_2.14) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.4) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
loki openmpi 127
Hopefully that helps to find the problem. I will answer your emails
tomorrow if you need anything else.
Best regards
Siegmar
Post by Ralph H Castain
configure:3383: === HWLOC
configure:36189: checking for hwloc in
configure:36201: result: Could not find internal/lib or internal/lib64
configure:36203: error: Can not continue
Can you confirm that HWLOC built? I believe we require it, but perhaps something is different about this environment.
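For anyone who wants to pull the same excerpt out of their own config.log, here is a small Python sketch. It assumes the log contains a "=== HWLOC" banner like the one above and that the failing check ends with an "error:" line. The function name is hypothetical, and the sample text is just the four lines quoted in this message.

```python
SAMPLE_LOG = """\
configure:3383: === HWLOC
configure:36189: checking for hwloc in
configure:36201: result: Could not find internal/lib or internal/lib64
configure:36203: error: Can not continue
"""


def hwloc_section(log_text):
    """Return lines from the '=== HWLOC' banner up to the first error line."""
    section = []
    in_section = False
    for line in log_text.splitlines():
        if "=== HWLOC" in line:
            in_section = True
        if in_section:
            section.append(line)
            # Stop once configure reports the fatal error.
            if "error:" in line:
                break
    return section


if __name__ == "__main__":
    for line in hwloc_section(SAMPLE_LOG):
        print(line)
```

To use it on a real file, read the log with open(path).read() and pass the text in; if the banner is absent the function returns an empty list.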
Post by Ralph H Castain
Looks like PMIx failed to build - can you send the config.log?
Post by Siegmar Gross
Hi,
yesterday I've installed openmpi-v4.0.x-201809290241-a7e275c and
openmpi-master-201805080348-b39bbfb on my "SUSE Linux Enterprise Server
12.3 (x86_64)" with Sun C 5.15, gcc 6.4.0, Intel icc 18.0.3, and Portland
Group pgcc 18.4-0. Unfortunately, I get the following error for all seven
installed versions (Sun C couldn't build master as I mentioned in another
email).
loki hello_1 118 mpiexec -np 4 --host loki:2,nfs2:2 hello_1_mpi
[loki:11423] [[45859,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v4.0.x-201809290241-a7e275c/orte/mca/ess/hnp/ess_hnp_module.c at line 321
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
opal_pmix_base_select failed
--> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
loki hello_1 119
I would be grateful if somebody could fix the problem. Do you need anything
else? Thank you very much in advance for any help.
Kind regards
Siegmar
--
Jeff Squyres