Discussion:
[OMPI users] openmpi/slurm/pmix
Michael Di Domenico
2018-04-23 19:01:20 UTC
i'm trying to get slurm 17.11.5 and openmpi 3.0.1 working with pmix.

everything compiled, but when i run something i get

: symbol lookup error: /openmpi/mca_pmix_pmix2x.so: undefined symbol: opal_libevent2022_evthread_use_pthreads

i'm more than sure i did something wrong, but i'm not sure what. here's what i did:

compile libevent 2.1.8

./configure --prefix=/libevent-2.1.8

compile pmix 2.1.0

./configure --prefix=/pmix-2.1.0 --with-psm2 \
    --with-munge=/munge-0.5.13 --with-libevent=/libevent-2.1.8

compile openmpi

./configure --prefix=/openmpi-3.0.1 --with-slurm=/slurm-17.11.5 \
    --with-hwloc=external --with-mxm=/opt/mellanox/mxm \
    --with-cuda=/usr/local/cuda --with-pmix=/pmix-2.1.0 \
    --with-libevent=/libevent-2.1.8

when i look at the symbols in the mca_pmix_pmix2x.so library the
function is indeed undefined (U) in the output, but checking ldd
against the library doesn't show anything missing

any thoughts?
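
The symbol check described in this post can be reproduced roughly as follows. This is only a sketch: `undefined_symbols` is a hypothetical helper name, and it assumes the usual `nm -D` output layout in which undefined symbols appear as a `U` followed by the symbol name.

```shell
# Filter `nm -D` output (read from stdin) down to undefined symbols
# whose name matches the pattern given as the first argument.
# undefined_symbols is a hypothetical helper, not part of any tool.
undefined_symbols() {
    awk -v pat="$1" '$1 == "U" && $2 ~ pat { print $2 }'
}
```

It could be used as `nm -D /openmpi/mca_pmix_pmix2x.so | undefined_symbols libevent`. A symbol listed here has to be provided by some other library at load time. `ldd` only reports shared libraries that cannot be found, so it can look clean even when none of the libraries that were found defines the symbol.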
r***@open-mpi.org
2018-04-23 22:07:00 UTC
Hi Michael

Looks like the problem is that you didn’t wind up with the external PMIx. The component listed in your error is the internal PMIx one, which shouldn’t have been built given that configure line.

Check your config.out and see what happened. Also, ensure that your LD_LIBRARY_PATH is properly pointing to the installation, and that you built into a “clean” prefix.
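
One way to see which PMIx components actually ended up in the install is to list them in the prefix. The sketch below is an illustration only: `check_pmix_components` is a hypothetical helper, and it assumes the components are installed under `<prefix>/lib/openmpi` or `<prefix>/lib64/openmpi`. The internal component name, `mca_pmix_pmix2x.so`, is the one from the error message in this thread.

```shell
# List PMIx components under an Open MPI prefix and flag the internal one.
# Assumption: components are in <prefix>/lib/openmpi or <prefix>/lib64/openmpi.
# check_pmix_components is a hypothetical helper name.
check_pmix_components() {
    prefix=$1
    found_internal=0
    for dir in "$prefix/lib/openmpi" "$prefix/lib64/openmpi"; do
        [ -d "$dir" ] || continue
        for so in "$dir"/mca_pmix_*.so; do
            # Skip the unexpanded glob when nothing matches.
            [ -e "$so" ] || continue
            echo "$so"
            case $so in
                */mca_pmix_pmix2x.so) found_internal=1 ;;
            esac
        done
    done
    if [ "$found_internal" -eq 1 ]; then
        echo "internal PMIx component present"
        return 1
    fi
    return 0
}
```

For example, `check_pmix_components /openmpi-3.0.1` returns non-zero when the internal component is present in the prefix. Running `ompi_info | grep pmix` against the same install should give a similar picture.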
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
Charles A Taylor
2018-04-24 10:05:21 UTC
I’ll add that when building OpenMPI 3.0.0 with an external PMIx, I found that the OpenMPI configure script only looks in “lib” for the pmix library, but the pmix configure/build uses “lib64” (as it should on a 64-bit system). As a result, the configure script falls back to the internal PMIx. As Robert suggested, check your config.log for “not found” messages.

In my case, I simply added a “lib -> lib64” symlink in the PMIx installation directory rather than alter the configure script and that did the trick.

Good luck,

Charlie
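
The “lib -> lib64” symlink workaround described above might look like the following sketch. `link_lib_to_lib64` is a hypothetical helper name. It only creates the link when `lib64` exists and `lib` does not, so an existing `lib` directory is left alone.

```shell
# Create a relative "lib -> lib64" symlink inside an install prefix.
# link_lib_to_lib64 is a hypothetical helper name.
link_lib_to_lib64() {
    prefix=$1
    if [ -d "$prefix/lib64" ] && [ ! -e "$prefix/lib" ]; then
        # A relative target keeps the link valid if the prefix is moved.
        ln -s lib64 "$prefix/lib"
    fi
}
```

With the paths used in this thread, that would be `link_lib_to_lib64 /pmix-2.1.0`, run before configuring OpenMPI.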
g***@rist.or.jp
2018-04-24 12:07:38 UTC
Charles,

have you tried configuring with --with-pmix-libdir=/.../lib64 ?

Cheers,

Gilles

Charles A Taylor
2018-04-24 12:21:02 UTC
Hi Gilles,

Yes, I did. It was ignored AFAICT. I did not look for the reason - only so many hours in the day.

Regards,

Charlie
Michael Di Domenico
2018-04-25 15:16:48 UTC
the "clean prefix" part seemed to fix my issue. i'm not exactly sure
i understand why/how though. i recompiled pmix and removed the old
installation before doing a make install

when i recompiled openmpi it seems to have figured itself out

i think things are still a little wonky, but at least that issue is gone
r***@open-mpi.org
2018-04-25 15:38:31 UTC
When you build, we don’t automatically purge the prefix location of any prior libraries. Thus, the old install of the internal PMIx library was still present. It has a higher priority than the external components, and so it was being picked up and used.

Starting clean removed it, leaving the external component to be selected.
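
A cautious way to get a clean prefix before reinstalling is to move the old install aside rather than delete it. This is a minimal sketch: `stash_old_prefix` and the `.old.<timestamp>` backup naming are assumptions made for illustration, not anything Open MPI provides.

```shell
# Move an existing install prefix aside so `make install` starts clean.
# stash_old_prefix and the backup naming are hypothetical.
stash_old_prefix() {
    prefix=$1
    if [ -z "$prefix" ]; then
        echo "usage: stash_old_prefix PREFIX" >&2
        return 2
    fi
    if [ -e "$prefix" ]; then
        backup="$prefix.old.$(date +%Y%m%d%H%M%S)"
        mv "$prefix" "$backup"
        # Print where the old install went so it can be removed later.
        echo "$backup"
    fi
}
```

With the prefix from this thread, that would be `stash_old_prefix /openmpi-3.0.1 && make install`. Once the new build works, the backup directory can be deleted.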