Discussion:
[OMPI users] Old version openmpi 1.2 support infiniband?
Kaiming Ouyang
2018-03-20 01:29:18 UTC
Hi everyone,
Recently I needed to compile the High-Performance Linpack (HPL) code with Open MPI 1.2 (a rather old version). Compilation succeeds, but when I try to run, I get the following errors:

[test:32058] *** Process received signal ***
[test:32058] Signal: Segmentation fault (11)
[test:32058] Signal code: Address not mapped (1)
[test:32058] Failing at address: 0x14a2b84b6304
[test:32058] [ 0] /lib64/libpthread.so.0(+0xf5e0) [0x14eb116295e0]
[test:32058] [ 1] /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0x28a) [0x14eaa81258aa]
[test:32058] [ 2] /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x2b) [0x14eaa853219b]
[test:32058] [ 3] /root/research/lib/openmpi-1.2.9/lib/libopen-pal.so.0(opal_progress+0x4a) [0x14eb128dbaaa]
[test:32058] [ 4] /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_wait+0x1d) [0x14eaf41e6b4d]
[test:32058] [ 5] /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x3a5) [0x14eaf41eac45]
[test:32058] [ 6] /root/research/lib/openmpi-1.2.9/lib/libopen-rte.so.0(mca_oob_recv_packed+0x33) [0x14eb12b62223]
[test:32058] [ 7] /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_gpr_proxy.so(orte_gpr_proxy_put+0x1f9) [0x14eaf3dd7db9]
[test:32058] [ 8] /root/research/lib/openmpi-1.2.9/lib/libopen-rte.so.0(orte_smr_base_set_proc_state+0x31d) [0x14eb12b7893d]
[test:32058] [ 9] /root/research/lib/openmpi-1.2.9/lib/libmpi.so.0(ompi_mpi_init+0x8d6) [0x14eb13202136]
[test:32058] [10] /root/research/lib/openmpi-1.2.9/lib/libmpi.so.0(MPI_Init+0x6a) [0x14eb1322461a]
[test:32058] [11] ./xhpl(main+0x5d) [0x404e7d]
[test:32058] [12] /lib64/libc.so.6(__libc_start_main+0xf5) [0x14eb11278c05]
[test:32058] [13] ./xhpl() [0x4056cb]
[test:32058] *** End of error message ***
mpirun noticed that job rank 0 with PID 31481 on node test.novalocal exited on signal 15 (Terminated).
23 additional processes aborted (not shown)

The machine has InfiniBand, so I suspect that Open MPI 1.2 may not support InfiniBand by default. I also tried running without InfiniBand, but then the program can only handle small input sizes; when I increase the input size and the process grid size, it just hangs. The program I am running is a benchmark, so I don't think the problem is in its code. Any ideas? Thanks.
Jeff Squyres (jsquyres)
2018-03-20 02:35:40 UTC
That's actually failing in a shared memory section of the code.

But to answer your question, yes, Open MPI 1.2 did have IB support.

That being said, I have no idea what would cause this shared memory segv -- it's quite possible that it's simple bit rot (i.e., v1.2.9 was released 9 years ago -- see https://www.open-mpi.org/software/ompi/versions/timeline.php. Perhaps it does not function correctly on modern glibc/Linux kernel-based platforms).

Can you upgrade to a [much] newer Open MPI?
--
Jeff Squyres
***@cisco.com
Kaiming Ouyang
2018-03-20 02:59:50 UTC
Hi Jeff,
Thank you for your reply. I just switched to another cluster that does not have InfiniBand. I ran HPL with:
mpirun --mca btl tcp,self -np 144 --hostfile /root/research/hostfile ./xhpl

It ran successfully, but if I drop "--mca btl tcp,self", it no longer runs. So I suspect that Open MPI 1.2 cannot identify the proper network interfaces and set the correct parameters for them.
Then I went back to the previous cluster, the one with InfiniBand, and typed the same command as above. It hangs forever.

I changed the command to:
mpirun --mca btl_tcp_if_include ib0 --hostfile /root/research/hostfile-ib -np 48 ./xhpl

It launches successfully, but gives the following errors when HPL tries to split the communicator:

[node1.novalocal:09562] *** An error occurred in MPI_Comm_split
[node1.novalocal:09562] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[node1.novalocal:09562] *** MPI_ERR_IN_STATUS: error code in status
[node1.novalocal:09562] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1.novalocal:09583] *** An error occurred in MPI_Comm_split
[node1.novalocal:09583] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[node1.novalocal:09583] *** MPI_ERR_IN_STATUS: error code in status
[node1.novalocal:09583] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1.novalocal:09637] *** An error occurred in MPI_Comm_split
[node1.novalocal:09637] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[node1.novalocal:09637] *** MPI_ERR_IN_STATUS: error code in status
[node1.novalocal:09637] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1.novalocal:09994] *** An error occurred in MPI_Comm_split
[node1.novalocal:09994] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[node1.novalocal:09994] *** MPI_ERR_IN_STATUS: error code in status
[node1.novalocal:09994] *** MPI_ERRORS_ARE_FATAL (goodbye)
mpirun noticed that job rank 0 with PID 46005 on node test-ib exited on signal 15 (Terminated).

Hope you can give me some suggestions. Thank you.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521
Jeff Squyres (jsquyres)
2018-03-20 03:39:52 UTC
I'm sorry; I can't help debug a version from 9 years ago. The best suggestion I have is to use a modern version of Open MPI.

Note, however, your use of "--mca btl ..." is going to have the same meaning for all versions of Open MPI. The problem you showed in the first mail was with the shared memory transport. Using "--mca btl tcp,self" means you're not using the shared memory transport. If you don't specify "--mca btl tcp,self", Open MPI will automatically use the shared memory transport. Hence, you could be running into the same (or similar/related) problem that you mentioned in the first mail -- i.e., something is going wrong with how the v1.2.9 shared memory transport is interacting with your system.

Likewise, "--mca btl_tcp_if_include ib0" tells the TCP BTL plugin to use the "ib0" network. But if you have the openib BTL available (i.e., the IB-native plug), that will be used instead of the TCP BTL because native verbs over IB performs much better than TCP over IB. Meaning: if you specify btl_Tcp_if_include without specifying "--mca btl tcp,self", then (assuming openib is available) the TCP BTL likely isn't used and the btl_tcp_if_include value is therefore ignored.

Also, what version of Linpack are you using? The error you show is usually indicative of an MPI application bug (the MPI_COMM_SPLIT error). If you're running an old version of xhpl, you should upgrade to the latest.
--
Jeff Squyres
***@cisco.com
Kaiming Ouyang
2018-03-20 04:32:49 UTC
Thank you.
I am using the newest version of HPL.
I forgot to mention that I can run HPL with Open MPI 3.0 over InfiniBand. The reason I want to use the old version is that I need to compile a library that only supports old versions of Open MPI, so I am trying to pull off this tricky job. Anyway, thank you for your reply, Jeff; have a good day.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521
Jeff Squyres (jsquyres)
2018-03-20 11:35:55 UTC
Gotcha.

Is there something in particular about the old library that requires Open MPI v1.2.x?

More specifically: is there a particular error you get when you try to use Open MPI v3.0.0 with that library?

I ask because if the app supports the MPI API in Open MPI v1.2.9, then it also supports the MPI API in Open MPI v3.0.0. We *have* changed lots of other things under the covers in that time, such as:

- how those MPI API's are implemented
- mpirun (and friends) command line parameters
- MCA parameters
- compilation flags

But many of those things might actually be mostly -- if not entirely -- hidden from a library that uses MPI.

My point: it may be easier to get your library to use a newer version of Open MPI than you think. For example, if the library has some hard-coded flags in their configure/Makefile to build with Open MPI, just replace those flags with `mpicc --showme:BLAH` variants (see `mpicc --showme:help` for a full listing). This will have Open MPI tell you exactly what flags it needs to compile, link, etc.
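For example (these are standard Open MPI wrapper-compiler options; the exact output naturally depends on the install):

mpicc --showme:compile   # preprocessor/compile flags needed for this Open MPI
mpicc --showme:link      # linker flags and libraries needed for this Open MPI
mpicc --showme           # the full underlying compiler command line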
--
Jeff Squyres
***@cisco.com
Kaiming Ouyang
2018-03-20 16:48:49 UTC
I think the problem is that the library only deals with the old framework, because it intercepts MPI calls and does some profiling. Here is the library:
https://github.com/LLNL/Adagio

I checked the Open MPI changelog. Starting with Open MPI 1.3 it switched to a new framework, and Open MPI 1.4+ changed it again. This library only works under Open MPI 1.2.
Thank you for your advice; I will try it. My current problem is that this library seems to patch the mpi.h file, but the patching fails for newer versions of Open MPI. I don't know the reason yet and will check it soon. Thank you.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521
John Hearns via users
2018-03-20 17:46:15 UTC
"It does not handle more recent improvements such as Intel's turbo
mode and the processor performance inhomogeneity that comes with it."
I guess it is easy enough to disable Turbo mode in the BIOS though.
Kaiming Ouyang
2018-03-20 18:34:15 UTC
Hi John,
Thank you for your advice, but that only concerns the library's functionality; right now my problem is that it cannot compile against a new version of Open MPI. The cause may be its patch file, since it needs to intercept MPI calls to profile some data. Newer versions of Open MPI may have changed the framework so that this old software no longer fits.


Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521


John Hearns via users
2018-03-21 07:23:07 UTC
Kaiming, good luck with your project. I think you should contact Barry Rountree directly; you will probably get good advice!

It is worth saying that with Turbo Boost there is variation between individual CPU dies, even within the same SKU.
What Turbo Boost does is set a thermal envelope, and the CPU core(s) ramp up in frequency until the thermal limit is reached.
So each CPU die is slightly different (*).
Indeed, in my last job we had a benchmarking exercise where the instruction was to explicitly turn off Turbo Boost.


(*) As I work at ASML I really should understand this better... I really
should.
Jeff Squyres (jsquyres)
2018-03-21 12:24:11 UTC
You might want to take that library author's advice from their README:

-----
The source code herein was used as the basis of Rountree ICS 2009. It
was my first nontrivial MPI tool and was never intended to be released
to the wider world. I beleive it was tied rather tightly to a subset
of a (now) old MPI implementation. I expect a nontrivial amount of
work would have to be done to get this to compile and run again, and
that effort would probably be better served starting from scratch
(using Todd Gamblin's wrap.py PMPI shim generator, for example).
-----
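For reference, a minimal PMPI-style shim of the kind that README alludes to might look like the sketch below (illustrative only, not code from this thread; the MPI_Send prototype shown matches MPI-3, i.e., Open MPI v3.0, and note that no patching of mpi.h is required):

/* shim.c: intercept MPI calls via the standard MPI profiling (PMPI) interface */
#include <mpi.h>
#include <stdio.h>

int MPI_Init(int *argc, char ***argv)
{
    int rc = PMPI_Init(argc, argv);        /* call the real MPI_Init */
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        printf("PMPI shim active\n");
    return rc;
}

int MPI_Send(const void *buf, int count, MPI_Datatype dt,
             int dest, int tag, MPI_Comm comm)
{
    /* profiling/measurement logic would go here */
    return PMPI_Send(buf, count, dt, dest, tag, comm);
}

Built with something like "mpicc -shared -fPIC shim.c -o libshim.so" and loaded via LD_PRELOAD (or linked ahead of the application), it intercepts the named calls without touching the MPI headers.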
--
Jeff Squyres
***@cisco.com
Kaiming Ouyang
2018-03-21 16:27:17 UTC
Hi Jeff,
Thank you for your advice. I will contact the author for suggestions. I also realize I may be able to port this old library to the new Open MPI 3.0; I will work on that soon. Thank you.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521
Dave Love
2018-03-29 11:04:56 UTC
I haven't used them, but at least the profiling part, and possibly the control part, should be covered by plugins at <https://github.com/score-p/>.
(Score-P is the replacement for the VampirTrace instrumentation that was included with Open MPI until recently; I think the VampirTrace plugin interface is compatible with Score-P's.)
