Discussion:
[OMPI users] How to Build OpenMPI to support FDR over SR-IOV
Pharthiphan Asokan
2018-03-13 15:30:24 UTC
Hi,

I'm using Mellanox 56G FDR InfiniBand with SR-IOV under KVM virtualization, and I want to use RDMA to communicate between VMs over the FDR Virtual Function.


* Operating system/version: CentOS 7.3
* Computer hardware: KVM Virtualization
* Network type: 56G FDR -- Virtual Function
* Open MPI version: 3.0.0

Building Open MPI

wget https://www.open-mpi.org/software/ompi/v3.0/downloads/openmpi-3.0.0.tar.gz
tar -zxf openmpi-3.0.0.tar.gz
mv openmpi-3.0.0 openmpi-3.0.0-src
mkdir openmpi-3.0.0
./configure --prefix=/mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0
make all install
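
The ORTE error shown later in this post names `--enable-orterun-prefix-by-default` as one remedy when the daemons cannot be found on remote nodes. A minimal sketch of a build invocation that includes that flag, run from inside the renamed source directory, might look like the following. The `SRC_DIR` and `PREFIX` variables and the `print_configure_cmd` helper are illustrative names, not part of Open MPI, and the relative source path is an assumption based on the `mv` step above. The helper only prints the command so it can be reviewed before it is run.

```shell
#!/usr/bin/env bash
# Illustrative values taken from the post; adjust for your site.
SRC_DIR=openmpi-3.0.0-src
PREFIX=/mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0

# Hypothetical helper: print the build commands instead of running them.
# The configure flag is the one suggested by the ORTE error text.
print_configure_cmd() {
    local src_dir=$1 prefix=$2
    printf 'cd %s && ./configure --prefix=%s --enable-orterun-prefix-by-default && make all install\n' \
        "$src_dir" "$prefix"
}

print_configure_cmd "$SRC_DIR" "$PREFIX"
```

The install prefix has to be reachable at the same path on every node (here it sits on a shared Lustre mount), otherwise the flag alone will not help.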

Exporting OpenMPI Variables

# export PATH=/mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0/bin:$PATH
# export LD_LIBRARY_PATH=/mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0/lib:$LD_LIBRARY_PATH
# export INCLUDE=/mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0/include:$INCLUDE

# which mpirun
/mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0/bin/mpirun
# which mpicc
/mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0/bin/mpicc

# cd /mnt/lustre_client/pasokan/
# mpicc /mnt/lustre_client/pasokan/mpi_hello_world.c

# ./a.out
--------------------------------------------------------------------------
WARNING: No preset parameters were found for the device that Open MPI
detected:

Local host: vcn03
Device name: mlx5_0
Device vendor ID: 0x02c9
Device vendor part ID: 4114

Default device parameters will be used, which may result in lower
performance. You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your
device.

NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_no_device_params_found to 0.
--------------------------------------------------------------------------
[vcn03][[34710,1],0][connect/btl_openib_connect_udcm.c:1235:udcm_rc_qp_to_rtr] error modifing QP to RTR errno says Invalid argument
Hello world from processor vcn03, rank 0 out of 1 processors
#
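
The warning itself names the MCA parameter `btl_openib_warn_no_device_params_found`. Open MPI also reads MCA parameters from environment variables prefixed with `OMPI_MCA_`, so one way to set it is sketched below. The sketch also restricts the run to the `self`, `vader` (shared memory) and `tcp` BTLs, purely as a diagnostic to see whether the "QP to RTR" error comes from the openib BTL. The `set_mca` helper is a hypothetical name. Silencing the warning does not address the QP error.

```shell
#!/usr/bin/env bash
# Hypothetical helper: export an MCA parameter through the environment.
# Open MPI picks up variables named OMPI_MCA_<parameter>.
set_mca() {
    local name=$1 value=$2
    export "OMPI_MCA_${name}=${value}"
}

# Parameter named in the warning text; 0 turns the warning off.
set_mca btl_openib_warn_no_device_params_found 0

# Diagnostic only: bypass the openib BTL to see whether the QP error
# disappears. This gives up RDMA, so it is not a fix.
set_mca btl self,vader,tcp

# Show what a subsequently launched MPI program would inherit.
env | grep '^OMPI_MCA_'
```

The same settings can be passed per run on the command line, for example `mpirun --mca btl self,vader,tcp ./a.out`.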

Compiling IOR using Open MPI on SR-IOV

[***@vcn03 IOR-July12]# cd src/C
[***@vcn03 C]# gmake posix mpiio
mpicc -o IOR IOR.o utilities.o parse_options.o \
aiori-POSIX.o aiori-noMPIIO.o aiori-noHDF5.o aiori-noNCMPI.o \
-lm
mpicc -o IOR IOR.o utilities.o parse_options.o \
aiori-POSIX.o aiori-MPIIO.o aiori-noHDF5.o aiori-noNCMPI.o \
-lm
[***@vcn03 C]# ./IOR
--------------------------------------------------------------------------
WARNING: No preset parameters were found for the device that Open MPI
detected:

Local host: vcn03
Device name: mlx5_0
Device vendor ID: 0x02c9
Device vendor part ID: 4114

Default device parameters will be used, which may result in lower
performance. You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your
device.

NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_no_device_params_found to 0.
--------------------------------------------------------------------------
[vcn03][[34753,1],0][connect/btl_openib_connect_udcm.c:1235:udcm_rc_qp_to_rtr] error modifing QP to RTR errno says Invalid argument
Segmentation fault
[***@vcn03 C]#


[***@vcn03 C]# mpirun --allow-run-as-root -np 2 -host vcn03,vcn04 hostname
bash: orted: command not found
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:

* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default

* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.

* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.

* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.

* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
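
The `bash: orted: command not found` line is printed by the shell on the remote node. The `export` lines earlier only changed the environment of the interactive shell on vcn03, and a non-interactive SSH session on vcn04 most likely does not see them. A small sketch for checking whether a PATH-style directory list contains `orted` follows. `find_in_path` is a hypothetical helper, and the commented commands assume the hosts and install prefix from this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (and print the match) if an executable
# named $1 exists in any directory of the colon-separated list $2.
find_in_path() {
    local cmd=$1 path_list=$2 dir
    local -a dirs
    IFS=: read -r -a dirs <<< "$path_list"
    for dir in "${dirs[@]}"; do
        if [ -x "${dir:-.}/$cmd" ]; then
            printf '%s\n' "${dir:-.}/$cmd"
            return 0
        fi
    done
    return 1
}

# Local check against the current PATH.
find_in_path orted "$PATH" || echo "orted not found in local PATH"

# Remote check over non-interactive SSH, which is how mpirun starts orted:
#   ssh vcn04 'command -v orted'
# If that prints nothing, the install prefix can be passed explicitly:
#   mpirun --prefix /mnt/lustre_client/pasokan/openmpi-3.0.0/openmpi-3.0.0 \
#          --allow-run-as-root -np 2 -host vcn03,vcn04 hostname
```

Setting `PATH` and `LD_LIBRARY_PATH` in a startup file that non-interactive shells read on every node is another option, as the first bullet of the error message suggests.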

Passwordless SSH between the two systems is configured:

[***@vcn03 C]# ssh vcn04
Last login: Mon Mar 12 12:03:42 2018 from vcn03
[***@vcn04 ~]# ssh vcn03
Last login: Tue Mar 13 01:56:46 2018 from pime6-01.ime.md.ddn.com
[***@vcn03 ~]#

Please help. I need the procedure to build Open MPI so that it supports FDR over SR-IOV with KVM.
Jeff Squyres (jsquyres)
2018-03-13 15:59:58 UTC
Pharthiphan --

No need to cross-post the same question in three places (GitHub issue, this list, and the devel list).

Let's keep the thread on the devel list, where the first parts of your questions have already been answered.

Thanks.
--
Jeff Squyres
***@cisco.com
