Discussion:
[OMPI users] About my GPU performance using Openmpi-2.0.4
Phanikumar Pentyala
2017-12-12 05:39:50 UTC
Permalink
Dear users and developers,

Currently I am using two Tesla K40m cards for my computational work with the
Quantum ESPRESSO (QE) suite http://www.quantum-espresso.org/. My GPU-enabled
QE build runs much slower than the normal (CPU-only) version. When I submit
my job on the GPU, it shows this warning: "A high-performance Open MPI
point-to-point messaging module was unable to find any relevant network
interfaces:

Module: OpenFabrics (openib)
Host: qmel

Another transport will be used instead, although this may result in
lower performance.

Is this the reason for the poor GPU performance?

I installed Open MPI as follows:

1. ./configure --prefix=/home/xxxx/software/openmpi-2.0.4
--disable-openib-dynamic-sl --disable-openib-udcm --disable-openib-rdmacm
(because we don't have any InfiniBand HCA in the server)

2. make all

3. make install
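For reference, once a build like the one above finishes, you can check which transports were actually compiled in; a quick sketch, assuming the install prefix from step 1:

```shell
# List the BTL (byte transfer layer) components that were built.
# If "openib" appears here even though the server has no InfiniBand
# HCA, Open MPI will still probe for it at startup and print the
# "unable to find any relevant network interfaces" warning.
/home/xxxx/software/openmpi-2.0.4/bin/ompi_info | grep btl
```

This only inspects the installed build; it does not change behavior.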

Please correct me if I made any mistake in my installation, or do I have to
use an InfiniBand adapter to use Open MPI?

I read a lot of posts on the Open MPI forum about removing the above warning
when submitting jobs. I added the flag "--mca btl ^openib"; the warning
vanished, but the performance stayed the same.
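The run-time workaround mentioned above looks like this (pw.x, the process count, and the input file are placeholders for the actual QE binary and job layout):

```shell
# Exclude the openib BTL at run time ("^" means "everything except").
# The shared-memory and TCP transports remain available, which is all
# a single node without InfiniBand needs.
mpirun --mca btl ^openib -np 4 pw.x -input job.in
```

Note that this only silences the transport probing; it does not by itself change GPU performance.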

Current details of the server:

Server: FUJITSU PRIMERGY RX2540 M2
CUDA version: 9.0
Open MPI version: 2.0.4, built with Intel MKL libraries
QE-GPU version (my application): 5.4.0

P.S: Extra information attached

Thanks in advance

Regards
Phanikumar
Research scholar
IIT Kharagpur
Kharagpur, West Bengal
India
Howard Pritchard
2017-12-13 15:01:28 UTC
Permalink
Hi Phanikumar

It's unlikely that the warning message you are seeing is related to GPU
performance. Have you tried adding

--with-verbs=no

to your configure line? That should quash the openib complaint.
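Put together with the paths from the original post, a configure line following this suggestion might look like the sketch below (not verified on this system; `--without-verbs` is the equivalent autoconf spelling of `--with-verbs=no`):

```shell
# Build Open MPI with no OpenFabrics/verbs support at all, so the
# openib BTL is never compiled and the warning cannot appear.
./configure --prefix=/home/xxxx/software/openmpi-2.0.4 --without-verbs
make all && make install
```
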

Howard
Mahmood Naderan
2017-12-13 17:04:52 UTC
Permalink
Post by Phanikumar Pentyala
Currently I am using two Tesla K40m cards for my computational work on
quantum espresso (QE) suit http://www.quantum-espresso.org/. My GPU
enabled QE code running very slower than normal version
Hi,
When I hear such words, I would say: yeah, it is quite natural!

My personal experience with a GPU (Quadro M2000) was actually a
failure and a loss of money. With the various models, configurations, and
vendors, it is very hard to determine whether a GPU product will really
boost performance, unless you sign a contract with them (i.e., pay them!)
and consult their experts to find a good product.

At the end of the day, I think companies put all the good features into
their high-end products (the multi-thousand-dollar ones). So I think the
K40m, which uses passive cooling, misses many good features,
although it has 12 GB of GDDR5.

I hope that in your case the slow run is a software issue. Those were
my thoughts only, and they may not be correct!

Regards,
Mahmood
Peter Kjellström
2017-12-14 08:22:15 UTC
Permalink
On Wed, 13 Dec 2017 20:34:52 +0330
Post by Mahmood Naderan
Post by Phanikumar Pentyala
Currently I am using two Tesla K40m cards for my computational work
on quantum espresso (QE) suit http://www.quantum-espresso.org/. My
GPU enabled QE code running very slower than normal version
Hi,
When I hear such words, I would say, yeah it is quite natural!
My personal experience with a GPU (Quadro M2000) was actually a
failure and loss of money. With various models, configs and companies,
it is very hard to determine if a GPU product really boosts the
performance
Agreed. GPU performance is not a given. It depends on the application,
version, input files, hardware, job geometry, ...
Post by Mahmood Naderan
At the end of the day, I think companies put all good features in
their high-end products (multi thousand dollar ones). So, I think the
K40m version, where it uses passive cooling, misses many good features
although it has 12GB of GDDR5.
The K40m is the very high end (of the previous generation, Kepler). The
only higher-specced GPU is the K80, which is essentially two slightly
less impressive K40s in one package.

As far as "passively cooled" goes: it's a server component, and the
server is expected to provide the needed airflow. The K40m is a high-TDP
part.

/Peter
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Phanikumar Pentyala
2017-12-14 03:37:38 UTC
Permalink
Thank you, Howard, for the suggestion.

I didn't try that option in configure. I will try it now. Are the
options "--disable-openib-dynamic-sl --disable-openib-udcm
--disable-openib-rdmacm" necessary when running configure?


Phanikumar