Discussion: [OMPI users] MPI advantages over PBS
Diego Avesani
2018-08-22 15:49:55 UTC
Dear all,

I have a philosophical question.

I am reading a lot of papers where people use the Portable Batch System (PBS) or
another job scheduler in order to parallelize their code.

What are the advantages of using MPI instead?

I am writing a report on my code, where of course I use Open MPI. So please
tell me how I can cite you; you deserve all the credit.

Thanks a lot,


Diego
Reuti
2018-08-25 19:28:17 UTC
Hi,
Post by Diego Avesani
I am reading a lot of papers where people use the Portable Batch System (PBS) or another job scheduler in order to parallelize their code.
To parallelize their code? I would phrase it more like: "Using a scheduler allows compiled applications to execute simultaneously in a cluster without overloading the nodes, whether they are serial or parallel applications, executed several at a time or one after the other as available cores permit."

Any batch scheduler (like PBS) and any parallel library (like MPI, in any implementation [like Open MPI]) don't compete; they cover different aspects of running jobs in a cluster.

There are even problems where you don't gain anything by parallelizing the code. Think of the task of rendering 5000 images, or applying certain effects to them: even if the code were perfectly parallel and cut the execution time in half with each doubling of the number of cores, the overall time to get the final result for all images would stay the same. Essentially, in such a situation you can execute several serial instances of an application at the same time in a cluster, which might be referred to as "running in parallel", though depending on the context such a statement can be ambiguous.

But if you need the result of the first image or computation to decide how to proceed, then it is advantageous to parallelize the application itself instead.

-- Reuti
John Hearns via users
2018-08-25 07:06:57 UTC
Diego,
I am sorry, but you are mixing up two different things here. PBS is a resource-allocation
system. It will reserve the use of a compute server, or several compute
servers, for you to run your parallel job on. PBS can also launch the MPI job;
there are several mechanisms for launching parallel jobs.
MPI is an API for parallel programming. I would rather say a library, but
if I'm not wrong, MPI is a standard for parallel programming and is
technically an API.

One piece of advice: you can run MPI programs directly from the
command line. So Google for 'Hello World MPI', write your first MPI program,
then use mpirun from the command line.
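For reference, the canonical C version of that program is only a few lines (a minimal sketch):

/* hello_mpi.c -- each MPI process reports its rank */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);               /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* id of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of processes */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();                       /* shut the runtime down */
    return 0;
}

Compile and run it with, e.g.:

mpicc hello_mpi.c -o hello_mpi
mpirun -np 4 ./hello_mpi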

If you have a cluster that runs the PBS batch system, you can then use PBS
to run your MPI program; a sketch of a submission script follows below.
If that is not clear, please let us know what help you need.
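For example, a minimal PBS submission script for the hello-world program above might look roughly like this (a sketch only; the node counts, walltime, and resource-request syntax are assumptions that vary from site to site):

#!/bin/bash
#PBS -N hello_mpi
#PBS -l nodes=2:ppn=4      # ask for 2 nodes with 4 cores each
#PBS -l walltime=00:05:00
cd $PBS_O_WORKDIR          # PBS starts the job in $HOME by default
mpirun ./hello_mpi         # Open MPI picks up the PBS allocation

You would submit it with qsub, and the scheduler decides when and where it actually runs.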
Gustavo Correa
2018-08-25 18:41:36 UTC
Hi Diego

I (still) have Torque/PBS version 4.something on old clusters.
[Most people at this point have already switched to Slurm.]

Torque/PBS comes with a tool named "pbsdsh" (for PBS distributed shell):

http://docs.adaptivecomputing.com/torque/4-1-3/Content/topics/commands/pbsdsh.htm
https://wikis.nyu.edu/display/NYUHPC/PBSDSH
http://www.ep.ph.bham.ac.uk/general/support/torquepbsdsh.html

"pbsdsh" is able to launch *serial* jobs across the nodes that you allocate for the job.
This allows you to run so called "embarrassingly parallel" tasks:

https://en.wikipedia.org/wiki/Embarrassingly_parallel

Embarrassingly parallel tasks (or programs) are those whose processes don't
need to communicate with each other (and therefore are feasible without MPI).

The link above lists some embarrassingly parallel tasks.
A trivial example would be, for instance, to transliterate all uppercase letters
to lowercase in a large number of text files, i.e. to run the "tr" command below

tr '[:upper:]' '[:lower:]' < input${i}.txt > output${i}.txt

where the file-name index ${i} would be distributed across the various processes.
"pbsdsh" can do it without MPI, because there is no need for one processor to
communicate with another to perform this.
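Inside a PBS job, that could look roughly like this (a sketch; Torque's pbsdsh sets PBS_VNODENUM to give each spawned task a unique index, and the helper script name convert.sh is a hypothetical choice). In the job script:

#PBS -l nodes=4:ppn=8
pbsdsh $PBS_O_WORKDIR/convert.sh

and in convert.sh:

#!/bin/sh
# PBS_VNODENUM is the per-task index that pbsdsh provides
i=$PBS_VNODENUM
tr '[:upper:]' '[:lower:]' < input${i}.txt > output${i}.txt

Each of the 32 spawned tasks converts its own file, with no inter-process communication at all.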

However, pbsdsh is by no means unique.
There are other tools, independent of PBS, which can do the same as pbsdsh, if not more.
"Pdsh" is one of them (probably the most versatile and popular):

https://github.com/chaos/pdsh
https://linux.die.net/man/1/pdsh
http://www.linux-magazine.com/Issues/2014/166/Parallel-Shells
https://www.rittmanmead.com/blog/2014/12/linux-cluster-sysadmin-parallel-command-execution-with-pdsh/


[Some examples in the links above are "parallel system administration tasks", some are user-level tasks.]

***

*However ... not all parallelizable problems are embarrassingly parallel!*
Actually, most are not.
A very large class of common problems in science and engineering
is to solve partial differential equations (PDEs) using finite differences (FD) or similar methods
(finite elements, finite volumes, pseudo-spectral, etc.) through domain decomposition.
This is a typical example of a problem that is parallelizable, but not embarrassingly parallel.
When solving PDEs with FD through domain decomposition, you have to exchange
the data on the sub-domain halos across the processors in charge of solving the
equation on each subdomain. This requires communication across the processors,
something that pbsdsh or pdsh cannot do, but MPI can (and so could the predecessors of
MPI: p4, PVM, etc.); a bare-bones sketch of such a halo exchange follows below.
This class of problems includes most of computational fluid dynamics, structural mechanics,
weather forecasting, climate/ocean/atmosphere modeling, geodynamics, etc.
Many problem-solving methods in molecular dynamics and computational chemistry are not
embarrassingly parallel either.
There are many other classes of parallelizable problems that are not embarrassingly parallel.
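To make the halo exchange concrete, here is what it might look like in C for a 1-D domain split into equal chunks (the chunk size and the single-cell halo are illustrative assumptions):

/* Each rank owns N interior cells, u[1]..u[N], plus one halo
   cell on each side, u[0] and u[N+1]. */
#include <mpi.h>

#define N 100  /* interior cells per rank (illustrative) */

void exchange_halos(double u[N + 2], MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    /* MPI_PROC_NULL makes the transfers at the physical
       domain boundaries no-ops. */
    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    /* send last interior cell right, receive left halo from the left */
    MPI_Sendrecv(&u[N], 1, MPI_DOUBLE, right, 0,
                 &u[0], 1, MPI_DOUBLE, left,  0, comm, MPI_STATUS_IGNORE);
    /* send first interior cell left, receive right halo from the right */
    MPI_Sendrecv(&u[1],     1, MPI_DOUBLE, left,  1,
                 &u[N + 1], 1, MPI_DOUBLE, right, 1, comm, MPI_STATUS_IGNORE);
}

A solver would call this before every update of the interior cells; it is exactly this per-step communication that a distributed shell cannot provide.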

Ian Foster's book, although somewhat out of date in several aspects,
still provides some background on this:

https://pdfs.semanticscholar.org/09ed/7308fdfb0b640077328aa4fd10ce429f511a.pdf

[Can anybody on the list suggest a recent book with this type of
comprehensive approach to parallel programming, please?
The ones I know are restricted to MPI, or OpenMP, and so on.]

Do you know whether the type of problem you're trying to solve
is embarrassingly parallel or not?
Which type of problem is it?

I hope this helps,
Gus Correa
Diego Avesani
2018-08-26 10:02:25 UTC
Dear all,

thank you for your answers. I will try to explain my situation better.
I have written a code and I have parallelized it with Open MPI. In
particular, I have a two-level parallelization. The first level takes care of
running the program in parallel; the second level runs the parallel code with
different inputs in order to find the best solution. At the second level, the
different runs have to exchange their outputs in order to identify the best
solution and to modify the input data accordingly. These communications have
to take place several times over the whole simulation.
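In MPI terms, my two-level layout amounts to splitting MPI_COMM_WORLD into sub-communicators. The C sketch below illustrates the idea; the number of concurrent runs, the stand-in objective value, and the MPI_MINLOC reduction are simplified placeholders, not my actual code:

/* two_level.c -- split the world into teams; each team runs one
   model instance; team leaders then agree on the best result. */
#include <mpi.h>
#include <stdio.h>

#define N_TEAMS 4  /* assumed number of concurrent parameter sets */

int main(int argc, char **argv)
{
    int wrank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &wrank);

    /* Level 1: one sub-communicator per team; the parallel model
       would run on team_comm. */
    int team = wrank % N_TEAMS;
    MPI_Comm team_comm;
    MPI_Comm_split(MPI_COMM_WORLD, team, wrank, &team_comm);

    /* stand-in for the objective value this team's run produced */
    double cost = 1.0 / (team + 1);

    /* Level 2: rank 0 of each team joins a leader communicator. */
    int trank;
    MPI_Comm_rank(team_comm, &trank);
    MPI_Comm leader_comm;
    MPI_Comm_split(MPI_COMM_WORLD,
                   trank == 0 ? 0 : MPI_UNDEFINED, wrank, &leader_comm);

    if (leader_comm != MPI_COMM_NULL) {
        /* find the smallest cost and which team produced it */
        struct { double val; int team; } mine = { cost, team }, best;
        MPI_Allreduce(&mine, &best, 1, MPI_DOUBLE_INT, MPI_MINLOC,
                      leader_comm);
        if (wrank == 0)
            printf("best cost %g from team %d\n", best.val, best.team);
        MPI_Comm_free(&leader_comm);
    }
    MPI_Comm_free(&team_comm);
    MPI_Finalize();
    return 0;
}

The leaders could then broadcast updated inputs to their teams and iterate; this repeated exchange between runs is what a scheduler alone cannot do.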

I have read some papers where people do that with PBS or the Microsoft
job scheduler.
I opted for Open MPI.

What do you think? Can you give me reasons supporting my decision?

Thanks

Diego



Patrick Begou
2018-08-30 12:37:52 UTC
Diego,

what you want to do is a parametric study. There is specific software
available to do this efficiently (i.e. reducing the number of runs). Such
software can in turn rely on a job scheduler (PBS, Slurm...), which can launch
many parallel MPI applications at the same time, depending on the results of
previous runs.
Look at:
- Dakota https://dakota.sandia.gov/ (open source)
- modeFRONTIER https://www.esteco.com/modefrontier (commercial)

Patrick
--
===================================================================
| Equipe M.O.S.T.      |                                          |
| Patrick BEGOU        | mailto:***@grenoble-inp.fr               |
| LEGI                 |                                          |
| BP 53 X              | Tel 04 76 82 51 35                       |
| 38041 GRENOBLE CEDEX | Fax 04 76 82 52 71                       |
===================================================================
Diego Avesani
2018-08-31 11:51:56 UTC
Dear all,

thanks a lot. So basically the idea is that MPI is more useful when a lot
of communication occurs among the runs in the calibrator, while a job
scheduler is more appropriate when there is little or no communication.

Is my final statement correct?
Thanks a lot

Diego



Reuti
2018-09-05 09:10:26 UTC
Hi,
Post by Diego Avesani
So basically the idea is that MPI is more useful when a lot of communication occurs among the runs in the calibrator, while a job scheduler is more appropriate when there is little or no communication.
Is my final statement correct?
In my opinion: no.

A job scheduler can serialize the workflow and run one job after another as free resources permit. Their use cases may overlap in certain situations, but MPI and a job scheduler don't compete.

-- Reuti
Jeff Squyres (jsquyres) via users
2018-08-25 12:51:48 UTC
Post by Diego Avesani
I have a philosophical question.
I am reading a lot of papers where people use the Portable Batch System (PBS) or another job scheduler in order to parallelize their code.
What are the advantages of using MPI instead?
It depends on the code in question / problem being solved.

Embarrassingly parallel problems may lend themselves to submitting a bazillion serial jobs through a job scheduler such as SLURM, Torque, ...etc. (there are other factors that matter, too, such as job length, IO requirements, ...etc.). That is, these serial jobs can run independently of each other, and therefore can run whenever / wherever the scheduler puts them. MPI is not necessary because the jobs don't need to communicate with each other.

If there's coordination needed during the compute, however, MPI is helpful because the individual processes can directly communicate (vs. running a job, writing the results to stable storage, then firing up the next job to read those results back from stable storage, ...etc.).

These requirements all sit on a multi-variable spectrum, of course. In most cases, it's clear whether you should choose Monte Carlo-style bazillion-serial-jobs-through-a-scheduler vs. MPI. But there are definitely cases where a closer examination of the requirements is needed to determine which would be better (potentially including a hybrid solution).
Post by Diego Avesani
I am writing a report on my code, where of course I use Open MPI. So please tell me how I can cite you; you deserve all the credit.
Please see https://www.open-mpi.org/papers/. Thanks!
--
Jeff Squyres
***@cisco.com