Hi Diego
I (still) have Torque/PBS version 4.something in old clusters.
[Most people at this point already switched to Slurm.]
Torque/PBS comes with a tool named "pbsdsh" (for PBS distributed shell):
http://docs.adaptivecomputing.com/torque/4-1-3/Content/topics/commands/pbsdsh.htm
https://wikis.nyu.edu/display/NYUHPC/PBSDSH
http://www.ep.ph.bham.ac.uk/general/support/torquepbsdsh.html
"pbsdsh" is able to launch *serial* jobs across the nodes that you allocate for the job.
This allows you to run so-called "embarrassingly parallel" tasks:
https://en.wikipedia.org/wiki/Embarrassingly_parallel
Embarrassingly parallel tasks (or programs) are those whose processes don't
need to communicate with each other (and therefore can run without MPI).
The link above lists some embarrassingly parallel tasks.
A trivial example would be to transliterate all uppercase letters
to lowercase in a large number of text files, i.e. to run the "tr" command below
tr '[:upper:]' '[:lower:]' < input${i}.txt > output${i}.txt
where the file name index ${i} would be distributed across the various processes.
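Stripped to its bones, with no batch system at all, the idea is just independent
processes that never talk to each other. A sketch you can run on any machine
(the input file names here are made up for the example; "&" backgrounds each
"tr", and "wait" blocks until they all finish):

```shell
# Create a few sample input files (names are hypothetical, for illustration)
for i in 1 2 3; do
    printf 'HELLO WORLD %s\n' "$i" > input${i}.txt
done

# One independent "tr" process per file; no inter-process communication
for i in 1 2 3; do
    tr '[:upper:]' '[:lower:]' < input${i}.txt > output${i}.txt &
done
wait
```

On a cluster, pbsdsh (or pdsh, see below) plays the role of that loop, spreading
the processes over the allocated nodes instead of backgrounding them locally.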
"pbsdsh" can do it without MPI, because there is no need for one processor to
communicate with another processor to perform this.
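As a sketch (not tested on your cluster, so take it with a grain of salt): if I
remember correctly, pbsdsh spawns one copy of the command per allocated virtual
processor, and each spawned task can read its own index from the PBS_VNODENUM
environment variable (see the Torque docs linked above). A Torque job script for
the "tr" example could then look something like this, assuming files
input0.txt ... input7.txt exist in the submission directory:

```shell
#!/bin/bash
#PBS -l nodes=2:ppn=4
#PBS -N tr_farm
# 2 nodes x 4 processors = 8 tasks; pbsdsh runs the command once per task.
cd $PBS_O_WORKDIR
# Each task lowercases "its own" file, indexed by PBS_VNODENUM (0..7 here).
pbsdsh /bin/bash -c 'tr "[:upper:]" "[:lower:]" < input${PBS_VNODENUM}.txt > output${PBS_VNODENUM}.txt'
```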
However, pbsdsh is by no means unique.
There are other tools, independent of PBS, which can do the same as pbsdsh, if not more.
"Pdsh" is one of them (probably the most versatile and popular):
https://github.com/chaos/pdsh
https://linux.die.net/man/1/pdsh
http://www.linux-magazine.com/Issues/2014/166/Parallel-Shells
https://www.rittmanmead.com/blog/2014/12/linux-cluster-sysadmin-parallel-command-execution-with-pdsh/
[Some examples in the links above are "parallel system administration tasks", some are user-level tasks.]
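For a flavor of pdsh (host names made up, and assuming pdsh is installed and
can reach the nodes via ssh or similar):

```shell
# Run the same command on several hosts in parallel; pdsh prefixes each
# output line with the host it came from.
pdsh -w node01,node02,node03 'uname -r'

# pdsh also understands host ranges:
pdsh -w node[01-03] 'uptime'
```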
***
*However ... not all parallelizable problems are embarrassingly parallel!!!*
Actually, most are not embarrassingly parallel.
A very large class of common problems in science and engineering
is to solve partial differential equations (PDEs) using finite differences (FD) or similar methods
(finite elements, finite volumes, pseudo-spectral, etc.) through domain decomposition.
This is a typical example of a problem that is parallelizable, but not embarrassingly parallel.
When solving PDEs with FD through domain decomposition, you have to exchange
the data on the sub-domain halos across the processors in charge of solving the
equation on each subdomain. This requires communication across the processors,
something that pbsdsh or pdsh cannot do, but MPI can (and so could the predecessors of
MPI: p4, PVM, etc.).
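To make the halo exchange concrete, take the simplest textbook case: an explicit
finite-difference update of the 1-D heat equation u_t = kappa * u_xx (used here
only as an illustration):

```latex
u_i^{n+1} = u_i^{n} + \frac{\kappa\,\Delta t}{\Delta x^2}
            \left( u_{i+1}^{n} - 2\,u_i^{n} + u_{i-1}^{n} \right)
```

If the grid is split so that each processor owns a contiguous range of points
i, then updating the last point of a subdomain needs u_{i+1}^{n}, which lives
on the neighboring processor. That halo value must be exchanged every single
time step (e.g. with MPI_Sendrecv), which is exactly the kind of communication
pbsdsh and pdsh have no mechanism for.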
This class of problems includes most of computational fluid dynamics, structural mechanics,
weather forecasting, climate/ocean/atmosphere modeling, geodynamics, etc.
Many problem-solving methods in molecular dynamics and computational chemistry are not
embarrassingly parallel either.
There are many other classes of parallelizable problems that are not embarrassingly parallel.
Ian Foster's book, although somewhat out of date in several aspects,
still provides some background on this:
https://pdfs.semanticscholar.org/09ed/7308fdfb0b640077328aa4fd10ce429f511a.pdf
[Can anybody on the list suggest a recent book with this type of
comprehensive approach to parallel programming, please?
The ones I know are restricted to MPI, or OpenMP, and so on.]
Which type of problem are you trying to solve,
and do you know whether or not it is embarrassingly parallel?
I hope this helps,
Gus Correa
Post by John Hearns via users
Diego,
I am sorry, but you are mixing up different things here. PBS is a resource allocation system. It will reserve the use of a compute server, or several compute servers, for you to run your parallel job on. PBS can launch the MPI job - there are several mechanisms for launching parallel jobs.
MPI is an API for parallel programming. One might rather call it a library but, strictly speaking, MPI is a standard for parallel programming, and technically an API.
One piece of advice I would have is that you can run MPI programs from the command line. So Google for 'Hello World MPI'. Write your first MPI program then use mpirun from the command line.
If you have a cluster which has the PBS batch system you can then use PBS to run your MPI program.
IF that is not clear please let us know what help you need.
Dear all,
I have a philosophical question.
I am reading a lot of papers where people use the Portable Batch System or a job scheduler in order to parallelize their code.
What are the advantages of using MPI instead?
I am writing a report on my code, where of course I use Open MPI. So please tell me how I can cite you. You deserve all the credit.
Thanks a lot,
Thanks again,
Diego
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users