Discussion:
[OMPI users] How is hwloc used by OpenMPI
Blosch, Edwin L
2012-11-07 18:33:20 UTC
Permalink
I see hwloc is a subproject hosted under OpenMPI but, in reading the documentation, I was unable to figure out if hwloc is a module within OpenMPI, or if some of the code base is borrowed into OpenMPI, or something else. Is hwloc used by OpenMPI internally? Is it a layer above libnuma? Or is it just a project that is useful to OpenMPI in support of targeting various new platforms?

Thanks
Jeff Squyres
2012-11-07 20:26:17 UTC
Permalink
Post by Blosch, Edwin L
I see hwloc is a subproject hosted under OpenMPI but, in reading the documentation, I was unable to figure out if hwloc is a module within OpenMPI, or if some of the code base is borrowed into OpenMPI, or something else. Is hwloc used by OpenMPI internally? Is it a layer above libnuma? Or is it just a project that is useful to OpenMPI in support of targeting various new platforms?
Open MPI uses hwloc internally for three main things:

1. all of the processor affinity options to mpirun (e.g., --bind-to-core)
2. all its internal memory affinity functionality
3. gather topology information about the machine it's running on

#3 isn't used too heavily yet -- that will be more developed over time (shared memory collectives have some obvious applications here). But we use it to know if processes are in the same NUMA domain, which OpenFabrics devices are "near" to a given process' NUMA domain, etc.
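To give a flavor of what that looks like at the code level, here is a minimal standalone sketch (not Open MPI's actual internals, just illustrative calls against the public hwloc API): it loads the topology, counts cores, and binds the calling process to the first core, which is roughly what --bind-to-core asks Open MPI to do per process. Builds with something like "gcc prog.c -lhwloc".

#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topo;

    /* Discover the machine: packages, caches, cores, PUs, NUMA nodes. */
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    int ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
    printf("cores detected: %d\n", ncores);

    /* Bind the current process to the first core (illustrative only). */
    hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, 0);
    if (core && hwloc_set_cpubind(topo, core->cpuset, HWLOC_CPUBIND_PROCESS) < 0)
        perror("hwloc_set_cpubind");

    hwloc_topology_destroy(topo);
    return 0;
}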

But hwloc also stands alone quite well; it actually has nothing to do with MPI. So it made sense to keep it as a standalone library+tool suite, too.
--
Jeff Squyres
***@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Brice Goglin
2012-11-08 00:19:46 UTC
Permalink
Post by Jeff Squyres
Post by Blosch, Edwin L
I see hwloc is a subproject hosted under OpenMPI but, in reading the documentation, I was unable to figure out if hwloc is a module within OpenMPI, or if some of the code base is borrowed into OpenMPI, or something else. Is hwloc used by OpenMPI internally? Is it a layer above libnuma? Or is it just a project that is useful to OpenMPI in support of targeting various new platforms?
1. all of the processor affinity options to mpirun (e.g., --bind-to-core)
2. all its internal memory affinity functionality
3. gather topology information about the machine it's running on
#3 isn't used too heavily yet -- that will be more developed over time (shared memory collectives have some obvious applications here). But we use it to know if processes are in the same NUMA domain, which OpenFabrics devices are "near" to a given process' NUMA domain, etc.
But hwloc also stands alone quite well; it actually has nothing to do with MPI. So it made sense to keep it as a standalone library+tool suite, too.
Edwin's question about libnuma also deserves an answer, and I need to
prepare my marketing material for SC next week :)

hwloc may somehow be considered as a layer above libnuma but:
* hwloc is more portable (works on non-NUMA and non-Linux platforms)
* hwloc does everything libnuma does, but it does a lot more (everything
that isn't related to NUMA)
* hwloc only uses libnuma for some syscalls (memory binding and
migration syscalls are not in the libc unfortunately). We don't use
anything else because we don't want to rely on their numa_*() interface
(they broke the ABI in the past, things are not well documented, and
their API is broken in some cases). A rough sketch of the hwloc
equivalent is below.
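To make the libnuma comparison concrete, here is a rough sketch of allocating memory bound to the first NUMA node through hwloc instead of numa_alloc_onnode(). It uses hwloc 2.x names (HWLOC_OBJ_NUMANODE, HWLOC_MEMBIND_BYNODESET); older releases spell these differently, so treat it as illustrative rather than copy-paste.

#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    /* First NUMA node; on non-NUMA machines hwloc still reports a
       single node covering all of memory, so this stays portable. */
    hwloc_obj_t node = hwloc_get_obj_by_type(topo, HWLOC_OBJ_NUMANODE, 0);
    if (node) {
        size_t len = 1 << 20;
        /* Allocate 1 MB bound to that node; on Linux this is where the
           binding syscalls mentioned above get issued underneath. */
        void *buf = hwloc_alloc_membind(topo, len, node->nodeset,
                                        HWLOC_MEMBIND_BIND,
                                        HWLOC_MEMBIND_BYNODESET);
        if (buf)
            hwloc_free(topo, buf, len);
    }

    hwloc_topology_destroy(topo);
    return 0;
}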

Brice
Jeff Squyres
2012-11-08 00:28:44 UTC
Permalink
Post by Brice Goglin
* hwloc does everything libnuma does, but it does a lot more (everything
that isn't related to NUMA)
Here's my 1-line description:

libnuma is old bustedness; hwloc is new hotness.

:-)
--
Jeff Squyres
***@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Blosch, Edwin L
2012-11-08 15:17:52 UTC
Permalink
Thanks, I definitely appreciate the new hotness of hwloc. I just couldn't tell from the documentation or the web page how, or if, it was being used by OpenMPI.

I still work with OpenMPI 1.4.x and now that I've looked into the builds, I think I understand that PLPA is used in 1.4 and hwloc is brought in as an MCA module in 1.6.x.

Re: layering, I believe you are saying that the relationship to libnuma is not one where hwloc is adding higher-level functionalities to libnuma, but rather hwloc is a much improved alternative except for a few system calls it makes via libnuma out of necessity or convenience.

Thanks
Jeff Squyres
2012-11-08 16:07:18 UTC
Permalink
Post by Blosch, Edwin L
Thanks, I definitely appreciate the new hotness of hwloc. I just couldn't tell from the documentation or the web page how, or if, it was being used by OpenMPI.
I still work with OpenMPI 1.4.x and now that I've looked into the builds, I think I understand that PLPA is used in 1.4 and hwloc is brought in as an MCA module in 1.6.x.
Correct. PLPA was a first attempt at a generic processor affinity solution. hwloc is a 2nd generation, much Much MUCH better solution than PLPA (we wholly killed PLPA after the INRIA guys designed hwloc).
Post by Blosch, Edwin L
Re: layering, I believe you are saying that the relationship to libnuma is not one where hwloc is adding higher-level functionalities to libnuma, but rather hwloc is a much improved alternative except for a few system calls it makes via libnuma out of necessity or convenience.
Correct.
--
Jeff Squyres
***@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Rayson Ho
2012-11-23 19:45:05 UTC
Permalink
Post by Jeff Squyres
Correct. PLPA was a first attempt at a generic processor affinity solution. hwloc is a 2nd generation, much Much MUCH better solution than PLPA (we wholly killed PLPA
after the INRIA guys designed hwloc).
Edwin,

We ported OGS/Grid Engine to hwloc 1.5 years ago (the original core
binding code in Grid Engine uses PLPA).

http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
From an API consumer's point of view (we have used both PLPA & hwloc), some of the
important hwloc advantages are:

1) Grid Engine can now use the same piece of code on different
platforms: Linux, Solaris, AIX, Mac OS X, FreeBSD, Tru64, HP-UX,
Windows. Before, with PLPA, we only had support for Linux & Solaris.

2) Support for newer CPU architectures & hardware. Since development
of PLPA stopped a few years ago, many of the newer architectures were
not recognized properly. We switched over to hwloc when the original
Grid Engine core binding code stopped working on the AMD Magny-Cours
(Opteron 6100 series).

To be fair to PLPA, had development continued, it would likely have
had no issues with those newer architectures. But hwloc's data
structures seem to handle newer hardware components more gracefully!


We now use information from hwloc to optimize job placement on AMD
Bulldozer CPUs (including Piledriver). Currently hwloc just treats each
Bulldozer module as 2 cores, so we still have to add a bit of logic
in the Grid Engine code to do what we need (rough sketch below).

http://blogs.scalablelogic.com/2012/07/optimizing-grid-engine-for-amd.html
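For what it's worth, here is a rough sketch of the kind of extra logic we mean, assuming (as on Bulldozer) that the two cores of a module show up under one shared L2 cache. It uses hwloc 2.x object names such as HWLOC_OBJ_L2CACHE; older hwloc exposes caches differently, so this is illustrative only.

#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    int ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
    for (int i = 0; i < ncores; i++) {
        hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, i);
        /* Walk up to the L2 cache above this core; two cores sharing
           one L2 indicate one Bulldozer module. */
        hwloc_obj_t l2 = hwloc_get_ancestor_obj_by_type(topo, HWLOC_OBJ_L2CACHE, core);
        if (l2)
            printf("core %u -> L2 %u (%d cores share it)\n",
                   core->logical_index, l2->logical_index,
                   hwloc_get_nbobjs_inside_cpuset_by_type(topo, l2->cpuset,
                                                          HWLOC_OBJ_CORE));
    }

    hwloc_topology_destroy(topo);
    return 0;
}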

Rayson

==================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/
Post by Jeff Squyres
Post by Blosch, Edwin L
Re: layering, I believe you are saying that the relationship to libnuma is not one where hwloc is adding higher-level functionalities to libnuma, but rather hwloc is a much improved alternative except for a few system calls it makes via libnuma out of necessity or convenience.
Correct.