Discussion:
[OMPI users] --map-by
Noam Bernstein
2017-11-16 14:44:46 UTC
Hi all - I’m trying to run mixed MPI/OpenMP, so ideally I want each MPI process bound to a small set of cores (to allow for the OpenMP threads). From the mpirun docs at
https://www.open-mpi.org//doc/current/man1/mpirun.1.php
I got the example that I thought corresponded to what I want,
% mpirun ... --map-by core:PE=2 --bind-to core
So I tried
mpirun -x OMP_NUM_THREADS --map-by core:PE=4 --bind-to core -np 32 python 
..
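(Aside: to see what binding actually results, one low-risk check is to run a trivial executable with binding reports on, e.g.

$ mpirun --report-bindings -np 2 /bin/true

before launching the real job.)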

However, when I run this (with Open MPI 3.0.0 or 1.8.8) I get the following error:
A request for multiple cpus-per-proc was given, but a directive
was also give to map to an object level that cannot support that
directive.

Please specify a mapping level that has more than one cpu, or
else let us define a default mapping that will allow multiple
cpus-per-proc.

Am I doing something wrong, or is there a mistake in the docs, and it should bind to something other than core?

thanks,
Noam
r***@open-mpi.org
2017-11-16 14:49:14 UTC
Do not include the “bind-to core” option. The mapping directive already forces that.

Noam Bernstein
2017-11-16 15:08:02 UTC
Post by r***@open-mpi.org
Do not include the “bind-to core” option. The mapping directive already forces that.
Same error message, unfortunately. And no, I’m not setting a global binding policy, as far as I can tell:

env | grep OMPI_MCA
OMPI_MCA_hwloc_base_report_bindings=1
[compute-7-6:15083] SETTING BINDING TO CORE
--------------------------------------------------------------------------
A request for multiple cpus-per-proc was given, but a directive
was also give to map to an object level that cannot support that
directive.

Please specify a mapping level that has more than one cpu, or
else let us define a default mapping that will allow multiple
cpus-per-proc.
--------------------------------------------------------------------------

Noam


Noam Bernstein, Ph.D.
Center for Materials Physics and Technology
U.S. Naval Research Laboratory
T +1 202 404 8628 F +1 202 404 7546
https://www.nrl.navy.mil
r***@open-mpi.org
2017-11-21 00:02:00 UTC
So there are two options here that will work and hopefully provide you with the desired pattern:

* if you want the procs to go in different NUMA regions:
$ mpirun --map-by numa:PE=2 --report-bindings -n 2 /bin/true
[rhc001:131460] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]]: [BB/BB/../../../../../../../../../..][../../../../../../../../../../../..]
[rhc001:131460] MCW rank 1 bound to socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]]: [../../../../../../../../../../../..][BB/BB/../../../../../../../../../..]

* if you want the procs to go in the same NUMA region:
$ mpirun --map-by ppr:2:numa:PE=2 --report-bindings -n 2 /bin/true
[rhc001:131559] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]]: [BB/BB/../../../../../../../../../..][../../../../../../../../../../../..]
[rhc001:131559] MCW rank 1 bound to socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]]: [../../BB/BB/../../../../../../../..][../../../../../../../../../../../..]

Reason: the level you are mapping by (e.g., NUMA) must have enough cores in it to meet your PE=N directive. If you map by core, then there is only one core in that object.
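Applied to your original case (32 ranks with 4 cores each), that suggests something along these lines; note that PE=4 is an assumption that only holds if each NUMA region on your nodes has at least 4 cores, so verify with --report-bindings first:

$ mpirun -x OMP_NUM_THREADS --map-by numa:PE=4 --report-bindings -np 32 python ...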

HTH
Ralph
Noam Bernstein
2017-11-21 13:34:42 UTC
Post by r***@open-mpi.org
Reason: the level you are mapping by (e.g., NUMA) must have enough cores in it to meet your PE=N directive. If you map by core, then there is only one core in that object.
Makes sense, I’ll try that. However, if I understand your explanation correctly, the docs should probably be changed, because they seem to suggest something that will never work. In fact, would ":PE=N" with N > 1 ever work for "--map-by core"? I guess maybe if you have hyperthreading on, but I’d still argue that it’s an unhelpful example, given how rarely hyperthreading is used in HPC.

Noam




r***@open-mpi.org
2017-11-21 13:53:07 UTC
I believe that --map-by core with PE > 1 may have worked at some point in the past, but the docs should probably be looked at. I took a (very brief) look at the code; re-enabling that particular option would be difficult, and it isn’t really necessary, since one can reproduce the desired pattern within the current context.
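(If hyperthreading is enabled and you want to experiment, I believe the hardware threads can be treated as independent cpus via --use-hwthread-cpus, so a sketch like

$ mpirun --use-hwthread-cpus --map-by core:PE=2 --report-bindings -n 2 /bin/true

might then satisfy PE=2 within a single core; that is untested speculation on my part, not a recommendation.)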
Noam Bernstein
2017-11-27 15:05:41 UTC
I can now confirm that "--map-by numa:PE=2" does indeed work, and seems to give good performance.
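For completeness, the schematic form of the working invocation (the thread count and rank count here are illustrative, not our actual values):

$ export OMP_NUM_THREADS=2
$ mpirun -x OMP_NUM_THREADS --map-by numa:PE=2 --report-bindings -np 32 python ...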

thanks,
Noam
