[OMPI users] "-map-by socket:PE=1" doesn't do what I expect

Discussion:

Mark Dixon

2017-02-15 11:17:14 UTC

Hi,

When combining OpenMPI 2.0.2 with OpenMP, I'm interested in launching a
number of ranks and allocating a number of cores to each rank. Using
"-map-by socket:PE=<num>", switching to "-map-by node:PE=<num>" if I want
to allocate more than a single socket to a rank, seems to do what I want.

Except for "-map-by socket:PE=1". That seems to allocate an entire socket
to each rank instead of a single core. Here's the output of a test program
on a dual socket non-hyperthreading system that reports rank core bindings
(odd cores on one socket, even on the other):

$ mpirun -np 2 -map-by socket:PE=1 ./report_binding
Rank 0 bound somehost.somewhere: 0 2 4 6 8 10 12 14 16 18 20 22
Rank 1 bound somehost.somewhere: 1 3 5 7 9 11 13 15 17 19 21 23

$ mpirun -np 2 -map-by socket:PE=2 ./report_binding
Rank 0 bound somehost.somewhere: 0 2
Rank 1 bound somehost.somewhere: 1 3

$ mpirun -np 2 -map-by socket:PE=3 ./report_binding
Rank 0 bound somehost.somewhere: 0 2 4
Rank 1 bound somehost.somewhere: 1 3 5

$ mpirun -np 2 -map-by socket:PE=4 ./report_binding
Rank 0 bound somehost.somewhere: 0 2 4 6
Rank 1 bound somehost.somewhere: 1 3 5 7
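
The report_binding program itself is not shown in the thread. As a rough stand-in, the sketch below prints one similar line per rank. It assumes a Linux node (it reads Cpus_allowed_list from /proc) and an Open MPI launch, which sets OMPI_COMM_WORLD_RANK for each process. The function name report_binding is borrowed from the commands above for illustration only. The kernel prints the CPU list in its own format (for example with ranges), so the output will not match the listings above character for character.

```shell
#!/bin/sh
# Hypothetical stand-in for ./report_binding; not the program used in this thread.
report_binding() {
    # Open MPI's mpirun exports OMPI_COMM_WORLD_RANK to every launched process.
    rank="${OMPI_COMM_WORLD_RANK:-unknown}"
    cpus="unknown"
    if [ -r /proc/self/status ]; then
        # awk inherits this shell's CPU affinity, so its own status entry
        # shows the cores that the rank is allowed to run on.
        cpus=$(awk '/^Cpus_allowed_list:/ {print $2}' /proc/self/status)
    fi
    echo "Rank $rank bound $(uname -n): $cpus"
}

report_binding
```

Saved as an executable script, it could be launched the same way as the test program, e.g. "mpirun -np 2 -map-by socket:PE=2 ./report_binding.sh" (the script name is an assumption).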

I get the same result if I change "socket" to "numa". Changing "socket" to
either "core", "node" or "slot" binds each rank to a single core (good),
but doesn't round-robin ranks across sockets like "socket" does (bad).

Is "-map-by socket:PE=1" doing the right thing, please? I tried reading
the man page but I couldn't work out what the expected behaviour is :o

Cheers,

Mark

r***@open-mpi.org

2017-02-15 12:52:07 UTC

Ah, yes - I know what the problem is. We weren’t expecting a PE value of 1 - the logic is looking expressly for values > 1 as we hadn’t anticipated this use-case.

I can make that change. I’m off to a workshop for the next day or so, but can probably do this on the plane.

_______________________________________________
users mailing list
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Mark Dixon

2017-02-15 13:45:39 UTC

Post by r***@open-mpi.org
Ah, yes - I know what the problem is. We weren’t expecting a PE value of
1 - the logic is looking expressly for values > 1 as we hadn’t
anticipated this use-case.

Is it a sensible use-case, or am I crazy?

Post by r***@open-mpi.org
I can make that change. I’m off to a workshop for the next day or so,
but can probably do this on the plane.

You're a star - thanks :)

Mark

r***@open-mpi.org

2017-02-15 13:49:52 UTC

Post by Mark Dixon

Post by r***@open-mpi.org
Ah, yes - I know what the problem is. We weren’t expecting a PE value of 1 - the logic is looking expressly for values > 1 as we hadn’t anticipated this use-case.

Is it a sensible use-case, or am I crazy?

Not crazy, I’d say. The expected way of doing it would be “--map-by socket --bind-to core”. However, I can see why someone might expect pe=1 to work.

Post by Mark Dixon

Post by r***@open-mpi.org
I can make that change. I’m off to a workshop for the next day or so, but can probably do this on the plane.

You're a star - thanks :)
Mark

r***@open-mpi.org

2017-02-17 15:42:06 UTC

Mark - this is now available in master. Will look at what might be required to bring it to 2.0

Mark Dixon

2017-02-17 16:04:18 UTC

Post by r***@open-mpi.org
Mark - this is now available in master. Will look at what might be
required to bring it to 2.0

Thanks Ralph,

To be honest, since you've given me an alternative, there's no rush from
my point of view.

The logic's embedded in a script, and it's been taught to use "--map-by
socket --bind-to core" for the special case of PE=1. It'd be nice to get
rid of that special case at some point, but there's no problem waiting
for the next stable branch :)

Cheers,

Mark
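
The special case described in this last message could be sketched roughly as follows. This is not Mark's actual script, which is not shown in the thread. The function name mapping_args and its optional second argument (cores per socket) are assumptions made for illustration. The three option strings are the ones discussed above: "--map-by socket --bind-to core" for one core per rank, "socket:PE=<num>" otherwise, and "node:PE=<num>" once a rank needs more cores than one socket provides.

```shell
#!/bin/sh
# Hypothetical helper: print mpirun mapping options for a cores-per-rank count.
# Usage: mapping_args <cores_per_rank> [cores_per_socket]
mapping_args() {
    pe="$1"
    cores_per_socket="${2:-}"

    # Reject anything that is not a positive integer.
    case "$pe" in
        ''|*[!0-9]*|0)
            echo "mapping_args: cores per rank must be a positive integer" >&2
            return 1
            ;;
    esac

    if [ "$pe" -eq 1 ]; then
        # Workaround from the thread: PE=1 is not handled in Open MPI 2.0.2.
        echo "--map-by socket --bind-to core"
    elif [ -n "$cores_per_socket" ] && [ "$pe" -gt "$cores_per_socket" ]; then
        # A rank that spans more than one socket is mapped by node instead.
        echo "--map-by node:PE=$pe"
    else
        echo "--map-by socket:PE=$pe"
    fi
}
```

A launcher script could then call it as "mpirun -np 2 $(mapping_args 1 12) ./report_binding", leaving the substitution unquoted so the options are split into separate arguments. Once the fix in master reaches a release, the first branch could be dropped.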