Hammond, Simon David via users
2018-11-20 18:49:51 UTC
Hi OpenMPI Users,
I wonder if you can help us with a problem we are having when trying to force OpenMPI to use specific cores. We want to supply an initial CPU affinity list to mpirun and then have it select its bindings from within that set; for instance, give it two cores and then map-by/bind-to core for two MPI processes. However, this does not appear to work correctly with either OpenMPI 2.1.2 or 3.1.0 in our testing.
Example: POWER9 system with 32 cores running in SMT-4 mode.
$ numactl --physcpubind=4-63 mpirun -n 2 --map-by core --bind-to core ./maskprinter-ppc64
System has: 128 logical cores.
Rank 0 : CPU Mask for Process 0 (total of 4 logical cores out of a max of 128 cores)
Generating mask information...
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
System has: 128 logical cores.
Rank : 1 CPU Mask for Process 0 (total of 4 logical cores out of a max of 128 cores)
Generating mask information...
0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
However, we do get the correct affinity mask if we run a single process directly, without mpirun:
$ numactl --physcpubind=4-63 ./maskprinter-ppc64
System has: 128 logical cores.
CPU Mask for Process 0 (total of 60 logical cores out of a max of 128 cores)
Generating mask information...
0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
In an ideal world, the mpirun invocation above would place the processes on cores within the 4-63 range.
Is this possible with OpenMPI at all? I realize this is a fairly unusual request.
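For reference, the mask-printing test is roughly equivalent to the sketch below (the real maskprinter-ppc64 source is not included here, so the names and exact output format are approximations). Each rank just queries its affinity mask with sched_getaffinity() and prints a 0/1 entry per logical core; it is compiled with mpicc and launched exactly as in the commands above.

/* Minimal sketch of an affinity mask printer (approximation of maskprinter-ppc64). */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Number of logical cores configured on the node (128 on this POWER9 in SMT-4). */
    long ncores = sysconf(_SC_NPROCESSORS_CONF);

    /* Affinity mask of the calling process as set by the launcher/numactl. */
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_getaffinity");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    printf("System has: %ld logical cores.\n", ncores);
    printf("Rank %d : CPU Mask (total of %d logical cores out of a max of %ld cores)\n",
           rank, CPU_COUNT(&mask), ncores);
    printf("Generating mask information...\n");
    for (long c = 0; c < ncores; c++)
        printf("%d ", CPU_ISSET(c, &mask) ? 1 : 0);
    printf("\n");

    MPI_Finalize();
    return 0;
}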
Thanks,
S.
--
Si Hammond
Scalable Computer Architectures
Sandia National Laboratories, NM, USA
[Sent from remote connection, excuse typos]