Discussion:
[OMPI users] OpenMPI 1.10.5 oversubscribing cores
t***@goodyear.com
2017-09-08 13:46:41 UTC
I posted this question last year and we ended up not upgrading to the newer
openmpi. Now I need to change to openmpi 1.10.5 and have the same issue.

Specifically, with 1.4.2 I can run two 12-core jobs on a 24-core node, and the
processes bind to cores with only one process per core, i.e., the node is not
oversubscribed.

What I used with 1.4.2 was:
mpirun --mca mpi_paffinity_alone 1 --mca btl openib,tcp,sm,self ...

Now with 1.10.5, I have tried multiple combinations of map-to core, bind-to core,
etc. and cannot run two jobs on the same node without oversubscribing.

Is there a solution to this?

Thanks for any info
tom
r***@open-mpi.org
2017-09-08 14:10:23 UTC
What you probably want to do is add --cpu-list a,b,c... to each mpirun command, where each one lists the cores you want to assign to that job.
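
As a rough sketch of that suggestion, assuming a 24-core node whose cores are numbered 0 to 23 and two jobs started by separate mpirun commands, the two disjoint core lists could be built as shown here. The helper name core_list and the executables ./job_a and ./job_b are hypothetical, and the script only prints the commands instead of running them. The --cpu-list spelling is taken from the reply above; it is worth checking against mpirun --help for the installed version.

```shell
#!/bin/sh
# core_list FIRST COUNT: print COUNT consecutive core IDs starting at FIRST,
# comma-separated (hypothetical helper, not part of Open MPI).
core_list() {
    first=$1
    count=$2
    list=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # Append the next core ID, adding a comma only after the first entry.
        list="${list:+$list,}$((first + i))"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

# Assumption: a 24-core node whose cores are numbered 0..23.
job_a_cores=$(core_list 0 12)    # first 12 cores
job_b_cores=$(core_list 12 12)   # remaining 12 cores

# Print (not run) one mpirun command per job; ./job_a and ./job_b are placeholders.
echo "mpirun -np 12 --cpu-list $job_a_cores ./job_a"
echo "mpirun -np 12 --cpu-list $job_b_cores ./job_b"
```

Because the two lists do not overlap, each job is confined to its own half of the node. The actual core numbering can differ between machines, so it is worth confirming the layout (for example with hwloc's lstopo) before fixing the lists.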
_______________________________________________
users mailing list
https://lists.open-mpi.org/mailman/listinfo/users
Jeff Squyres (jsquyres)
2017-09-08 14:23:12 UTC
Tom --

If you're going to upgrade, can you upgrade to the latest Open MPI (2.1.1)? I.e., unless you have a reason for wanting to stay back at an already-old version, you might as well upgrade to the very latest release to give you the longest shelf life.

I mention this because we are about to release Open MPI v3.0.0 imminently, which will knock v1.10.x into the "unsupported" category (i.e., we'll be supporting v3.0.x, v2.1.x, and v2.0.x).

All that being said: if you need to stay back at the 1.10 series for some reason, you should update to the latest 1.10.x: v1.10.7 (not v1.10.5).
--
Jeff Squyres
***@cisco.com