Discussion:
[OMPI users] Mimicking timeout for MPI_Wait
Katz, Jacob
2009-12-03 08:31:32 UTC
Permalink
Hi,
I wonder if there is a BKM (efficient and portable) to mimic a timeout on a call to MPI_Wait, i.e. to interrupt it once a given time period has passed if it has not returned by then.
I'd appreciate it if anyone could send a pointer or an idea.

Thanks.
--------------------------------
Jacob M. Katz | ***@intel.com<mailto:***@intel.com> | Work: +972-4-865-5726 | iNet: (8)-465-5726

---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
Jeff Squyres
2009-12-04 18:38:05 UTC
Permalink
Post by Katz, Jacob
I wonder if there is a BKM (efficient and portable) to mimic a timeout on a call to MPI_Wait, i.e. to interrupt it once a given time period has passed if it has not returned by then.
Pardon my ignorance, but what does BKM stand for?

Open MPI does not currently implement a timeout-capable MPI_WAIT. Such functionality probably could be implemented (e.g., in the MPIX "experimental" namespace), especially since Open MPI polls for progress -- it could check a timer every so often while polling -- but no one has done so.
--
Jeff Squyres
***@cisco.com
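
A user-level approximation of such a timed wait can be built from nothing more than MPI_Test and MPI_Wtime. The sketch below is illustrative only (the function name and parameters are not an Open MPI or MPIX API), and it busy-polls for the whole timeout:

#include <mpi.h>

/* Poll MPI_Test until the request completes or timeout_sec elapses.
   Sets *completed to 1 if the request finished, 0 if the timeout hit.
   Note: this busy-polls, so it burns CPU for the whole wait. */
static int wait_with_timeout(MPI_Request *req, double timeout_sec,
                             int *completed, MPI_Status *status)
{
    double start = MPI_Wtime();
    *completed = 0;
    while (MPI_Wtime() - start < timeout_sec) {
        int rc = MPI_Test(req, completed, status);
        if (rc != MPI_SUCCESS)
            return rc;
        if (*completed)
            break;
    }
    return MPI_SUCCESS;
}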
Richard Treumann
2009-12-04 20:03:08 UTC
Permalink
If you are hoping for a return on timeout, almost zero CPU use while
waiting, and fast response, you will need to be pretty creative. Here is a
simple solution that may be OK if you do not need both fast response and
low CPU load.

flag = false;
while ( ! is_time_up() ) {
    MPI_Test( ...., &flag, .... );   /* request and status args elided */
    if (flag) break;                 /* request completed */
    usleep( .. );                    /* short sleep to release the CPU */
}

Make the sleep short (or leave it out) and you hog the CPU; make it long and
your lag time for detecting a message that arrives after you enter the loop
will average half the sleep interval plus a bit.



Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363
Katz, Jacob
2009-12-06 12:29:01 UTC
Permalink
Thanks.
Yes, as I meant in the question, I was looking for something creative that is both fast to respond and does not use 100% CPU all the time.
I guess I'm not the first one to face this question. Has anyone done anything "better" than the simple solution?
--------------------------------
Jacob M. Katz | ***@intel.com<mailto:***@intel.com> | Work: +972-4-865-5726 | iNet: (8)-465-5726

Douglas Guptill
2009-12-06 13:52:51 UTC
Permalink
My MPI application is a two-process thing, in which data is thrown
back and forth. For the most part, one process is calculating, and
the other is waiting.

I got tired of seeing both CPUs at 100% load, and based on suggestions
from Jeff Squyres and Eugene Loh, wrote MPI_Recv.c and MPI_Send.c. I
load these with my application, and bingo! Only one CPU is busy at any
given time.

They use a graduated sleep; the first sleep is short, the second is
twice as long, and so on, up to a maximum sleep time.

I sent the code along with my last message on the subject (December
2008, or later) so it should be in the archives. Failing that, I
could post it again, if anyone wants it.

Douglas.
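
A minimal sketch of the graduated-sleep idea described above, applied to a receive. This is not the MPI_Recv.c Douglas posted to the archives; the wrapper name, the starting interval, and the cap are assumed values:

#include <mpi.h>
#include <unistd.h>

/* Receive with a graduated sleep between completion checks: the first
   sleep is short, each subsequent one doubles, up to a maximum.
   Illustrative sketch only. */
int recv_with_backoff(void *buf, int count, MPI_Datatype type, int src,
                      int tag, MPI_Comm comm, MPI_Status *status)
{
    MPI_Request req;
    int flag = 0;
    useconds_t sleep_usec = 100;          /* assumed first sleep: 0.1 ms */
    const useconds_t max_usec = 100000;   /* assumed cap: 100 ms */

    int rc = MPI_Irecv(buf, count, type, src, tag, comm, &req);
    if (rc != MPI_SUCCESS)
        return rc;

    for (;;) {
        rc = MPI_Test(&req, &flag, status);
        if (rc != MPI_SUCCESS || flag)
            return rc;                    /* completed, or an error */
        usleep(sleep_usec);               /* give the CPU away */
        sleep_usec *= 2;                  /* graduated sleep */
        if (sleep_usec > max_usec)
            sleep_usec = max_usec;
    }
}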
Katz, Jacob
2009-12-06 15:42:26 UTC
Permalink
Thanks, Douglas.
I found your code in the archive.
--------------------------------
Jacob M. Katz | ***@intel.com | Work: +972-4-865-5726 | iNet: (8)-465-5726


Katz, Jacob
2009-12-06 16:15:25 UTC
Permalink
By the way, there is no way to time out a call to MPI_Init(), or is there?

--------------------------------
Jacob M. Katz | ***@intel.com | Work: +972-4-865-5726 | iNet: (8)-465-5726
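
On the MPI_Init question above: MPI_Init itself cannot portably be interrupted, but a crude watchdog can bound it from outside. The sketch below is a general Unix technique, not an Open MPI feature; it assumes the MPI library does not repurpose SIGALRM, and on timeout the process is killed rather than returned to:

#include <mpi.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    /* Arm a watchdog: if MPI_Init has not returned within 60 seconds,
       the pending SIGALRM (default action) terminates the process.
       This kills the job rather than returning control. */
    alarm(60);
    MPI_Init(&argc, &argv);
    alarm(0);               /* initialization finished; cancel the watchdog */

    /* ... application ... */

    MPI_Finalize();
    return 0;
}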


Richard Treumann
2009-12-07 13:21:46 UTC
Permalink
The need for a "better" timeout depends on what else there is for the CPU
to do.

If you get creative and shift from {99% MPI_WAIT, 1% OS_idle_process} to
{1% MPI_Wait, 99% OS_idle_process} at a cost of only a few extra
microseconds of added lag on MPI_Wait, you may be pleased by the CPU load
statistic but will still only have hurt yourself. Perhaps you have not hurt
yourself much, but for what? The CPU does not get tired of spinning in
MPI_Wait rather than in the OS_idle_process.

Most MPI applications run with an essentially dedicated CPU per process. In
most MPI applications if even one task is sharing its CPU with other
processes, like users doing compiles, the whole job slows down too much.

There are exceptions. For example, in a work farm, where a master doles
out a chunk of work, takes back the result as a worker produces one, and
then doles out another chunk, you can get valuable work from CPUs that have
other useful work to do as well. In that situation it can be a big win to
accept lag time in the MPI_Wait in return for making the CPU available to
another process. The symptom that you need a "better" MPI_Wait will then
probably look more like {50% MPI_WAIT, 50% other process}.

Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363
Douglas Guptill
2009-12-07 15:13:48 UTC
Permalink
Post by Richard Treumann
Most MPI applications run with an essentially dedicated CPU per process.
Not true in our case. The computer in question (Intel Core i7, one
CPU, four cores) has several other uses.

It is a general-purpose desktop/server for myself and potentially other
users. I edit and compile the MPI application on it. I read and
write email from it. My Subversion repositories and server will soon
be on it. My Trac server (and Apache2) will soon be on it.

Now that MPI does not do busy waits, it can do all that and run 4
copies of our MPI application.
Post by Richard Treumann
In most MPI applications if even one task is sharing its CPU with
other processes, like users doing compiles, the whole job slows down
too much.
I have not found that to be the case.

Regards,
Douglas.
--
Douglas Guptill voice: 902-461-9749
Research Assistant, LSC 4640 email: ***@dal.ca
Oceanography Department fax: 902-494-3877
Dalhousie University
Halifax, NS, B3H 4J1, Canada
George Bosilca
2009-12-07 16:57:33 UTC
Permalink
There are many papers published on this subject. A Google Scholar search for "system noise" will give you a starting point.

george.
Post by Douglas Guptill
Post by Richard Treumann
In most MPI applications if even one task is sharing its CPU with
other processes, like users doing compiles, the whole job slows down
too much.
I have not found that to be the case.
Number Cruncher
2009-12-08 10:14:09 UTC
Permalink
Whilst MPI has traditionally been run on dedicated hardware, the rise of
cheap multicore CPUs makes it very attractive for ISVs such as ourselves
(http://www.cambridgeflowsolutions.com/) to build a *single* executable
that can be run in batch mode on a dedicated cluster *or* interactively
on a user's workstation.

Once you've taken the pain of writing a distributed-memory app (rather
than shared-memory/multithreaded), MPI provides a transparent API to
cover both use cases above. *However*, at the moment, the lack of
select()-like behaviour (instead of polling) means we have to write
custom code to avoid hogging a workstation. A runtime-selectable
mechanism would be perfect!

Is there any formal mechanism for gauging whether there is a wider
appetite for such functionality amongst Open MPI users?
Ashley Pittman
2009-12-10 13:37:23 UTC
Permalink
Speaking as an independent observer here (i.e. not an OMPI developer), I
don't think you'll find anyone who wouldn't view what you are asking for
as a good thing; it's something that has been, and continues to be,
discussed often. I for one would love to see it: whilst, as Richard says,
it can increase latency, it can also reduce noise and so help performance
on larger systems.

As you say, you are one of a new breed of MPI users, and this feature
would most likely benefit you more than the traditional
dedicated-machine users of MPI; I expect it to become more of an issue
as MPI is adopted by a wider audience. As Open MPI is an open-source
project, the question is not what appetite there is amongst users but
whether there is any one user who is motivated enough, able to do the
work, and not busy doing other things. I've implemented this before;
it's not an easy feature to add by any means, and it tends to be very
intrusive into the code base, which itself causes problems.

There was another thread on this mailing list this week where Ralph
recommended setting the yield_when_idle MCA param ("--mca
yield_when_idle 1"), which causes processes to call sched_yield() when
polling. The end result is that they will still consume 100% of
otherwise idle CPU time, but when other programs want to use the CPU
the MPI processes will not hog it; rather, they let the other processes
use as much CPU time as they want and just spin when the CPU would
otherwise be idle. This is something I use daily, and it greatly
increases the responsiveness of systems that mix idle MPI with other
applications.
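
For reference, that parameter can be passed on the mpirun command line; the process count and executable name below are placeholders:

    mpirun --mca yield_when_idle 1 -np 4 ./my_mpi_app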

Ashley,
--
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
Tim Prince
2009-12-04 20:12:55 UTC
Permalink
Post by Jeff Squyres
I wonder if there is a BKM (efficient and portable) to mimic a timeout with a call to MPI_Wait, i.e. to interrupt it once a given time period has passed if it hasn’t returned by then yet.
Pardon my ignorance, but what does BKM stand for?
What, you didn't rub shoulders with enough Intel people lately?