Discussion:
[OMPI users] Issue handling SIGUSR1 in OpenMPI
Marc Cooper
2017-07-25 17:15:40 UTC
Hi all,

I'm working to understand signal handling in Open MPI. I read that "Open MPI
will forward SIGUSR1 and SIGUSR2 from mpiexec to the other processes". My
question is whether this feature is enabled in a default installation.

The scenario is that one MPI process raises SIGUSR1, which has to be
detected by 'orted' and then forwarded to the other processes.

In my test code, I define a custom handler for SIGUSR1 and register it.
I send the signal using kill() or raise(). I assume that the ORTE daemon
will receive this signal and forward it to the remaining processes.

// test.c
#include <mpi.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

void handle_signal(int signum){
    // note: printf() is not async-signal-safe; used here only for testing
    if(signum == SIGUSR1)
        printf("received SIGUSR1 signal \n");
}

int main(){
    MPI_Init(NULL, NULL);

    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    signal(SIGUSR1, handle_signal);

    if(my_rank == 1) // process with rank 1 raises SIGUSR1
        kill(getpid(), SIGUSR1);

    MPI_Finalize();
    return 0;
}

If I run this as
mpirun -np 3 ./test

I would expect the statement to be printed twice, once by each of the other
two processes. But when I run this code, it is printed only once, and that
by the ORTE HNP rather than by the application processes. Do I need to call
some other API so that orted explicitly passes this signal on and the
application processes receive SIGUSR1?

-
Marc
r***@open-mpi.org
2017-07-25 18:17:08 UTC
I’m afraid we don’t currently support that use-case. We forward signals sent by the user to mpiexec (i.e., the user “hits” mpiexec with a signal), but we don’t do anything to support an application proc attempting to raise a signal and asking it to be propagated.

If you are using OMPI master, or the soon-to-be-released v3.0, then you might be able to do what you seek using the PMIx event notification system.
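
The first point can be checked without MPI. Below is a minimal sketch in plain C, not Open MPI code: raise() (like kill(getpid(), ...)) delivers the signal only to the calling process, so the local handler runs once and nothing reaches a launcher or sibling processes. The helper name self_signal_count is hypothetical and exists only for this illustration.

```c
#include <signal.h>

/* Counter updated from the handler; sig_atomic_t keeps the access safe. */
static volatile sig_atomic_t usr1_count = 0;

/* Handler that only counts deliveries (no printf: not async-signal-safe). */
static void count_usr1(int signum)
{
    (void)signum;
    usr1_count++;
}

/*
 * Hypothetical helper: the process sends SIGUSR1 to itself once and
 * returns how many times its own handler ran, or -1 on error.
 */
int self_signal_count(void)
{
    void (*old_handler)(int);

    usr1_count = 0;
    old_handler = signal(SIGUSR1, count_usr1);
    if (old_handler == SIG_ERR)
        return -1;

    /* raise() does not return until the handler has run in this process. */
    raise(SIGUSR1);

    signal(SIGUSR1, old_handler);   /* restore the previous disposition */
    return (int)usr1_count;
}
```

Assuming SIGUSR1 is not blocked by the caller, self_signal_count() returns 1: only the sending process sees the signal, which matches the single line of output reported above.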
_______________________________________________
users mailing list
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
Marc Cooper
2017-07-25 18:40:54 UTC
Even when the signal is sent by the user, only one process handles it.
I've modified my earlier example so that each process prints its pid; I
then take a pid and send the signal with 'kill -SIGUSR1 <pid>' from
another terminal.

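// test.c
#include <mpi.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

void handle_signal(int signum){
    // note: printf() is not async-signal-safe; used here only for testing
    if(signum == SIGUSR1)
        printf("received SIGUSR1 signal \n");
}

int main(){
    MPI_Init(NULL, NULL);

    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    signal(SIGUSR1, handle_signal);

    printf("My pid is: %d\n", getpid());

    for (;;) {
        printf("\nSleeping for 10 seconds\n");
        sleep(10);
    }

    MPI_Finalize(); // not reached: the loop above never exits
}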

When I run with 3 processes using mpirun -np 3 ./test, I expect to see the
statement 'received SIGUSR1 signal' twice, but it is printed just once. What
am I missing here?
r***@open-mpi.org
2017-07-25 18:45:18 UTC
Again, you are sending the signal to just the one process whose pid you specified. We don’t pick that signal up and propagate it. If you signal the pid of mpiexec itself, then you’d see every proc report it.
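
As a rough analogy for the difference between signalling one pid and signalling the launcher, here is a toy sketch in POSIX C. It is not how mpiexec or orted are implemented; it only shows a parent that either signals a single child or forwards SIGUSR1 to every child it started. The names count_workers_signaled and MAX_WORKERS are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations in strict modes */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 16

static volatile sig_atomic_t got_usr1 = 0;

/* Async-signal-safe handler: only records that SIGUSR1 arrived. */
static void on_usr1(int signum)
{
    (void)signum;
    got_usr1 = 1;
}

/*
 * Hypothetical helper. Forks `nworkers` children that wait for SIGUSR1.
 * If target < 0 the parent sends SIGUSR1 to every child (launcher-style
 * forwarding); otherwise only child `target` gets it. Children that were
 * not signalled are killed so the call can return. Returns the number of
 * children that received SIGUSR1, or -1 on bad arguments.
 */
int count_workers_signaled(int nworkers, int target)
{
    pid_t pids[MAX_WORKERS];
    sigset_t block, oldmask, waitmask;
    struct sigaction sa, oldsa;
    int i, status, received = 0;

    if (nworkers < 1 || nworkers > MAX_WORKERS)
        return -1;

    /* Install the handler and block SIGUSR1 before forking, so a signal
       sent early stays pending until the child is ready to wait for it. */
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGUSR1, &sa, &oldsa);
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &oldmask);
    waitmask = oldmask;
    sigdelset(&waitmask, SIGUSR1);

    for (i = 0; i < nworkers; i++) {
        pids[i] = fork();
        if (pids[i] == 0) {
            /* Child: sleep until SIGUSR1 is delivered, then exit cleanly. */
            while (!got_usr1)
                sigsuspend(&waitmask);
            _exit(0);
        }
        if (pids[i] < 0) {      /* fork failed: use the children so far */
            nworkers = i;
            break;
        }
    }

    /* Parent: deliver SIGUSR1 to the chosen children, stop the rest. */
    for (i = 0; i < nworkers; i++)
        kill(pids[i], (target < 0 || i == target) ? SIGUSR1 : SIGKILL);

    /* A child that saw SIGUSR1 exits normally with status 0. */
    for (i = 0; i < nworkers; i++)
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            received++;

    sigprocmask(SIG_SETMASK, &oldmask, NULL);
    sigaction(SIGUSR1, &oldsa, NULL);
    return received;
}
```

With three workers, count_workers_signaled(3, 1) returns 1 (only the targeted pid handles the signal), while count_workers_signaled(3, -1) returns 3 (the parent forwards it to every child, loosely mirroring what happens when mpiexec itself is signalled).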
Marc Cooper
2017-07-25 19:06:13 UTC
Got it. I now see each proc reporting the signal. Thank you.