Discussion:
[OMPI users] epoll add error with OpenMPI 2.0.1 and SGE
Dave Turner
2016-12-14 03:57:40 UTC
[warn] Epoll ADD(4) on fd 1 failed. Old events were 0; read change was 0
(none); write change was 1 (add): Operation not permitted

Gentoo with compiled OpenMPI 2.0.1 and SGE
ompi_info --all file attached

We recently did a maintenance upgrade to our cluster including
moving to OpenMPI 2.0.1. Fortran programs now give the
epoll add error above at the start of a run and the stdout file
freezes until the end of the run when all info is dumped.

I've read about this problem and it seems to be a file lock
issue where OpenMPI and SGE are both trying to lock the
same output file. We have not seen this problem with
previous versions of OpenMPI.

We've tried compiling OpenMPI with and without
specifying --with-libevent=/usr, and I've tried compiling
with --disable-event-epoll and using -mca opal_event_include poll.
Both of these were suggestions from a few years back but
neither affects the problem. I've also tried redirecting the output
manually as:

mpirun -np 4 ./app > file.out

This just locks file.out instead with all the output again being
dumped at the end of the run.

We also do not have this issue with 1.10.4 installed.

Any suggestions? Has anyone else run into this problem?

Dave Turner
--
Work: ***@ksu.edu (785) 532-7791
2219 Engineering Hall, Manhattan KS 66506
Home: ***@gmail.com
cell: (785) 770-5929
Dave Turner
2016-12-18 01:52:32 UTC
I've solved this problem by omitting --with-libevent=/usr from
the configuration to force it to use the internal version. I thought
I had tried this before posting but evidently did something wrong.

Dave
Gilles Gouaillardet
2016-12-18 03:08:50 UTC
Dave,

thanks for the info

for what it's worth, it is generally a bad idea to configure with
--with-xxx=/usr, since you might inadvertently use some other external
components.

in your case, --with-libevent=external is what you need if you want to
use an external libevent library installed in /usr

i guess the same comment would apply with /usr/local too

btw, which distro are you using? is your distro's libevent up to date?
we might want to add a FAQ entry about a libevent version known to be broken


Cheers,

Gilles
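
For readers landing on this thread, the two build configurations discussed above can be sketched as follows. This is only an illustration: the helper names libevent_flag and configure_line and the PREFIX value are assumptions made for the example, and the script prints the configure lines rather than running them. The only Open MPI option it relies on is --with-libevent=external, as named by Gilles; omitting the option selects the bundled libevent, which is what resolved the problem for Dave.

```shell
#!/bin/sh
# Sketch only: print (not run) an Open MPI configure line for each libevent choice.
# libevent_flag, configure_line and PREFIX are hypothetical names for this example.

libevent_flag() {
    case "$1" in
        internal) printf '%s' "" ;;   # no option: the bundled libevent is used (Dave's fix)
        external) printf '%s' "--with-libevent=external" ;;   # Gilles's suggestion
        *) echo "unknown libevent mode: $1" >&2; return 1 ;;
    esac
}

configure_line() {
    flag=$(libevent_flag "$1") || return 1
    # ${flag:+ $flag} appends the option only when it is non-empty
    printf './configure --prefix=%s%s\n' "$PREFIX" "${flag:+ $flag}"
}

PREFIX="${PREFIX:-/opt/openmpi-2.0.1}"   # assumed install prefix

configure_line internal
configure_line external
```

After rebuilding, running ompi_info and looking at the event component it lists should indicate whether the internal or the external libevent was picked up.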