[OMPI users] MPI_File_write_shared() and MPI_MODE_APPEND issue ?
Nicolas Joly
2017-01-18 15:36:51 UTC
Hi,

We have a tool where all workers will use MPI_File_write_shared() on a
file that was opened with MPI_MODE_APPEND, mostly because rank 0 will
have written some format specific header data.

We recently upgraded our Open MPI version from v1.10.4 to v2.0.1, and
at that time we noticed a behaviour change: ompio does not produce the
same result as romio with the attached code.

***@tars-submit0 [tmp/mpiio]> mpirun --version
mpirun (Open MPI) 2.0.1
[...]
***@tars-submit0 [tmp/mpiio]> mpirun -n 1 --mca io romio314 ./openappend
***@tars-submit0 [tmp/mpiio]> echo $?
0
***@tars-submit0 [tmp/mpiio]> cat openappend.test
Header line
Data line
***@tars-submit0 [tmp/mpiio]> mpirun -n 1 --mca io ompio ./openappend
***@tars-submit0 [tmp/mpiio]> echo $?
0
***@tars-submit0 [tmp/mpiio]> cat openappend.test
Data line
e

With ompio, it seems that, for some reason, the shared file pointer
was reset/initialised(?) to zero ... leading to an unexpected write
position for the "Data line" buffer.

Thanks in advance.
Regards.
--
Nicolas Joly

Cluster & Computing Group
Biology IT Center
Institut Pasteur, Paris.
Edgar Gabriel
2017-01-18 16:04:59 UTC
I will look into this. I have a suspicion about what might be wrong.
Give me a day or three.

Thanks

Edgar
Edgar Gabriel
2017-01-23 15:05:44 UTC
Just wanted to give a brief update on this. The problem was in fact that
we did not correctly move the shared file pointer to the end of the file
when a file is opened in append mode. (The individual file pointer did
the right thing, however.) The patch itself is not overly complicated. I
filed a PR against master, and will create PRs for the 2.0 and 2.1
releases later as well. I am not sure, however, whether it will make it
in time for the 2.0.2 release; it might be too late for that.

Thanks for the bug report!

Edgar
Nicolas Joly
2017-01-24 13:52:36 UTC
Post by Edgar Gabriel
Just wanted to give a brief update on this. The problem was in fact that
we did not correctly move the shared file pointer to the end of the file
when a file is opened in append mode. (The individual file pointer did
the right thing, however.)
Thanks for the explanation.
Post by Edgar Gabriel
The patch itself is not overly complicated. I filed a PR against
master, and will create PRs for the 2.0 and 2.1 releases later as
well. I am not sure, however, whether it will make it in time for the
2.0.2 release; it might be too late for that.
No hurry. We are in the process of validating our codes with the new
ompio backend ... And we still have romio as a fallback.

Thanks again.
--
Nicolas Joly

Cluster & Computing Group
Biology IT Center
Institut Pasteur, Paris.