I said I'd report back about trying ompio on Lustre mounted without flock.
I couldn't immediately figure out how to run MTT. I tried the parallel
HDF5 tests from hdf5 1.10.3, but I got errors with those even with the
relevant environment variable set to put the files on (local) /tmp.
Then it occurred to me, rather late, that romio has its own tests. I used
the "runtests" script from the romio/test directory of OMPI 3.1.2,
modified to pass "--mca io ompio", after building the tests against an
installed ompi-3.1.2. On no-flock-mounted Lustre it produced the
following and apparently hung at the end:
**** Testing simple.c ****
No Errors
**** Testing async.c ****
No Errors
**** Testing async-multiple.c ****
No Errors
**** Testing atomicity.c ****
Process 3: readbuf[118] is 0, should be 10
Process 2: readbuf[65] is 0, should be 10
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
Process 1: readbuf[145] is 0, should be 10
**** Testing coll_test.c ****
No Errors
**** Testing excl.c ****
error opening file test
error opening file test
error opening file test
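For reference, the modification to runtests amounts to forcing Open MPI's ompio component on the mpirun command line. A sketch of the kind of invocation involved (the process count, test binary, and -fname path here are illustrative, not taken from the actual script):

```shell
# Force the ompio MPI-IO component instead of Open MPI's default selection.
# Test binary, -np count, and the -fname target path are illustrative only.
mpirun -np 4 --mca io ompio ./atomicity -fname /lustre/noflock/testfile
```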
Then I ran on local /tmp as a sanity check and still got errors:
**** Testing I/O functions ****
**** Testing simple.c ****
No Errors
**** Testing async.c ****
No Errors
**** Testing async-multiple.c ****
No Errors
**** Testing atomicity.c ****
Process 2: readbuf[155] is 0, should be 10
Process 1: readbuf[128] is 0, should be 10
Process 3: readbuf[128] is 0, should be 10
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
**** Testing coll_test.c ****
No Errors
**** Testing excl.c ****
No Errors
**** Testing file_info.c ****
No Errors
**** Testing i_noncontig.c ****
No Errors
**** Testing noncontig.c ****
No Errors
**** Testing noncontig_coll.c ****
No Errors
**** Testing noncontig_coll2.c ****
No Errors
**** Testing aggregation1 ****
No Errors
**** Testing aggregation2 ****
No Errors
**** Testing hindexed ****
No Errors
**** Testing misc.c ****
file pointer posn = 265, should be 10
byte offset = 3020, should be 1080
file pointer posn = 265, should be 10
byte offset = 3020, should be 1080
file pointer posn = 265, should be 10
byte offset = 3020, should be 1080
file pointer posn in bytes = 3280, should be 1000
file pointer posn = 265, should be 10
byte offset = 3020, should be 1080
file pointer posn in bytes = 3280, should be 1000
file pointer posn in bytes = 3280, should be 1000
file pointer posn in bytes = 3280, should be 1000
Found 12 errors
**** Testing shared_fp.c ****
No Errors
**** Testing ordered_fp.c ****
No Errors
**** Testing split_coll.c ****
No Errors
**** Testing psimple.c ****
No Errors
**** Testing error.c ****
File set view did not return an error
Found 1 errors
**** Testing status.c ****
No Errors
**** Testing types_with_zeros ****
No Errors
**** Testing darray_read ****
No Errors
I even got an error with romio on /tmp (modifying the script to use
mpirun --mca io romio314):
**** Testing error.c ****
Unexpected error message MPI_ERR_ARG: invalid argument of some other kind
Found 1 errors