Discussion:
[OMPI users] State of the DVM in Open MPI
Reuti
2017-02-28 10:17:01 UTC
Hi,

Only by reading recent posts did I become aware of the DVM. This would be a welcome feature for our setup*. But not all options seem to work as expected - is it still a work in progress, or should everything work as advertised?

1)

$ ***@server:~> orte-submit -cf foo --hnp file:/home/reuti/dvmuri -n 1 touch /home/reuti/hacked
----------------------------------------------------------------------------
Open MPI has detected that a parameter given to a command line
option does not match the expected format:

Option: np
Param: foo

==> The given option is -cf, not -np

2)

According to `man orte-dvm` there are -H, -host, --host, -machinefile and -hostfile options, but none of them seems operational (Open MPI 2.0.2). A host list provided by SGE is honored, though.
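For completeness, the workflow I am trying looks roughly like this (a sketch only; that `orte-dvm` accepts `--report-uri` the way `mpirun` does is an assumption on my side, and the application name is a placeholder):

```shell
# Start the persistent DVM and write its contact URI to a file
# (assuming orte-dvm supports --report-uri like mpirun).
orte-dvm --report-uri /home/reuti/dvmuri &

# Submit work against the running DVM via its URI file;
# ./my_app is a placeholder application.
orte-submit --hnp file:/home/reuti/dvmuri -n 4 ./my_app
```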

-- Reuti


*) We run Open MPI jobs inside SGE, and this works fine. Some applications invoke several `mpiexec` calls during their execution and rely on temporary files they created in the earlier step(s). While this works fine on one and the same machine, it fails when SGE grants slots on several machines, as the scratch directories created by `qrsh -inherit …` vanish once the `mpiexec` call on that particular node finishes (and not at the end of the complete job). I can mimic persistent scratch directories in SGE for a complete job, but starting the DVM beforehand and shutting it down later (either by hand in the job script or by SGE killing all remains at the end of the job) might be more straightforward (it looks like `orte-dvm` is started via `qrsh -inherit …` too).
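The job script I have in mind would look roughly like this (a sketch only; `--report-uri`, the sleep-based wait, and shutdown by signal are assumptions on my side, and the step binaries are placeholders - SGE's `$TMPDIR` and `$NSLOTS` are the usual per-job variables):

```shell
#!/bin/sh
# SGE job script sketch: keep one DVM alive across all submission steps
# so that per-node scratch directories persist for the whole job.

URIFILE=$TMPDIR/dvmuri

# Start the DVM once across the granted nodes (assuming it honors the
# SGE allocation, as mpirun does) and record its contact URI.
orte-dvm --report-uri "$URIFILE" &
DVMPID=$!
sleep 5   # crude wait for the URI file to appear

# Each step reuses the same daemons, so files left in the scratch
# directories survive between steps. Step binaries are placeholders.
orte-submit --hnp file:"$URIFILE" -n "$NSLOTS" ./prepare_step
orte-submit --hnp file:"$URIFILE" -n "$NSLOTS" ./compute_step

# Shut the DVM down at the end of the job (or let SGE kill the
# remains when the job terminates).
kill "$DVMPID"
```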
r***@open-mpi.org
2017-02-28 10:25:02 UTC
Hi Reuti

The DVM in master seems to be fairly complete, and several organizations are in the process of automating tests for it so that it gets more regular exercise.

If you are using a version in OMPI 2.x, those are early prototypes - we haven’t updated the code in the release branches. The more production-ready version will be in 3.0, and we’ll start supporting it there.

In the meantime, we do appreciate any suggestions and bug reports as we polish it up.
_______________________________________________
users mailing list
https://rfd.newmexicoconsortium.org/mailman/listinfo/users