Reuti
2017-02-28 10:17:01 UTC
Hi,
I only became aware of the DVM by reading recent posts. It would be a welcome feature for our setup*. However, not all options work as I expected. Is it still a work in progress, or should everything work as advertised?
1)
$ ***@server:~> orte-submit -cf foo --hnp file:/home/reuti/dvmuri -n 1 touch /home/reuti/hacked
----------------------------------------------------------------------------
Open MPI has detected that a parameter given to a command line
option does not match the expected format:
Option: np
Param: foo
==> The given option is -cf, not -np
2)
According to `man orte-dvm` there are -H, -host, --host, -machinefile and -hostfile, but none of them seems to be operational (Open MPI 2.0.2). A host list provided by SGE is honored, though.
-- Reuti
*) We run Open MPI jobs inside SGE, which works fine. Some applications invoke `mpiexec` several times during their execution and rely on temporary files created in the previous step(s). This works on a single machine, but it fails when SGE grants slots on several machines, because the scratch directories created by `qrsh -inherit …` vanish once the `mpiexec` call on that particular node finishes (and not at the end of the complete job). I can mimic persistent scratch directories in SGE for a complete job, but starting the DVM beforehand and shutting it down afterwards (either by hand in the job script or by SGE killing all remains at the end of the job) might be more straightforward (it looks like `orte-dvm` is started by `qrsh -inherit …` too).
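
To make the intended workflow concrete, here is a minimal sketch of what the job script body could look like. It is only an illustration under assumptions: the helper names (`run`, `dvm_start`, `dvm_submit`, `dvm_stop`) and the variables `DRY_RUN` and `DVM_URI` are hypothetical, the `--hnp file:` form is taken from the command in 1), and `--report-uri` and `--terminate` are how I understand the 2.x tools, so they should be checked against the installed man pages. By default the sketch only prints the commands.

```shell
#!/bin/sh
# Sketch: keep one DVM alive across several steps of a single SGE job.
# Assumptions (not confirmed in this thread): orte-dvm accepts --report-uri,
# orte-submit accepts --terminate, and the host list comes from SGE.

DRY_RUN="${DRY_RUN:-1}"                       # 1 = print commands only, 0 = execute
DVM_URI="${DVM_URI:-${TMPDIR:-/tmp}/dvmuri}"  # hypothetical location of the URI file

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start the DVM in the background and wait for its URI file to appear.
dvm_start() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "orte-dvm --report-uri $DVM_URI &"
    else
        orte-dvm --report-uri "$DVM_URI" &
        while [ ! -s "$DVM_URI" ]; do sleep 1; done
    fi
}

# Submit one step to the running DVM; arguments are passed through unchanged.
dvm_submit() {
    run orte-submit --hnp "file:$DVM_URI" "$@"
}

# Ask the DVM to shut down at the end of the job.
dvm_stop() {
    run orte-submit --hnp "file:$DVM_URI" --terminate
}
```

A job script would then call `dvm_start`, followed by one `dvm_submit -n <slots> <program>` per step, and `dvm_stop` at the end; with `DRY_RUN=0` the commands are executed instead of printed.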