Discussion:
[OMPI users] openmpi with htcondor
Mahmood Naderan
2018-01-25 15:37:11 UTC
Hi,
Has anyone here used the HTCondor scheduler with MPI jobs? I followed the
example script, openmpiscript, in the Condor folder, like this:

[***@rocks7 ~]$ cat mpi.ht
universe = parallel
executable = openmpiscript
arguments = mpihello
log = hellompi.log
output = hellompi.out
error = hellompi.err
machine_count = 2

However, it fails with this error:

[***@rocks7 ~]$ cat hellompi.out
WARNING: MOUNT_UNDER_SCRATCH not set in condor_config
WARNING: MOUNT_UNDER_SCRATCH not set in condor_config
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status,
thus causing the job to be terminated. The first process to do so was:

Process name: [[62274,1],0]
Exit code: 1
--------------------------------------------------------------------------
[***@rocks7 ~]$ cat hellompi.err
Not defined: MOUNT_UNDER_SCRATCH
Not defined: MOUNT_UNDER_SCRATCH
[compute-0-1.local:17511] [[62274,1],0] usock_peer_recv_connect_ack:
received unexpected process identifier [[62274,0],2] from [[62274,0],1]
[compute-0-1.local:17512] [[62274,1],1] usock_peer_recv_connect_ack:
received unexpected process identifier [[62274,0],2] from [[62274,0],1]
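(For context, the MOUNT_UNDER_SCRATCH warnings refer to an HTCondor configuration knob that openmpiscript checks. A typical setting in condor_config is sketched below; this is only an assumption about a reasonable value, and silencing the warnings may be unrelated to the usock_peer_recv_connect_ack failure itself.)

# condor_config (or a file under config.d) -- hedged example setting:
# comma-separated list of directories to mount under the job's scratch dir
MOUNT_UNDER_SCRATCH = /tmp,/var/tmp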

Any idea?

Regards,
Mahmood
