Mahmood Naderan
2016-09-26 10:22:34 UTC
Hi,
When I run an MPI command from the terminal, the program runs fine on
the compute node specified in hosts.txt.
However, when I put the same command in a PBS script, it says that the compute
node is not defined in the job manager's list, even though that node is
actually defined in the job manager.
Please see the output below:
***@cluster:tran-bt-o-40$ cat submit.tor
#!/bin/bash
#PBS -V
#PBS -q default
#PBS -j oe
#PBS -l nodes=1:ppn=15
#PBS -N job-1
#PBS -o /home/mahmood/tran-bt-o-40/cc-bt-cc-163-20.out
cd $PBS_O_WORKDIR
/share/apps/computer/openmpi-2.0.0/bin/mpirun -hostfile hosts.txt -np 15 \
    /share/apps/chemistry/siesta-4.0/tpar/transiesta < trans-cc-bt-cc-163-20.fdf
***@cluster:tran-bt-o-40$ cat cc-bt-cc-163-20.out
--------------------------------------------------------------------------
A hostfile was provided that contains at least one node not
present in the allocation:
hostfile: hosts.txt
node: compute-0-1
If you are operating in a resource-managed environment, then only
nodes that are in the allocation can be used in the hostfile. You
may find relative node syntax to be a useful alternative to
specifying absolute node names see the orte_hosts man page for
further information.
--------------------------------------------------------------------------
***@cluster:tran-bt-o-40$ cat hosts.txt
compute-0-1
compute-0-2
***@cluster:tran-bt-o-40$ pbsnodes -l all
compute-0-0 down
compute-0-1 free
compute-0-2 free
compute-0-3 free
As you can see, compute-0-1 is reported as free and is defined in the
job manager.
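Note that pbsnodes lists the nodes known to the manager, whereas the error message refers to the nodes allocated to this particular job. A check like the following might show the difference. It is only a sketch: it assumes the job manager exports PBS_NODEFILE inside the job (Torque normally does), and hosts_not_allocated is a hypothetical helper name, not part of Torque or Open MPI.

```shell
#!/bin/bash
# Print every host named in HOSTFILE that does not appear in NODEFILE.
# Usage: hosts_not_allocated HOSTFILE NODEFILE
hosts_not_allocated() {
    # Take the first field of each non-empty hostfile line (drops "slots=N"),
    # de-duplicate, then keep only the names with no exact match in NODEFILE.
    # grep flags: -v invert, -x whole line, -F fixed strings, -f patterns from file.
    awk 'NF {print $1}' "$1" | sort -u | grep -vxFf "$2"
}

# Inside a PBS job, PBS_NODEFILE is assumed to point at the list of allocated nodes.
if [ -n "${PBS_NODEFILE:-}" ]; then
    echo "Hosts in hosts.txt that are not in this job's allocation:"
    hosts_not_allocated hosts.txt "$PBS_NODEFILE"
fi
```

Placed in submit.tor before the mpirun line, this would print the entries of hosts.txt that the job was not given. The comparison is by exact name, so it would also flag a host that appears under a fully qualified name in one file and a short name in the other.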
Any idea?
Regards,
Mahmood