Discussion:
[OMPI users] OpenMPI with-tm is not obeying torque
Anthony Thyssen
2017-09-29 06:01:31 UTC
Permalink
I recompiled the RHEL OpenMPI package to include the configuration option
--with-tm
and it compiled and is working fine.
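For reference, the equivalent source build is roughly the following (a minimal
sketch only; the install prefix is an assumption, and the actual change was
made to the configure options in the RHEL spec file):

    # point --with-tm at the Torque install prefix if it is not in a default location
    ./configure --prefix=/opt/openmpi --with-tm
    make -j4 && make install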

*# mpirun -V*
mpirun (Open MPI) 1.10.6

*# ompi_info | grep ras*
MCA ras: gridengine (MCA v2.0.0, API v2.0.0, Component v1.10.6)
MCA ras: loadleveler (MCA v2.0.0, API v2.0.0, Component v1.10.6)
MCA ras: simulator (MCA v2.0.0, API v2.0.0, Component v1.10.6)
MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.6)
MCA ras: tm (MCA v2.0.0, API v2.0.0, Component v1.10.6)

As you can see "tm" is present, and as you will see below, it is working
(to an extent)

But OpenMPI while it receives the torque supplied
node allocation does not obey that allocation.

For example...

"pbs_hello" batch scrpt
=======8<--------CUT HERE----------
#!/bin/bash
#
# "pbs_hello" batch script to run "mpi_hello" on PBS nodes
#
#PBS -m n
#
# --------------------------------------
echo "PBS Job Number " $(echo $PBS_JOBID | sed 's/\..*//')
echo "PBS batch run on " $(hostname)
echo "Time it was started " $(date +%F_%T)
echo "Current Directory " $(pwd)
echo "Submitted work dir " $PBS_O_WORKDIR
echo "Number of Nodes " $PBS_NP
echo "Nodefile List " $PBS_NODEFILE
cat $PBS_NODEFILE
#env | grep ^PBS_
echo ---------------------------------------
cd "$PBS_O_WORKDIR" # return to the correct sub-directory

# Run MPI displaying the node allocation maps.
mpirun --mca ras_base_verbose 5 --display-map --display-allocation hostname


=======8<--------CUT HERE----------

Submitting to Torque to run one process on each of 5 dual-core machines:

* # qsub -l nodes=5:ppn=1:dualcore pbs_hello*

Results in the following...

Stderr shows the "tm" component is being selected...
=======8<--------CUT HERE----------
[node21.emperor:07150] mca:base:select:( ras) Querying component
[gridengine]
[node21.emperor:07150] mca:base:select:( ras) Skipping component
[gridengine]. Query failed to return a module
[node21.emperor:07150] mca:base:select:( ras) Querying component
[loadleveler]
[node21.emperor:07150] mca:base:select:( ras) Skipping component
[loadleveler]. Query failed to return a module
[node21.emperor:07150] mca:base:select:( ras) Querying component
[simulator]
[node21.emperor:07150] mca:base:select:( ras) Skipping component
[simulator]. Query failed to return a module
[node21.emperor:07150] mca:base:select:( ras) Querying component [slurm]
[node21.emperor:07150] mca:base:select:( ras) Skipping component [slurm].
Query failed to return a module
[node21.emperor:07150] mca:base:select:( ras) Querying component [tm]
[node21.emperor:07150] mca:base:select:( ras) Query of component [tm] set
priority to 100
[node21.emperor:07150] mca:base:select:( ras) Selected component [tm]
=======8<--------CUT HERE----------

While Stdout shows it is picking up the requested allocation.
=======8<--------CUT HERE----------
PBS Job Number 8988
PBS batch run on node21.emperor
Time it was started 2017-09-29_15:52:21
Current Directory /net/shrek.emperor/home/shrek/anthony
Submitted work dir /home/shrek/anthony/mpi-pbs
Number of Nodes 5
Nodefile List /var/lib/torque/aux//8988.shrek.emperor
node21.emperor
node25.emperor
node24.emperor
node23.emperor
node22.emperor
---------------------------------------

====================== ALLOCATED NODES ======================
node21: slots=1 max_slots=0 slots_inuse=0 state=UP
node25.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node24.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node23.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node22.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
=================================================================

====================== ALLOCATED NODES ======================
node21: slots=1 max_slots=0 slots_inuse=0 state=UP
node25.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node24.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node23.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node22.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
=================================================================
Data for JOB [18928,1] offset 0

======================== JOB MAP ========================

Data for node: node21 Num slots: 1 Max slots: 0 Num procs: 1
Process OMPI jobid: [18928,1] App: 0 Process rank: 0

Data for node: node25.emperor Num slots: 1 Max slots: 0 Num procs: 1
Process OMPI jobid: [18928,1] App: 0 Process rank: 1

Data for node: node24.emperor Num slots: 1 Max slots: 0 Num procs: 1
Process OMPI jobid: [18928,1] App: 0 Process rank: 2

Data for node: node23.emperor Num slots: 1 Max slots: 0 Num procs: 1
Process OMPI jobid: [18928,1] App: 0 Process rank: 3

Data for node: node22.emperor Num slots: 1 Max slots: 0 Num procs: 1
Process OMPI jobid: [18928,1] App: 0 Process rank: 4

=============================================================
node21.emperor
node21.emperor
node21.emperor
node21.emperor
node21.emperor
=======8<--------CUT HERE----------

However, according to the "hostname" output (and as was visible in "pbsnodes"),

*ALL 5 processes were run on the first node, vastly over-subscribing that
node.*

Anyone have any ideas as to what went wrong?

*Why did OpenMPI not follow the node mapping it says it should be
following?*

Additionally, OpenMPI on its own (without Torque) does appear to work as
expected.

*# mpirun -host node21,node22,node23,node24,node25 hostname*
node24.emperor
node22.emperor
node21.emperor
node25.emperor
node23.emperor


Anthony Thyssen ( System Programmer ) <***@griffith.edu.au>
--------------------------------------------------------------------------
Rosscott's Law: The faster the computer,
The faster it can go wrong.
--------------------------------------------------------------------------
Anthony Thyssen
2017-10-03 03:14:55 UTC
Permalink
Update... The problem of all processes running on the first node
(oversubscribing a dual-core machine) is NOT resolved.

Changing the mpirun command in the Torque batch script ("pbs_hello" - See
previous) to

mpirun --nooversubscribe --display-allocation hostname

Then submitting to PBS/Torque using

qsub -l nodes=5:ppn=1:dualcore pbs_hello

to run on 5 dual-core machines produces the following result...

=======8<--------CUT HERE----------
PBS Job Number 8996
PBS batch run on node21.emperor
Time it was started 2017-10-03_13:04:07
Current Directory /net/shrek.emperor/home/shrek/anthony
Submitted work dir /home/shrek/anthony/mpi-pbs
Number of Nodes 5
Nodefile List /var/lib/torque/aux//8996.shrek.emperor
node21.emperor
node25.emperor
node24.emperor
node23.emperor
node22.emperor
---------------------------------------

====================== ALLOCATED NODES ======================
node21: slots=1 max_slots=0 slots_inuse=0 state=UP
node25.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node24.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node23.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node22.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
=================================================================
node21.emperor
node21.emperor
node21.emperor
node21.emperor
node21.emperor
=======8<--------CUT HERE----------

The $PBS_NODEFILE shows Torque allocated 5 processes on 5 separate
machines.

The mpirun command's "ALLOCATED NODES" shows it picked up the request
correctly from torque.

But the "hostname" output still shows ALL processes were run on the first
node only!

Even though I requested it not to oversubscribe.


I am at a complete loss as to how to solve this problem.

ANY and all suggestions, or even ways I can get other information as to
what is causing this, will be most welcome.


Anthony Thyssen ( System Programmer ) <***@griffith.edu.au>
--------------------------------------------------------------------------
Using encryption on the Internet is the equivalent of arranging
an armored car to deliver credit-card information from someone
living in a cardboard box to someone living on a park bench.
-- Gene Spafford
--------------------------------------------------------------------------
r***@open-mpi.org
2017-10-03 04:00:34 UTC
Permalink
One thing I can see is that the local host (where mpirun executed) shows as “node21” in the allocation, while all others show their FQDN. This might be causing some confusion.

You might try adding "--mca orte_keep_fqdn_hostnames 1” to your cmd line and see if that helps.
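For example, with the test command from the previous message that would be
something like:

    mpirun --mca orte_keep_fqdn_hostnames 1 --nooversubscribe --display-allocation hostname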
Anthony Thyssen
2017-10-03 04:19:36 UTC
Permalink
I noticed that too. Though the submitting host for Torque is a different
host (the main head node, "shrek"), "node21" is the host on which Torque runs
the batch script (and the mpirun command), it being the first node in the
"dualcore" resource group.

Adding the suggested option ("--mca orte_keep_fqdn_hostnames 1")...

It fixed the hostname in the allocation map, though it had no effect on the
outcome. The allocation is still simply ignored.

=======8<--------CUT HERE----------
PBS Job Number 9000
PBS batch run on node21.emperor
Time it was started 2017-10-03_14:11:20
Current Directory /net/shrek.emperor/home/shrek/anthony
Submitted work dir /home/shrek/anthony/mpi-pbs
Number of Nodes 5
Nodefile List /var/lib/torque/aux//9000.shrek.emperor
node21.emperor
node25.emperor
node24.emperor
node23.emperor
node22.emperor
---------------------------------------

====================== ALLOCATED NODES ======================
node21.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node25.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node24.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node23.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
node22.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
=================================================================
node21.emperor
node21.emperor
node21.emperor
node21.emperor
node21.emperor
=======8<--------CUT HERE----------


Anthony Thyssen ( System Programmer ) <***@griffith.edu.au>
--------------------------------------------------------------------------
The equivalent of an armoured car should always be used to
protect any secret kept in a cardboard box.
-- Anthony Thyssen, On the use of Encryption
--------------------------------------------------------------------------
Gilles Gouaillardet
2017-10-03 04:39:28 UTC
Permalink
Anthony,


in your script, can you

    set -x
    env
    pbsdsh hostname
    mpirun --display-map --display-allocation --mca ess_base_verbose 10 \
        --mca plm_base_verbose 10 --mca ras_base_verbose 10 hostname

and then compress and send the output?
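In the batch script that would look roughly like the following (the "cd" line
is from the existing "pbs_hello" script; everything else is just the commands
above with comments):

    #!/bin/bash
    #PBS -m n
    cd "$PBS_O_WORKDIR"
    set -x            # trace each command as it runs
    env               # dump the environment, including the PBS_* variables
    pbsdsh hostname   # run hostname on each allocated execution slot via Torque's TM interface
    mpirun --display-map --display-allocation \
        --mca ess_base_verbose 10 --mca plm_base_verbose 10 \
        --mca ras_base_verbose 10 hostname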


Cheers,


Gilles
Anthony Thyssen
2017-10-03 05:02:19 UTC
Permalink
The stdout and stderr are saved to separate channels.

It is interesting that the output from pbsdsh is node21.emperor 5 times,
even though $PBS_NODEFILE lists the 5 individual nodes.
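For what it is worth, a quick way to compare the two from inside a job script
(a minimal sketch; the echo labels are just for readability):

    echo "== PBS_NODEFILE =="; sort "$PBS_NODEFILE" | uniq -c
    echo "== pbsdsh ==";       pbsdsh hostname | sort | uniq -c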

Attached are the two compressed files, as well as the pbs_hello batch used.

Anthony Thyssen ( System Programmer ) <***@griffith.edu.au>
--------------------------------------------------------------------------
There are two types of encryption:
One that will prevent your sister from reading your diary, and
One that will prevent your government. -- Bruce Schneier
--------------------------------------------------------------------------
Gilles Gouaillardet
2017-10-03 05:39:27 UTC
Permalink
Anthony,


we had a similar issue reported some time ago (e.g. "Open MPI ignores
torque allocation"),

and after quite some troubleshooting, we ended up with the same behavior
(e.g. pbsdsh is not working as expected).

see https://www.mail-archive.com/***@lists.open-mpi.org/msg29952.html
for the last email.


From an Open MPI point of view, I would consider the root cause to be with
your Torque install.

This case was reported at
http://www.clusterresources.com/pipermail/torqueusers/2016-September/018858.html
and no conclusion was reached.


Cheers,


Gilles
Anthony Thyssen
2017-10-03 23:02:46 UTC
Permalink
Thank you Gilles. At least I now have something to follow through with.

As an FYI, the Torque is the pre-built version from the Redhat Extras (EPEL)
archive.
torque-4.2.10-10.el7.x86_64

Normally pre-built packages have no problems, but not in this case.
Anthony Thyssen
2017-10-04 00:33:13 UTC
Permalink
FYI...

The problem is discussed further in

Redhat Bugzilla: Bug 1321154 - numa enabled torque don't work
https://bugzilla.redhat.com/show_bug.cgi?id=1321154

I'd seen this previously, as it required me to add "num_node_boards=1" to each node in
/var/lib/torque/server_priv/nodes to get Torque to at least work. Specifically, I had been
working around the problem by munging the $PBS_NODEFILE (which comes out correct) into a
host list containing the correct "slots=" counts. But of course, now that I have compiled
OpenMPI using "--with-tm", that should not have been needed, as it is in fact now ignored
by OpenMPI in a Torque-PBS environment.

However, it seems that ever since "NUMA" support went into the Torque RPMs it has been
causing the current problems, and it is still continuing. The last action is a new EPEL
"test" version (August 2017), which I will try shortly.

Thank you for your help, though I am still open to suggestions for a replacement.
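P.S. The munging mentioned above was along the following lines (a sketch only;
the hostfile name is illustrative and the real wrapper script is not shown
here):

    # turn the Torque nodefile into an Open MPI hostfile with "slots=" counts
    sort "$PBS_NODEFILE" | uniq -c | awk '{print $2" slots="$1}' > ompi_hosts
    mpirun --hostfile ompi_hosts ./mpi_hello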
r***@open-mpi.org
2017-10-04 01:43:56 UTC
Permalink
Can you try a newer version of OMPI, say the 3.0.0 release? Just curious to know if we perhaps “fixed” something relevant.
Anthony Thyssen
2017-10-06 06:22:04 UTC
Permalink
Sorry ***@open-mpi.org, as Gilles Gouaillardet pointed out to me, the
problem wasn't with OpenMPI but with the specific EPEL implementation (see
Redhat Bugzilla 1321154).

Today, the server was able to be taken down for maintenance, and I wanted
to try a few things.

After installing torque-4.2.10-11.el7 from the EPEL Testing Repo,

I found that all the nodes were 'down', even though everything
appeared to be running, with no errors in the error logs.

After a lot of trial, error and research, I eventually (on a whim) decided
to remove the "num_node_boards=1" entry from the "torque/server_priv/nodes"
file and restart the server & scheduler. Suddenly the nodes were "free" and
my initial test job ran.
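For the record, the restart and check were roughly the following (the systemd
unit names are assumed for the EPEL packaging):

    systemctl restart pbs_server pbs_sched   # restart the Torque server and scheduler
    pbsnodes -a                              # the nodes now report "state = free"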

Perhaps the EPEL-Test Torque 4.2.10-11 does not contain Numa?

ALL later tests (with OpenMPI - RHEL SRPM 1.10.6-2 re-compiled
"--with-tm") are now responding to the Torque node allocation correctly and
are no longer simply running all the jobs on the first node.

That is, $PBS_NODEFILE, pbsdsh hostname and mpirun hostname
are all in agreement.

Thank you all for your help, and for putting up with me.

Anthony Thyssen ( System Programmer ) <***@griffith.edu.au>
--------------------------------------------------------------------------
"Around here we've got a name for people what talks to dragons."
"Traitor?" Wiz asked apprehensively.
"No. Lunch." -- Rick Cook, "Wizadry Consulted"
--------------------------------------------------------------------------
r***@open-mpi.org
2017-10-06 09:45:43 UTC
Permalink
No problem - glad you were able to work it out!
Gus Correa
2017-10-06 14:58:31 UTC
Permalink
Hi Anthony, Ralph, Gilles, all

As far as I know, for core/processor assignment to user jobs to work,
Torque needs to be configured with cpuset support
(configure --enable-cpuset ...).
That is separate from what OpenMPI does in terms of process binding.
Otherwise, the user processes in the job
will be free to use any cores/processors on the nodes assigned to it.
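For example, a Torque build along these lines (a sketch only; other configure
options are site-specific):

    ./configure --enable-cpuset --prefix=/usr/local/torque
    make && make install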

Some additional work to set up Linux support for cpuset is also needed,
for Torque to use it at runtime (create a subdirectory in /dev/cpuset,
mount the cpuset file system there).
I do this in the pbs_mom daemon startup script,
but that can be done in other ways:

##################################################
# create and mount /dev/cpuset
if [ ! -e /dev/cpuset ]; then
    mkdir /dev/cpuset
fi

if [ "`mount -t cpuset`x" == "x" ]; then
    mount -t cpuset none /dev/cpuset
fi
##################################################

I don't know if the EPEL Torque package is configured
with cpuset support, but I would guess it is not.
Look at /dev/cpuset in your compute nodes
to see if Torque created anything there.
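For example (a rough check; the exact directory layout depends on the Torque
build):

    ls /dev/cpuset          # should exist and be mounted if cpusets are in use
    ls /dev/cpuset/torque   # Torque normally creates per-job cpusets under here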

I don't know either whether OpenMPI can somehow bypass the cores/processors
assigned by Torque to a job (if any), or whether, when Torque is configured
without cpuset support, it can still bind the MPI processes to
cores/processors/sockets/etc.
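On the Open MPI side, one way to see what binding it applies (if any) is to
ask mpirun to report it, e.g.:

    mpirun --report-bindings --bind-to core hostname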

I hope this helps,
Gus Correa