Mahmood Naderan
2016-10-01 08:02:36 UTC
Hi,
Here is some bizarre behavior of the system; I hope someone can clarify
whether it is related to OMPI or not.
When I issue the mpirun command with -np 2, I can see the output of the
program live on stdout as it is running. However, if I issue the same
command with -np 4, the progress is not shown!
Please see the output below. I ran the 'date' command first and then issued
the command with '-np 4'. After some seconds, I pressed ^C and ran 'date'
again. As you can see, nothing from the input file dump is printed. Next, I
ran with '-np 2' and after a while I pressed ^C. You can see that the
progress of the program is shown.
***@cluster:A4$ date
Sat Oct 1 11:26:13 2016
***@cluster:A4$ /share/apps/computer/openmpi-2.0.0/bin/mpirun --hostfile hosts.txt -np 4 /share/apps/chemistry/siesta-4.0/spar/siesta < A.fdf
Siesta Version: siesta-4.0--500
Architecture : x86_64-unknown-linux-gnu--unknown
Compiler flags: /share/apps/computer/openmpi-2.0.0/bin/mpifort
PP flags : -DMPI -DFC_HAVE_FLUSH -DFC_HAVE_ABORT
PARALLEL version
* Running on 4 nodes in parallel
Start of run: 1-OCT-2016 11:26:23
************************ WELCOME TO SIESTA *
***********************
reinit: Reading from standard input
************************** Dump of input data file
****************************
^CKilled by signal 2.
***@cluster:A4$ date
Sat Oct 1 11:26:30 2016
***@cluster:A4$ /share/apps/computer/openmpi-2.0.0/bin/mpirun --hostfile hosts.txt -np 2 /share/apps/chemistry/siesta-4.0/spar/siesta < A.fdf
Siesta Version: siesta-4.0--500
Architecture : x86_64-unknown-linux-gnu--unknown
Compiler flags: /share/apps/computer/openmpi-2.0.0/bin/mpifort
PP flags : -DMPI -DFC_HAVE_FLUSH -DFC_HAVE_ABORT
PARALLEL version
* Running on 2 nodes in parallel
Start of run: 1-OCT-2016 11:26:36
************************ WELCOME TO SIESTA *
***********************
reinit: Reading from standard input
************************** Dump of input data file
****************************
SystemLabel A
NumberOfAtoms 54
NumberOfSpecies 2
%block ChemicalSpeciesLabel
...
...
...
^CKilled by signal 2.
***@cluster:A4$ date
Sat Oct 1 11:26:38 2016
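Running 'date' by hand only gives rough timing. A more precise check would be to pipe the output through a small filter that stamps each line with the time at which it arrives. The following is only a sketch and was not used for the runs above; the function name stamp_lines and the file name stamp.py are made up for illustration.

```python
import sys
import time


def stamp_lines(stream=None, out=None, clock=time.monotonic):
    """Copy stream to out, prefixing each line with seconds since start."""
    stream = sys.stdin if stream is None else stream
    out = sys.stdout if out is None else out
    start = clock()
    for line in stream:
        # Elapsed time is measured when the line reaches this filter,
        # not when the program originally wrote it.
        out.write(f"[{clock() - start:7.2f}s] {line}")
        # Flush after every line so the filter adds no buffering of its own.
        out.flush()
```

Saved as stamp.py in the current directory, it could be attached to either run like this:

/share/apps/computer/openmpi-2.0.0/bin/mpirun --hostfile hosts.txt -np 4 /share/apps/chemistry/siesta-4.0/spar/siesta < A.fdf | python3 -c "import stamp; stamp.stamp_lines()"

If the lines of the input dump show up late or all at once with -np 4, that would suggest the output is being held back somewhere rather than not being produced. Note that adding a pipe may itself change how the output is buffered, so this is only a rough diagnostic.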
Any idea what causes this? The only thing I change between the two runs is
the mpirun option (-np).
Regards,
Mahmood