[Pw_forum] Problems running PWscf in parallel

stewart at cnf.cornell.edu stewart at cnf.cornell.edu
Tue Nov 1 18:12:32 CET 2005


Hi all, 

  I am having problems getting PWscf to run in parallel on a Red Hat Linux
cluster using LAM/MPI.  When I start the calculation, I get the standard
header information along with the number of processors that will be used.
Then the program appears to hang, even on simple calculations.  I have tried
running it both in NFS-mounted directories and in local scratch directories
outside NFS, with no luck.

 I have set the runs up with a hostfile for 4 nodes with 2 processors each.  
Using both: 

mpirun -np 8 pw.x < run.in > run.out 

mpirun C pw.x < run.in > run.out 

I run into the same problem. 
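
For reference, the setup described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: the host names
(node01 to node04) and the file name lamhosts.example are hypothetical, and
the LAM commands are printed rather than executed, so the script can be
tried on a machine without LAM installed.  It writes a boot schema with the
"cpu=2" key for each node and sums those counts to get the value passed to
-np.

```shell
#!/bin/sh
# Hypothetical host names and file name; replace with the cluster's own.
hostfile=lamhosts.example
cat > "$hostfile" <<'EOF'
node01 cpu=2
node02 cpu=2
node03 cpu=2
node04 cpu=2
EOF

# Sum the cpu= counts (a host without cpu= counts as 1) to get the
# number of processes to request with -np.
total=$(awk '{
    n = 1
    for (i = 2; i <= NF; i++)
        if ($i ~ /^cpu=/) { split($i, kv, "="); n = kv[2] }
    sum += n
} END { print sum }' "$hostfile")
echo "CPUs listed in $hostfile: $total"

# Dry run: print the commands instead of running them.
echo "lamboot -v $hostfile"
echo "mpirun -np $total pw.x < run.in > run.out"
```

With 4 nodes of 2 CPUs each, the count matches the -np 8 used above, and it
is also the number of processes that "mpirun C" would be expected to start
(one per CPU listed in the boot schema).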

When the program starts, it tries to start 2 processes on each of the 4 nodes.
Shortly after this, the second process on each node quits, and I am left with
one pw.x process on each of the 4 nodes, none of them producing any output.
Memory usage of these processes is less than 1%, so I know they aren't doing
any significant work.

I would greatly appreciate any suggestions on why some of the pw.x
processes are dropping out!

Thanks, 

Derek 

#################################
Derek Stewart, Ph. D.
Scientific Computation Associate
250 Duffield Hall
Cornell Nanoscale Facility (CNF)
Ithaca, NY 14853
stewart (at) cnf.cornell.edu
(607) 255-2856 




