[Pw_forum] "MPI_COMM_RANK : Null communicator..." error through Platform LSF system

wangxinquan at tju.edu.cn wangxinquan at tju.edu.cn
Wed Apr 9 05:50:07 CEST 2008

Dear users and developers,

     Recently I ran a test on the Nankai Stars HPC cluster. The error message
"MPI_COMM_RANK : Null communicator ... Aborting program !" appeared when I ran
an SCF calculation on 2 CPUs (2 nodes).

     To solve this problem, I found some hints through Google, such as "please
make sure that you used the same version of MPI for compiling and running, and
included the corresponding header file mpi.h in your code."
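
One way to act on that hint is to list where the MPI tools on PATH are installed, both on the machine where pw.x was compiled and inside the job. This is only a minimal sketch: the tool names (mpicc, mpif77, mpif90, mpirun) and the assumption that each MPI installation keeps its tools in a single bin/ directory may not match a given site.

```shell
#!/bin/sh
# Sketch: show which installation each MPI tool on PATH belongs to, so a
# build-time/run-time mismatch is easier to spot.
# Assumption: tools live in <prefix>/bin; the tool names are examples only.

# Print the installation prefix (the parent of bin/) of an executable path.
mpi_prefix() {
    dirname "$(dirname "$1")"
}

# Print "name -> prefix" for a tool, or a note when it is not on PATH.
report_tool() {
    tool_path=$(command -v "$1" 2>/dev/null)
    if [ -n "$tool_path" ]; then
        echo "$1 -> $(mpi_prefix "$tool_path")"
    else
        echo "$1 -> not found on PATH"
    fi
}

for tool in mpicc mpif77 mpif90 mpirun; do
    report_tool "$tool"
done
```

If the wrappers used at compile time and the launcher used at run time report different prefixes, that would be consistent with the mismatch the hint describes, although it does not prove it.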

     According to the pwscf mailing list, one possible cause is that "dynamic
port number used in mpi intercommunication is not working. This is most
probably an installation issue regarding LSF."

     According to the pwscf manual, another possible cause is that "Your
machine might be configured so as to disallow interactive execution".

     My question is:
     To solve the "MPI_COMM_RANK ..." problem, do I need to modify the pwscf
code, the mpich_gm code, or the LSF system?

Calculation details are as follows:
HPC background:
Nankai Stars
800 Xeon 3.06 GHz CPUs (400 nodes)
800 GB memory
53 TB high-speed storage
Parallel jobs are run and debugged through the Platform LSF system.
Mpich_gm driver: 1.2.6..13a

./configure CC=mpicc F77=mpif77 F90=mpif90
make all

Submit script :
#BSUB -q normal
#BSUB -J test.icymoon
#BSUB -c 3:00
#BSUB -a "mpich_gm"
#BSUB -o %J.log
#BSUB -n 2 

cd /nfs/s04r2p1/wangxq_tj
echo "test icymoon"

mpirun.lsf /nfs/s04r2p1/wangxq_tj/espresso-3.2.3/bin/pw.x \
    < /nfs/s04r2p1/wangxq_tj/cu.scf.in > cu.scf.out

echo "test icymoon end"
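
A few diagnostic lines placed in the submit script before the mpirun.lsf line can record what the job actually sees at run time. The following is a rough sketch under stated assumptions: the function name job_diagnostics is made up for illustration, LSB_HOSTS is the host list that LSF normally exports to batch jobs, and ldd is assumed to be available on the compute nodes.

```shell
#!/bin/sh
# Sketch: diagnostics to print inside the LSF job before launching pw.x.
# Assumptions: LSF exports LSB_HOSTS to the job; ldd exists on the node.

job_diagnostics() {
    exe=$1
    echo "LSB_HOSTS=${LSB_HOSTS:-unset}"
    echo "launcher=$(command -v mpirun.lsf 2>/dev/null || echo 'not found')"
    if [ -x "$exe" ]; then
        # A statically linked pw.x lists no MPI libraries here;
        # that alone is not an error.
        ldd "$exe" 2>/dev/null | grep -i -e mpi -e libgm \
            || echo "no MPI/GM shared libraries listed for $exe"
    else
        echo "executable not found: $exe"
    fi
}

# Usage inside the job script, before the mpirun.lsf line:
# job_diagnostics /nfs/s04r2p1/wangxq_tj/espresso-3.2.3/bin/pw.x
```

The output goes to the %J.log file together with the rest of the job output, so it can be compared with the environment in which pw.x was compiled.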

Output file (%J.log):

... ...
The output (if any) follows:

test icymoon
0 - MPI_COMM_RANK : Null communicator
[0]  Aborting program !
[0] Aborting program!
test icymoon end


    pseudo_dir = '/nfs/s04r2p1/wangxq_tj/espresso-3.2.3/pseudo/',


    ibrav = 2, celldm(1) =6.73, nat= 1, ntyp= 1,
    ecutwfc = 25.0, ecutrho = 300.0
    occupations='smearing', smearing='methfessel-paxton', degauss=0.02
    noncolin = .true.
    starting_magnetization(1) = 0.5
    angle1(1) = 90.0
    angle2(1) =  0.0


    conv_thr = 1.0e-8
    mixing_beta = 0.7 

 Cu 63.55 Cu.pz-d-rrkjus.UPF
 Cu 0.0 0.0 0.0
K_POINTS (automatic)
 8 8 8 0 0 0


1 - MPI_COMM_RANK : Null communicator
[1]  Aborting program !
[1] Aborting program!


==== ========== ================  =======================  ===================

0001 node333                      Exit (255)               04/08/2008 19:36:59

0002 node284                      Exit (255)               04/08/2008 19:36:59
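
For jobs with more than two processes, counting how many distinct ranks printed the error shows quickly whether every process fails or only some of them. This is a small sketch that assumes each report has the form "<rank> - MPI_COMM_RANK : Null communicator", as in the log above; the function name count_failed_ranks and the log file name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: count the distinct MPI ranks that reported the error in a job log.
# Assumption: each report looks like
#   "<rank> - MPI_COMM_RANK : Null communicator"

count_failed_ranks() {
    grep -E '^[0-9]+ - MPI_COMM_RANK : Null communicator' "$1" \
        | awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}

# Usage (12345.log is a hypothetical log file name):
# count_failed_ranks 12345.log
```

In the log above both ranks (0 and 1) report the error, so the count equals the number of requested processes.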


Any help will be deeply appreciated!

Best regards,


X.Q. Wang 

wangxinquan at tju.edu.cn

School of Chemical Engineering and Technology

Tianjin University

92 Weijin Road, Tianjin, P. R. China

tel:86-22-27890268, fax: 86-22-27892301

