Fw: [Pw_forum] Re: Woodcrest vs Opteron performance in pwscf calc.

Huiqun Zhou hqzhou at nju.edu.cn
Fri Aug 11 04:18:37 CEST 2006


It seems that maintenance work on our campus network has disrupted our mail
server these two days. My reply to the pw forum is in my "Sent" box, but I
haven't seen it arrive on the mailing list although three days have passed.

Here I forward my post. If you did already receive the mail 2 days ago, I
apologize for the inconvenience.

Huiqun Zhou

----- Original Message ----- 
From: "Huiqun Zhou" <hqzhou at nju.edu.cn>
To: <pw_forum at pwscf.org>
Sent: Wednesday, August 09, 2006 7:22 PM
Subject: Re: [Pw_forum] Re: Woodcrest vs Opteron performance in pwscf calc.


> Kostya and list-users,
>
> Thanks for your comment and recommendation. Considering the
> price/performance of machines with Dempsey and Woodcrest, Dempsey may be
> a good choice too, especially if you have no problem paying the
> electricity bill ;-)
>
> You mentioned enabling NUMA on Opteron machines. I wonder whether it is
> enabled by default in kernel 2.6.9-xx. If it is not, I need to turn it on
> in the kernel configuration and recompile, right?
>
> Thanks,
>
> Huiqun
>
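
One way to see whether a running kernel exposes NUMA topology at all is to
count the per-node directories under sysfs. The sketch below is only an
illustration: the helper name count_numa_nodes is hypothetical, and it is an
assumption that the kernel in question (2.6.9-xx here) publishes its nodes
under /sys/devices/system/node. A result of 2 or more suggests the kernel
sees several memory nodes; 0 or 1 does not by itself prove that NUMA support
is compiled out.

```python
import os
import re


def count_numa_nodes(sysfs_node_dir="/sys/devices/system/node"):
    """Count entries named node<N> in the given directory.

    The default path is an assumption about where the kernel exposes its
    NUMA nodes. Returns 0 if the directory cannot be read.
    """
    try:
        entries = os.listdir(sysfs_node_dir)
    except OSError:
        return 0
    return sum(1 for entry in entries if re.fullmatch(r"node\d+", entry))


if __name__ == "__main__":
    nodes = count_numa_nodes()
    print("NUMA nodes visible to the kernel:", nodes)
```

On a dual-socket Opteron box one would hope to see two nodes reported; if
only one shows up, the kernel configuration and the BIOS memory interleaving
setting are the usual places to look next.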
> ----- Original Message ----- 
> From: "Konstantin Kudin" <konstantin_kudin at yahoo.com>
> To: <pw_forum at pwscf.org>
> Sent: Tuesday, August 08, 2006 12:00 AM
> Subject: Re: [Pw_forum] Re: Woodcrest vs Opteron performance in pwscf calc.
>
>
>>
>> Dempsey and Opterons do 2 BLAS operations per cycle, while Woodcrest
>> does 4. So effectively you get these frequencies for BLAS (per core):
>> Woodcrest (4x2.66=10.6), Dempsey (3.2x2=6.4), Opteron ( 2.6x2=5.2).
>> That is exactly the order you get in terms of performance. Your Opteron
>> scaling is not too good, which either suggests that there is not enough
>> memory bandwidth, or you do not have NUMA turned on.
>>
>> Now, the theoretical performance would translate into the real world
>> if the memory is fast enough. I think both Dempsey and Woodcrest use
>> the same chipset with 2 buses, so earlier memory contention issues with
>> multiple Intel chips are mostly gone for now. Still, you see that with
>> 4 Woodcrest cores the speedups are worse than for Dempsey, which
>> suggests that perhaps the optimal purchase for QE would be lower
>> frequency chips, such as 2.0 or 2.33 GHz, since four 2.66 GHz cores are
>> too fast for the memory.
>>
>> Kostya
>>
> 
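
The per-core arithmetic in Kostya's message can be reproduced with a few
lines. This is a minimal sketch that uses only the figures quoted above
(operations per cycle and clock frequency in GHz); the names chips,
effective_rate, rates and ranking are made up for illustration, and the
result is a theoretical peak that ignores memory bandwidth, which is exactly
the caveat raised in the message.

```python
# Per-core figures as quoted in the message:
# (BLAS operations per cycle, clock frequency in GHz).
chips = {
    "Woodcrest": (4, 2.66),
    "Dempsey": (2, 3.2),
    "Opteron": (2, 2.6),
}


def effective_rate(ops_per_cycle, clock_ghz):
    """Theoretical BLAS rate per core: operations per cycle times clock."""
    return ops_per_cycle * clock_ghz


rates = {name: effective_rate(*spec) for name, spec in chips.items()}

# Sort chip names from the highest theoretical rate to the lowest.
ranking = sorted(rates, key=rates.get, reverse=True)

for name in ranking:
    print(f"{name:10s} {rates[name]:6.2f}")
```

The ordering that comes out (Woodcrest, then Dempsey, then Opteron) matches
the performance order described in the message.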



