<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#ffffff">

    Hi Victor,<br>

    <br>

    Benchmarks and results will be presented at the upcoming <a

      href="http://www.supercomp.de/isc11/">ISC2011</a>.<br>

    We have an agreement with NVIDIA, who want to show our work on the

    PWscf code in a poster at their booth.<br>

    The benchmarks need to be put together and analyzed before they are

    presented. <br>

    With the deadline so close, we will also be committing a number of

    updates over the coming days. <br>

    In the meantime, you can download the code and test it yourself,

    updating your source directory frequently to keep it aligned with the

    SVN repository.<br>

    <br>

    Since you seem particularly interested, I will personally send some

    numbers to your personal email address early next week, in Excel

    format.<br>

    <br>

    Please, let's move the discussion about the PWscf GPU version to the

    dedicated lists. We don't want to spam pw_forum members who are not

    interested in this topic.<br>

    <br>

    Best regards,<br>

    <br>

    Ivan Girotto <br>

    <br>

    <br>

    On 05/05/2011 17:24, Vit wrote:

    <blockquote cite="mid:201105052324.54165.vitruss@gmail.com"

      type="cite">

      <pre wrap="">Dear Ivan,

Can you please provide any benchmarks and comparison of Hybrid CPU/GPU vs 

pure CPU computation? 

With best regards,

Victor.

Ivan Girotto <a class="moz-txt-link-rfc2396E" href="mailto:ivan.girotto@ichec.ie">&lt;ivan.girotto@ichec.ie&gt;</a>

Thursday 05 May 2011

</pre>

      <blockquote type="cite">

        <pre wrap="">Dear QE users &amp; developers,

We are happy to announce that the first beta GPU-enabled release of

Quantum ESPRESSO (QE) has been committed today in the official repository.

You can download the new version of the code using the following command:

$ svn checkout svn://scm.qe-forge.org/scmrepos/svn/q-e/branches/espresso-PRACE

The Irish Centre for High-End Computing (ICHEC, <a class="moz-txt-link-abbreviated" href="http://www.ichec.ie">www.ichec.ie</a>

<a class="moz-txt-link-rfc2396E" href="http://www.ichec.ie">&lt;http://www.ichec.ie&gt;</a>) has been mainly responsible for extending the QE

suite to enhance calculations on NVIDIA GPUs. The porting activity has

been supported within the PRACE 1st Implementation Phase project. It is

currently carried out through the Sub-task "Accelerator", led by ICHEC,

within the Work-Package "Programming Techniques for High-Performance

Applications" in collaboration with CINECA.

The porting activity is concerned mainly with the PWscf package. But

ICHEC and the Irish QE user community are interested in exploring any

other initiatives which come forward from QE users or developers

interested in porting any of the QE suite related codes to GPGPU

architectures.

We have successfully accelerated the linear algebra part of the QE suite

using a library called phiGEMM, some explicit computational kernels

(newd, addusdense, vloc_psi) and the 3D FFT for the single CPU/GPU

version. Both linear algebra (matrix multiplication) and the FFT

accelerated version make use of CUDA libraries. The porting is mainly

based on wrappers that permit the use of libraries for accelerators. The

distributed 3D FFT version is still in progress, since this porting

requires important changes of the current structure of the code and data

distribution. When running the parallel and distributed multi-GPU

version, the code still uses the original 3D FFT implementation.

The phiGEMM library is distributed as an independent open-source

external package together with the Quantum ESPRESSO suite. It aims to

perform matrix multiplication ([SDZ]GEMM) taking advantage of the

underlying BLAS kernel functions on both CPU and NVIDIA CUDA-based GPU,

mixing and overlapping computation between the host (CPU) and the

accelerator (GPU). Any code that makes intensive use of GEMM can

derive a significant advantage from linking this library when running on

a CPU/GPU hybrid system.

Even if the 3D FFT is accelerated only for a single CPU process (not

when using MPI), other parts are already enabled to take advantage of a

distributed parallel hybrid system. All of this allows PWscf to

potentially use distributed message-passing parallelization (MPI) plus

multi-threading (OpenMP) plus accelerators (NVIDIA GPUs) all together

and produce good performance enhancement using the latest version of

NVIDIA GPUs (high performance double precision is needed). This porting

activity is still in progress, especially the parallel 3D FFT component

that represents a bottleneck for large calculations. We have been

testing this beta release using some small/medium benchmarks from the

DEISA official bench-suite on several GPU hardware platforms (Tesla and

Fermi architectures). Special thanks go to both E4 Computer Engineering

and CEA for providing access to hybrid GPU systems with configurations

different from those available at ICHEC.

We look forward with interest to receiving any suggestions for

improvement, feedback or requests for collaboration from anyone who is

interested in trying and validating the PWscf CUDA version on different

platforms using different scientific cases. For additional information

please contact <a class="moz-txt-link-abbreviated" href="mailto:qe-gpu@ichec.ie">qe-gpu@ichec.ie</a> or reply to this mail. A dedicated

forum will be available shortly: <a class="moz-txt-link-abbreviated" href="mailto:q-e-gpgpu@qe-forge.org">q-e-gpgpu@qe-forge.org</a>

<a class="moz-txt-link-rfc2396E" href="http://qe-forge.org/mail/?group_id=10">&lt;http://qe-forge.org/mail/?group_id=10&gt;</a>. Please use this list for bug

reports and any other issues related to the use of the PWscf GPU version.

Although compilation of the GPU implementation is fairly

straight-forward, we kindly suggest that users carefully read the

README.GPU that is included. The intrinsic characteristics of hybrid

multi- and many-core systems require careful consideration to best

exploit the available computing power.

Any and all suggestions are welcome and will be very much appreciated.

Ivan Girotto &amp; Filippo Spiga

---

ICHEC GPU developer team

The Tower - 7th floor

Trinity Technology &amp; Enterprise Campus

Grand Canal Quay - Dublin 2 - Ireland

+353-1-5241608 (ph) / +353-1-7645845 (fax)

</pre>

      </blockquote>

      <pre wrap="">_______________________________________________

Pw_forum mailing list

<a class="moz-txt-link-abbreviated" href="mailto:Pw_forum@pwscf.org">Pw_forum@pwscf.org</a>

<a class="moz-txt-link-freetext" href="http://www.democritos.it/mailman/listinfo/pw_forum">http://www.democritos.it/mailman/listinfo/pw_forum</a>

</pre>

    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Girotto Ivan  - <a class="moz-txt-link-abbreviated" href="mailto:ivan.girotto@ichec.ie">ivan.girotto@ichec.ie</a>

ICHEC - Computational Group - <a class="moz-txt-link-freetext" href="http://www.ichec.ie">http://www.ichec.ie</a>

The Tower - 7th floor

Trinity Technology &amp; Enterprise Campus

Grand Canal Quay - Dublin 2 - Ireland

+353-1-5241608 ex. 32 (ph) / +353-1-7645845 (fax) 

</pre>

  </body>

</html>