[PLUG] Nvidia Geforce GPU presentation

Jason Martin nsxfreddy at gmail.com
Sun Sep 23 06:36:25 UTC 2007


On 9/22/07, Keith Lofstrom <keithl at kl-ic.com> wrote:
> On Sat, Sep 22, 2007 at 06:31:14AM -0700, Michael Rasmussen wrote:
> > Keith Lofstrom wrote:
> > > It was an amazing talk.  The speaker from Nvidia talked about
> > > the hardware (GeForce 8800, 128 processing elements, 12,288 simultaneous
> > > threads, 500 Gflops, 170 watts at full speed)
> >
> > How do 128 processing elements translate or compare to the cores we know
> > of in our primary processors?
>
> IIRC, each is a floating point unit, integer unit, and transcendental
> unit (sin/cos/exp/sqrt), very heavily pipelined, running in an SIMD
> (Single Instruction Multiple Data) group of 16 processing elements.
> No branching, much less any speculative execution, at the group level.
> Many data sets are in the pipeline at once; at the system level, it
> looks like hundreds of separate pipes.  Higher level organization of
> data, branching, etc. is mostly handled by the main CPU.  The data
> flow between the units is "fast divided by many" so the effective
> clock rate of the threads is in the 10s to 100s of MHz.
>
> Obviously, if the algorithm is inherently sequential, or sparse (like
> some kinds of FFT, or some circuit simulators), then this very heavily
> parallel computation unit will not be used efficiently.  But most things
> aren't that sequential or sparse, and processing that maps to a grid
> fits well (graphics rendering is obvious, but real time fluid dynamics
> and medical tomography were two spectacular animations shown).  In
> semiconductors, there is a computation process called "Resolution
> Enhancement Technology" which does complex optical modelling and
> optimization on terapixel grids, and hardware like this will reduce its
> compute time enormously, making silicon photolithography cheaper.
>
> The most amazing thing to me is that the Earth Simulator is 40 teraflops,
> and 80 of these devices (for $25K) could do much the same thing.  Or 24
> will do the work of Peter Jackson's two WETA computers (12 Tflops) that
> rendered Lord of the Rings.  And Nvidia claims to be doubling the
> performance per dollar (and per watt) every year.  In October, Nvidia
> will release Tesla, a 1U rack box with 4 of the Nvidia processors in it;
> at 2 teraflops each, one of these racks moves its owner briefly into the
> top 500 supercomputer list.
>
> This is going to change things ...

Yep, things are changing.  I'm a bit biased (since I work there), but
if you haven't already seen Intel's 80-core demo, 1 Tflop at 40 watts,
then check it out here:

http://youtube.com/watch?v=687XqRq_fBA

One of the main challenges now is how to effectively *use* that
computing power in software in a way that benefits the typical end
user.
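
For anyone who wants to check the arithmetic behind the figures quoted
above, here is a minimal Python sketch.  It only uses the peak numbers
mentioned in this thread (500 Gflops and 170 watts per GPU, 1 Tflop at
40 watts for the 80-core demo).  The function and constant names are my
own, and it ignores interconnect, memory bandwidth, and everything else
that separates peak from delivered performance.

```python
import math

# Peak figures as quoted in this thread; these are claims, not measurements.
GPU_GFLOPS = 500.0       # one GeForce 8800-class device
GPU_WATTS = 170.0
INTEL_GFLOPS = 1000.0    # Intel 80-core demo, quoted as 1 Tflop
INTEL_WATTS = 40.0
GPUS_PER_TESLA_BOX = 4   # the 1U box described in the quoted message


def devices_needed(target_tflops, gflops_per_device=GPU_GFLOPS):
    """Devices required to match a peak target (hypothetical helper).

    Assumes perfect scaling, which real clusters never achieve.
    """
    return math.ceil(target_tflops * 1000.0 / gflops_per_device)


def gflops_per_watt(gflops, watts):
    """Peak efficiency in Gflops per watt."""
    return gflops / watts


# Peak throughput of one 4-GPU box, in Tflops.
tesla_box_tflops = GPUS_PER_TESLA_BOX * GPU_GFLOPS / 1000.0

if __name__ == "__main__":
    print("GPUs to match 40 Tflops:", devices_needed(40))
    print("GPUs to match 12 Tflops:", devices_needed(12))
    print("Tflops per 4-GPU box:", tesla_box_tflops)
    print("GPU Gflops/W:", round(gflops_per_watt(GPU_GFLOPS, GPU_WATTS), 2))
    print("80-core Gflops/W:", gflops_per_watt(INTEL_GFLOPS, INTEL_WATTS))
```

On these quoted peak numbers the 80 and 24 device counts check out, and
the per-watt gap between the two designs is roughly a factor of eight,
though neither number says anything about how easy the hardware is to
program.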

Jason


