[PLUG] surprising performance on my HNOW nodes

Russell Senior seniorr at aracnet.com
Tue Aug 10 15:26:01 UTC 2004


>>>>> "Elliott" == Elliott Mitchell <ehem at m5p.com> writes:

Elliott> [...] So left with what killed the P4? First thing I'll note
Elliott> is despite the resident size being 1370KB notice that is
Elliott> bigger than the cache of most processors. Notably if this is
Elliott> a late model P3 then it might have 512KB cache, if the P4 was
Elliott> an early model it might have a mere 256KB.

The P4 has 512 KB.  The P3 has only 256 KB!

Elliott> [...] If your code has a lot of irregular branches, this will
Elliott> *kill* the P4 (no modern processor likes branches, but none
Elliott> compare to the P4's dislike of them).

This might be it.  The main loop traverses a list of heterogeneous
objects.  Certainly there is branching to handle different subtypes,
and branching to decide whether computations are even needed in some
cases.  I'll have to go back and look at that code and see if I might be
able to smooth it out.
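
One way to smooth it out, as a minimal sketch only: group the objects
by subtype before the main loop, so the per-object branch takes the
same direction for long runs instead of flipping unpredictably.  The
names here (struct obj, enum kind, sum_mixed, group_by_kind) and the
per-kind arithmetic are hypothetical stand-ins, since the real code is
not shown in this thread.  It also assumes the traversal order does not
affect the result, which may not hold for the actual computation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical subtypes; the real object types are not shown here. */
enum kind { KIND_A, KIND_B, KIND_SKIP, KIND_COUNT };

struct obj {
    enum kind kind;
    double value;
};

/* Branchy traversal: one switch per object.  When kinds are interleaved
 * at random, the branch target changes from element to element. */
double sum_mixed(const struct obj *objs, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(objs[i].kind < KIND_COUNT);
        switch (objs[i].kind) {
        case KIND_A:
            total += objs[i].value * 2.0;   /* placeholder work for A */
            break;
        case KIND_B:
            total += objs[i].value + 1.0;   /* placeholder work for B */
            break;
        default:
            break;                          /* no computation needed */
        }
    }
    return total;
}

/* Order objects by kind only; values are not compared. */
static int cmp_kind(const void *a, const void *b)
{
    enum kind ka = ((const struct obj *)a)->kind;
    enum kind kb = ((const struct obj *)b)->kind;
    return (ka > kb) - (ka < kb);
}

/* Sort in place so objects of the same kind sit next to each other.
 * After this, the switch in sum_mixed() takes the same case for each
 * contiguous run of objects. */
void group_by_kind(struct obj *objs, size_t n)
{
    qsort(objs, n, sizeof *objs, cmp_kind);
}
```

Calling group_by_kind() once and then reusing the grouped array for
every pass keeps the sorting cost out of the main loop.  Whether this
actually helps on the P4 would have to be measured; it only pays off if
the list is traversed many times between changes.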

Do you know which modern processors might be better?  

Does anyone want to volunteer their CPUs to be part of my HNOW?

Thanks!


-- 
Russell Senior         ``I have nine fingers; you have ten.''
seniorr at aracnet.com



