[LLVMdev] X86 FMA4
dag at cray.com
Mon Jul 30 11:12:15 PDT 2012
Michael Gottesman <mgottesman at apple.com> writes:
> But if you don't believe me, time the instructions yourself (it's an
> important thing to have in your toolbox anyways since sometimes Intel's
> documentation can be non-specific). I have a small instruction timing
> project lying around somewhere, if you want it I can send it to you
> privately.
Note that instruction timings aren't everything. Second-order effects
can be significant; I've seen it many times. I would not be surprised
at all if a scalar spill/reload is much faster than a vector one because
it puts less pressure on cache bandwidth. We know from many real codes
that scalar spill/reload performs much better.
Single instruction timings and small kernels can be very, very
misleading.
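
For reference, here is a minimal sketch of the kind of isolated timing loop under discussion. It is not Michael's project; the names fma_chain and ns_per_iter are hypothetical, and it uses only the standard C clock() call. It times a dependent chain of multiply-adds, so the figure it reports approximates latency in isolation. Whether the compiler emits a fused multiply-add at all depends on the target and flags. A number like this says nothing about spill traffic or cache bandwidth in a real code, which is exactly the caveat above.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical kernel: each iteration depends on the previous result,
   so the loop is limited by latency rather than throughput. */
static double fma_chain(double x, long iters) {
    /* Read the constants through volatile once so the compiler cannot
       fold the whole loop into a constant at compile time. */
    volatile double va = 1.0000001, vb = 1e-9;
    const double a = va, b = vb;
    double acc = x;
    for (long i = 0; i < iters; ++i)
        acc = acc * a + b;   /* may or may not become a fused multiply-add */
    return acc;
}

/* Returns average CPU nanoseconds per iteration. The result of the chain
   is stored through sink so the loop is not discarded as dead code. */
static double ns_per_iter(long iters, double *sink) {
    assert(iters > 0);
    clock_t t0 = clock();
    double r = fma_chain(1.0, iters);
    clock_t t1 = clock();
    *sink = r;
    return 1e9 * (double)(t1 - t0) / CLOCKS_PER_SEC / (double)iters;
}
```

Calling ns_per_iter with a large iteration count (tens of millions, say) and repeating the measurement a few times gives a rough per-operation figure. Treat it as a lower bound on what the same operation costs once register pressure and memory traffic enter the picture.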
-Dave