[PATCH] D13427: RFC: faster isa<IntrinsicInst> (bugged tests?)

Mon Oct 5 17:33:27 PDT 2015

silvas added a comment.

What is the resulting assembly code?
Also, what processor are you measuring on?

I'm a bit wary of doing micro-optimizations like this with nothing but "seems faster on my machine" to go on. We probably want to identify the root cause (if only so that we can look for similar cases in our core classes).

Based on the information you've provided, the only difference on the scale of ~40 cycles that I can think of for this code is that following the pointer inside getName() typically has to go out to L3 cache (assuming a modern Intel core). If you simply duplicate the code between START_TIMER and STOP_TIMER (and make sure the compiler doesn't de-duplicate it; e.g. pull it out into a noinline function), how much does the performance change?
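
To make the experiment concrete, here is a rough sketch of what I mean. Everything in it is a stand-in, not code from the patch: FakeValue, loadFirstChar, timeSingle and timeDoubled are hypothetical names, and std::chrono::steady_clock only takes the place of START_TIMER/STOP_TIMER, whose definitions I haven't seen. steady_clock is almost certainly too coarse to resolve ~40 cycles, so in practice you would keep your existing timer macros and only borrow the structure.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Portable spelling of "do not inline this function".
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define SKETCH_NOINLINE __declspec(noinline)
#else
#define SKETCH_NOINLINE
#endif

// Hypothetical stand-in for the object whose name pointer gets followed.
struct FakeValue {
  const char *Name;
};

// The work under test: one dependent load through the Name pointer.
// Reading Name through a volatile-qualified pointer keeps the compiler from
// treating the function as pure and folding two back-to-back calls into one.
SKETCH_NOINLINE int loadFirstChar(const volatile FakeValue *V) {
  const char *P = V->Name;
  return P[0];
}

// Timed region containing the work once. Returns elapsed nanoseconds.
inline int64_t timeSingle(const FakeValue *V, int &Result) {
  assert(V && V->Name);
  auto Start = std::chrono::steady_clock::now();
  Result = loadFirstChar(V);
  auto Stop = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::nanoseconds>(Stop - Start)
      .count();
}

// Timed region containing the same work twice. Returns elapsed nanoseconds.
inline int64_t timeDoubled(const FakeValue *V, int &Result) {
  assert(V && V->Name);
  auto Start = std::chrono::steady_clock::now();
  Result = loadFirstChar(V);
  Result += loadFirstChar(V);
  auto Stop = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::nanoseconds>(Stop - Start)
      .count();
}
```

The way I would read the numbers (an inference, not something I have measured): if the doubled region costs far less than twice the single one, the first access is paying for a cache miss and the second one hits; if the cost roughly doubles, the time is going into the instructions themselves.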

Repository:
  rL LLVM

http://reviews.llvm.org/D13427