[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info
chandlerc at google.com
Fri Jan 31 13:28:53 PST 2014
I've completed some pretty thorough benchmarking and wanted to share the
results.
On Mon, Jan 27, 2014 at 5:22 PM, Arnold Schwaighofer <
aschwaighofer at apple.com> wrote:
> Furthermore, I added a heuristic to unroll until load/store ports are
> saturated “-mllvm enable-loadstore-runtime-unroll” instead of the pure size
> based heuristic.
> Those two together with a patch that slightly changes the register
> heuristic and libquantum’s three hot loops will unroll and goodness will
> ensue (at least for libquantum).
Both enabling loadstore runtime unrolling and the register heuristic
(enabled with -enable-ind-var-reg-heur) show no interesting regressions
(way below the noise) and a few nice benefits across all of the
applications I measure. I'd support enabling them right away and getting
more feedback from others. I've measured on both westmere and sandybridge,
with -march=x86-64 and -march=corei7-avx.
I don't have any ARM hardware to benchmark with, but I suspect you have
decent numbers there? We also have a nice LNT bot that will measure
anything we enable for ARM.
Finally, I've got some experimental results for x86 that show some
improvements and no significant regressions when I increase several target
thresholds. I'll start a new thread about that though.