[PATCH] Improve the cost evaluation of LSR

Mon May 4 13:49:56 PDT 2015

Thanks Sanjay for the testing. I haven't considered the differences
between microarchitectures, which will need to be taken into account
in the future. For this patch, I treated all x86 architectures as the
same and evaluated the cost mainly by instruction count.

It is good to know that the patch gives some performance
improvement on Jaguar. I am especially interested in the cause of the
FloatMM regression. I will try the benchmark and see whether the
instruction count increases there.

Wei.

On Mon, May 4, 2015 at 1:10 PM, Sanjay Patel <spatel at rotateright.com> wrote:
> Something to consider when evaluating changes in this area: I see no perf difference between the existing and proposed codegen for the loops seen in https://llvm.org/bugs/show_bug.cgi?id=23384#c0 when running on an Intel Haswell system, but there is a >10% win running on an AMD Jaguar system when using the complex addressing. The Jaguar core is narrower (2-wide issue) and has much less micro-architectural shock absorption than Haswell. Wins and losses may be more apparent on smaller cores such as this or Atom.
>
> FWIW, I applied the patch, ran the benchmarking subset of test-suite on the Jaguar system, and saw a 0.7% geomean improvement after filtering out lots of noisy tests. Benchmarks/McGill/chomp is the largest improvement: +31%; Benchmarks/Stanford/FloatMM is the worst regression: -10%. I haven't analyzed the results any more than that yet.
>
>
> REPOSITORY
>   rL LLVM
>
> http://reviews.llvm.org/D9429