[PATCH] [AArch64] Enable partial unrolling and runtime unrolling for AArch64 target

Mon Oct 6 03:27:43 PDT 2014

Hi,

Following Chandler's and Jiangning's advice, I ran more experiments to find
a more accurate value for the loop buffer size on A57. All experiments are
based on Dave's Cortex-A57 Machine Model update, and the values tested
are 12, 14, 16, 18, 20, 22 and 24.

From the results we can see that, with a loop buffer size of 16, all
benchmarks reach or come close to their lowest execution time among all
tries. This brings about a 0.5% performance improvement on eembc, spec2000
and spec2006, and the code bloat is about 1.5% in geomean and 7% in the
worst case.
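
To illustrate why the loop buffer size matters for partial unrolling, here
is a minimal sketch, assuming the unroller simply picks the largest unroll
count whose unrolled body still fits in the buffer. The function name and
parameters are hypothetical and this is not the code from the patch; the
real unroller applies further heuristics.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLVM's actual implementation.
// bodyMicroOps:   estimated micro-ops in one iteration of the loop body
// bufferMicroOps: assumed loop buffer capacity (16 is the value proposed here)
// Returns the largest unroll count whose unrolled body fits in the buffer,
// or 1 when the loop should be left as it is.
inline uint32_t partialUnrollCount(uint32_t bodyMicroOps,
                                   uint32_t bufferMicroOps) {
  assert(bufferMicroOps > 0 && "loop buffer size must be positive");
  // An empty body, or one that already exceeds the buffer, is not unrolled.
  if (bodyMicroOps == 0 || bodyMicroOps > bufferMicroOps)
    return 1;
  // Integer division rounds down, so count * bodyMicroOps <= bufferMicroOps.
  return bufferMicroOps / bodyMicroOps;
}
```

Under this assumption, a 4-micro-op body with a buffer of 16 would be
unrolled 4 times, while a 20-micro-op body would not be unrolled at all.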

Thanks,
Kevin

2014-10-06 11:09 GMT+01:00 Kevin Qin <kevinqindev at gmail.com>:

> Hi,
>
> After some performance investigation at a finer granularity, I propose to
> set 16 as the loop buffer size for A57. I will post some experiment results
> in following comments. Thanks to Chandler and Jiangning for their advice.
>
> Cheers,
> Kevin
>
> http://reviews.llvm.org/D5148
>
> Files:
>   lib/Target/AArch64/AArch64SchedA57.td
>   lib/Target/AArch64/AArch64TargetTransformInfo.cpp
>

-- 
Best Regards,

Kevin Qin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1_ExecutionTime_LoopBufferSize.png
Type: image/png
Size: 25291 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141006/2c3607dd/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2_CodeBloat_LoopBufferSize.png
Type: image/png
Size: 17824 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141006/2c3607dd/attachment-0001.png>