[PATCH] [AArch64] Enable partial unrolling and runtime unrolling for AArch64 target

Mon Oct 6 04:13:08 PDT 2014

On 6 October 2014 11:27, Kevin Qin <kevinqindev at gmail.com> wrote:
> The results show that with a loop buffer size of 16, every benchmark reaches
> or comes close to its lowest execution time across all tries. That gives
> about a 0.5% performance improvement on eembc, spec2000 and spec2006, with
> code bloat of about 1.5% in geomean and 7% in the worst case.

Hi Kevin,

I agree that 16 is the best value for the heuristic.

cheers,
--renato