[PATCH] [AArch64] Enable partial unrolling and runtime unrolling for AArch64 target

Thu Sep 25 23:18:33 PDT 2014

Kevin,

Thanks for collecting so much data.

I think you should try to collect more data by combining your patch and
Dave's Cortex-A57 Machine Model update.

Increasing the loop unroll count potentially puts more calculations in
parallel within the loop body, and it introduces more data dependences as
well. The more accurately we can model the latency of Cortex-A57
instructions, the more performance gain we can obtain. Without that, I
don't think your current measurements are conclusive enough, given that
this is still a heuristic-based method evaluated on the SPEC2006 benchmarks.
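
To make the point about parallel calculations concrete, here is a small
hand-written sketch. It is not the output of the LLVM unroll pass, and the
function names and the factor of 4 are my own assumptions for illustration.
The rolled loop carries a single dependence chain through the sum; the
unrolled form has four independent chains per iteration, so how well it
schedules depends on the add latency the machine model reports.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Rolled form: one add per iteration, each depending on the previous sum.
std::int64_t sum_rolled(const std::vector<std::int64_t>& v) {
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        sum += v[i];
    return sum;
}

// Hand-unrolled by 4 with separate accumulators (hypothetical example).
// The four adds in the body do not depend on each other, so they can
// issue in parallel if the core has the resources.
std::int64_t sum_unrolled4(const std::vector<std::int64_t>& v) {
    std::int64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    std::int64_t sum = (s0 + s1) + (s2 + s3);
    // Remainder loop for trip counts that are not a multiple of 4,
    // similar in spirit to what runtime unrolling has to generate.
    for (; i < n; ++i)
        sum += v[i];
    return sum;
}
```

Both functions return the same result for any input; only the shape of the
dependence chains differs, which is the part the scheduler model affects.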

I'm hoping you can get a larger performance gain with the new Cortex-A57
machine model.

Thanks,
-Jiangning

2014-09-25 19:55 GMT+08:00 Renato Golin <renato.golin at linaro.org>:

> On 25 September 2014 11:49, Chandler Carruth <chandlerc at google.com> wrote:
> > Second question: can you try to drill down? Specifically, what about 16?
> > 18? 22? 24? It would be useful to essentially try to refine the precision
> > of the curve near to current hypothesized good threshold.
>
> +1
>
> --renato