<div dir="ltr">Hi Arnold, Gerolf, Hal,<div><br></div><div>Good idea about a tuning option; the attached patch implements it. Is it OK to commit?</div><div><br></div><div>Gerolf, did you want me to add "|| isCyclone()" to AArch64TargetTransformInfo?</div>

<div><br></div><div>Cheers,</div><div><br></div><div>James</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On 11 August 2014 04:31, Gerolf Hoflehner <span dir="ltr"><<a href="mailto:ghoflehner@apple.com" target="_blank">ghoflehner@apple.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Typhoon the gain for libquantum is almost 2%, and about 1% on hmmer. No regressions on CINT2006. This is with -O3 and LTO, ref input.<br>


<span class="HOEnZb"><font color="#888888"><br>

-Gerolf<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

On Aug 8, 2014, at 4:57 PM, Gerolf Hoflehner <<a href="mailto:ghoflehner@apple.com">ghoflehner@apple.com</a>> wrote:<br>

<br>

> I second a tuning option at least in the short term. It is usually hard to get it right, though. So longer term this is a case for dynamic versioning that invokes different versions of the code at run-time depending on the trip count.<br>
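The trip-count-based versioning mentioned above could look roughly like the following hand-written sketch. It is only an illustration, not what the vectorizer emits: the function names and the `TripCountThreshold` value are hypothetical, and a real threshold would need to be tuned per target.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical cutoff below which the unrolled version is assumed not to pay.
constexpr std::size_t TripCountThreshold = 16;

// Simple version: one accumulator, no fixup code after the loop.
float sumSimple(const float *A, std::size_t N) {
  float S = 0.0f;
  for (std::size_t I = 0; I < N; ++I)
    S += A[I];
  return S;
}

// Unrolled version: four partial sums, combined after the main loop.
// Note that this reassociates the floating-point additions.
float sumUnrolled4(const float *A, std::size_t N) {
  float S0 = 0.0f, S1 = 0.0f, S2 = 0.0f, S3 = 0.0f;
  std::size_t I = 0;
  for (; I + 4 <= N; I += 4) {
    S0 += A[I];
    S1 += A[I + 1];
    S2 += A[I + 2];
    S3 += A[I + 3];
  }
  // Reduction fixup: three extra adds executed once per call.
  float S = (S0 + S1) + (S2 + S3);
  // Remainder iterations that did not fill a group of four.
  for (; I < N; ++I)
    S += A[I];
  return S;
}

// Run-time dispatch on the trip count.
float sumVersioned(const float *A, std::size_t N) {
  assert((A != nullptr || N == 0) && "null input with non-zero length");
  return N < TripCountThreshold ? sumSimple(A, N) : sumUnrolled4(A, N);
}
```

When such a function is called from an outer loop with a small `N`, the dispatch avoids paying for the fixup code on every call, at the cost of one extra compare and branch.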


><br>

> -Gerolf<br>

><br>

> On Aug 8, 2014, at 12:30 PM, Hal Finkel <<a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>> wrote:<br>

><br>

>> ----- Original Message -----<br>

>>> From: "James Molloy" <<a href="mailto:james.molloy@arm.com">james.molloy@arm.com</a>><br>

>>> To: "Arnold Schwaighofer" <<a href="mailto:aschwaighofer@apple.com">aschwaighofer@apple.com</a>><br>

>>> Cc: "llvm-commits" <<a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a>><br>

>>> Sent: Friday, August 8, 2014 9:37:38 AM<br>

>>> Subject: [PATCH][LoopVectorizer] Restrict the unroll factor of reductions in loops<br>

>>><br>

>>> Hi Arnold,<br>

>>><br>

>>><br>

>>><br>

>>> Attached are two patches. The first raises the maximum unroll factor on<br>

>>> AArch64 from 2 to 4, for Cortex-A57 only at the moment as that’s all I’ve<br>

>>> got data for. This gives us significant wins: ~14% on<br>

>>> 462.libquantum at least.<br>

>>><br>

>>><br>

>>><br>

>>> However it also causes some regressions. The second patch makes the<br>

>>> loop vectorizer a bit more conservative with its unroll factor. The<br>

>>> problem is purely for reductions within loops. The regressions I’ve<br>

>>> seen are in loops with a small trip count (known only at runtime)<br>

>>> within a loop nest. A loop unroll factor of 2 is fine, but above 2 the reduction<br>

>>> variable fixup logic after the loop increases the critical path<br>

>>> length and resource usage. For most loops this isn’t a problem, but<br>

>>> small loops in a larger loop nest will execute this fixup code many<br>

>>> times.<br>

>><br>

>> Can you please add a flag for this? I anticipate needing to tune it.<br>

>><br>

>> Also, it seems to me that this is exactly the kind of thing that would benefit from profiling information (so we can determine if the inner loop is likely to have a large trip count). Can the current infrastructure do this? Also, maybe in cases where the inner loop count is not a function of the outer loop, we might 'unswitch' it so that we get the unrolled inner loop only when actually profitable.<br>


>><br>

>> Thanks again,<br>

>> Hal<br>

>><br>

>>><br>

>>><br>

>>><br>

>>> The heuristic is: if this is a (scalar) reduction, and the loop is<br>

>>> nested, clamp the UF to a maximum of 2. With 2, we still get wins<br>

>>> but we only add one fadd/fmul to the critical path.<br>
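As a rough illustration of the heuristic described above, the clamp could be sketched as follows. This is a standalone sketch, not code from the patch: `LoopSketch`, `clampUnrollFactor`, `fixupOps`, and the `MaxNestedReductionUF` constant are hypothetical names, and the real vectorizer would query its own loop and reduction analyses instead.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical summary of what the heuristic needs to know about a loop.
struct LoopSketch {
  bool HasScalarReduction; // the loop carries a scalar reduction
  unsigned Depth;          // 1 = outermost loop, greater than 1 = nested
};

// Assumed tuning knob; the thread asks for one but does not name it.
constexpr unsigned MaxNestedReductionUF = 2;

// Clamp the unroll factor (UF) for scalar reductions in nested loops.
unsigned clampUnrollFactor(unsigned UF, const LoopSketch &L) {
  assert(UF >= 1 && "unroll factor must be at least 1");
  if (L.HasScalarReduction && L.Depth > 1)
    return std::min(UF, MaxNestedReductionUF);
  return UF;
}

// Number of operations needed after the loop to combine UF partial
// accumulators into one result (an operation count, not a path depth).
unsigned fixupOps(unsigned UF) {
  assert(UF >= 1 && "unroll factor must be at least 1");
  return UF - 1;
}
```

With these assumptions, a nested reduction that would have been unrolled by 4 is held at 2, so one combining operation follows the loop instead of three.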

>>><br>

>>><br>

>>><br>

>>> Please take a look.<br>

>>><br>

>>><br>

>>><br>

>>> Cheers,<br>

>>><br>

>>><br>

>>><br>

>>> James<br>

>>> _______________________________________________<br>

>>> llvm-commits mailing list<br>

>>> <a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>

>>> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>

>>><br>

>><br>

>> --<br>

>> Hal Finkel<br>

>> Assistant Computational Scientist<br>

>> Leadership Computing Facility<br>

>> Argonne National Laboratory<br>

>><br>


><br>

<br>

<br>


</div></div></blockquote></div><br></div>