[llvm-dev] (RFC) Adjusting default loop fully unroll threshold

Mon Feb 13 14:17:46 PST 2017

On Mon, Feb 13, 2017 at 2:06 PM Gerolf Hoflehner via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> For unrolling specifically I agree with Hal that the hooks should be
> target specific. Actually, I go further and think they should be uArch
> specific.
>

They already are, it is just that no one has contributed a patch to use
this on x86 microarchitectures.

Until someone shows up with data showing that we need different tunings for
different microarchitectures, it doesn't make sense for us to just make up
numbers there.

On the (very limited) set of microarchitectures we have and can test on, we're not
seeing a need for microarchitectural tuning. But if others have different
data, that would of course be welcome. That's part of what we're looking
for in this thread.

> I have no data or proof but would not be surprised to see a wider variety
> of numbers when the thresholds are tested on a wide range of x86 machines.
>

Until we have data, I don't see how we can act on this though.

> My first thought also was along the lines of Matthias: do it at a higher
> opt level e.g. O3 or possibly revisit/start thinking about O4.
>

Why? What about the data presented means that this isn't appropriate at O2?
I'm fine if that's the answer, but I think we need to have a clear and
unambiguous rationale behind it. With the current data on this thread, the
code size and compile time impact seem *very small* except for very small
benchmarks, many of which actually show the performance improvement as well.
