[llvm-dev] (RFC) Adjusting default loop fully unroll threshold
Gerolf Hoflehner via llvm-dev
llvm-dev at lists.llvm.org
Tue Feb 14 12:53:52 PST 2017
> On Feb 13, 2017, at 2:17 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> On Mon, Feb 13, 2017 at 2:06 PM Gerolf Hoflehner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> For unrolling specifically I agree with Hal that the hooks should be target specific. Actually, I go further and think they should be uArch specific.
> They already are, it is just that no one has contributed a patch to use this on x86 microarchitectures.
> Until someone shows up with data showing that we need different tunings for different microarchitectures, it doesn't make sense for us to just make up numbers there.
> On the (very limited) microarchitectures we have and can test on, we're not seeing a need for microarchitectural tuning. But if others have different data, that would of course be welcome. That's part of what we're looking for in this thread.
>> I have no data or proof, but would not be surprised to see a wider variety of numbers when the thresholds are tested on a wide range of x86 machines.
> Until we have data, I don't see how we can act on this though.
>> My first thought also was along the lines of Matthias: do it at a higher opt level, e.g. O3, or possibly revisit/start thinking about O4.
> Why? What about the data presented means that this isn't appropriate at O2? I'm fine if that's the answer, but I think we need to have a clear and unambiguous rationale behind it. With the current data on this thread, the code size and compile time impact seem *very small* except for very small benchmarks, many of which actually show the performance improvement as well.
If there is clear insight into where the gains are coming from, O2 is fine. IMHO, if we just have the “better” numbers, we should go for a higher opt level, since not everyone will benefit. Some users will only pay higher compile times.