[PATCH] D28368: Give higher full-unroll boosting when the loop iteration is small.

Tue Jan 10 16:51:06 PST 2017

danielcdh added a comment.

In https://reviews.llvm.org/D28368#641996, @mzolotukhin wrote:

> > Could you point us to the benchmarks you observed regression after boosting fully unroll threshold?
>
> I ran the standard LLVM testsuite in the past and I think I observed several runtime regressions. However my memory might play tricks on me, so if you've just remeasured it, you can ignore this.
>
> > What do you mean by "guaranteed benefit"?
>
> I meant that while some compiletime/runtime tradeoff might be acceptable, we definitely need to be aware of it before we land such changes, and we do have to have numbers at hand for that. Ideally, that would be pure win (runtime performance improves, compile time and code size is the same), but yeah, unfortunately it's unfeasible.

Definitely. The results I have so far cover speccpu and the llvm testsuite. We also have internal benchmarks that all show a positive performance impact with a < 0.1% mean size increase; unfortunately, we cannot show them here. Let me know if there are any other benchmarks you would like me to test for perf/code_size impact.

> > update the data to remove the trip count logic and merely boost the fully unroll tripcount by 2X
> 
> What option do you mean by fully unroll tripcount threshold? `-unroll-threshold`?

I should have uploaded the updated patch to make it clear, sorry about that. The above numbers were collected with the updated patch.

Thanks,
Dehao

https://reviews.llvm.org/D28368