[PATCH] IndVarSimplify : do not recompute an IV value used outside of the loop if it is trivially known not to be beneficial

Tue Mar 5 07:30:09 PST 2013

On 03/05/2013 09:19 AM, Andrew Trick wrote:
> On Feb 28, 2013, at 2:15 AM, Arnaud A. de Grandmaison <arnaud.adegm at gmail.com> wrote:
>
>> Hi All,
>>
>> I believe the IndVarSimplify is a bit too aggressive when it computes
>> exit values. There are some cases where it is not beneficial to do so.
>> For example, in the code below :
>>
>> extern void func(unsigned val);
>>
>> void test(unsigned m)
>> {
>>  unsigned a = 0;
>>
>>  for (int i=0; i<186; i++) {
>>    a += m;
>>    func(a);
>>  }
>>
>>  func(a);
>> }
>>
>> Although the value of 'a' can be computed outside of the loop, there is
>> no benefit in recomputing it : it is already computed and used in the
>> loop in a way that cannot be optimized away. When compiling the above
>> testcase with clang on x86, we get an unnecessary multiply.
>>
>> Detecting (non) optimizable patterns is way too difficult, but we can
>> still pick some low-hanging fruit if we can trivially determine there
>> is no benefit.
>>
>> Comments ?
> Arnaud,
>
> Your patch itself looks fine but I'm surprised by the approach. I expected the primary benefit of replacing exit values to be exposing constants outside the current loop to later acyclic optimization like sccp and instcombine. Those constants may be used inside different inner loops.

Hi Andy,

Thanks for looking into this patch.

From what I understand, the benefit of IndVarSimplify is twofold :
 1) enable current loop simplification (up to deletion) by extracting
the constant computation
 2) enable other optimizations to kick in outside the current loop
because the constant is exposed in a simpler / canonical form.

If 1 and/or 2 holds, this is a benefit. But if neither holds, this
just makes the code less efficient because the computation is duplicated.

> Why not always replace constant exit values?

It depends on the constant :)

Some are real constants (e.g. 40), and there should be close to no
penalty with them; I need to fix my patch for this case.
For pseudo-constants (e.g. 40 * %var), this can be different.

In the example code I gave, transforming :

unsigned a = 0;
for (int i=0; i<186; i++) {
   a += m;
   func(a);
}

func(a);

To:

unsigned a = 0;
for (int i=0; i<186; i++) {
   a += m;
   func(a);
}

func(186*m);

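is not a benefit : the loop still has to compute 'a' on every iteration,
so the multiply outside the loop is pure extra work. This can be seen in
the generated assembly for x86 or x86_64. On my target CPU, this is even
worse, as a multiply instruction is not even available, and the testcase
is part of a bigger loop.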
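
To make the comparison concrete, here is a small self-contained sketch in
C. It is only an illustration, not part of the patch: func() is replaced
by a hypothetical stub that counts calls, and the helper names are mine.
It shows that the value left in 'a' by the loop and the recomputed
186*m are the same (unsigned arithmetic wraps), so the second form only
adds a multiply on top of work the loop must do anyway.

```c
#include <assert.h>

/* Hypothetical stand-in for the extern func() of the testcase:
   it only counts calls so that this sketch links on its own. */
unsigned func_calls = 0;
void func(unsigned val) { (void)val; func_calls++; }

/* Exit value of 'a' exactly as the loop computes it. The additions
   cannot be removed because every intermediate value goes to func(). */
unsigned exit_value_from_loop(unsigned m)
{
    unsigned a = 0;
    for (int i = 0; i < 186; i++) {
        a += m;
        func(a);
    }
    return a;
}

/* Exit value as recomputed outside the loop: one extra multiply. */
unsigned exit_value_closed_form(unsigned m)
{
    return 186u * m;
}

/* Both forms agree for any m, since unsigned arithmetic wraps. */
void check_exit_value(unsigned m)
{
    assert(exit_value_from_loop(m) == exit_value_closed_form(m));
}
```

Calling check_exit_value() with any value of m should pass, including
values large enough to overflow.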

> Can we also look at the type of uses outside the loop? If they're all obviously nonoptimizable it's ok to skip exit value replacement (for non constants), otherwise we risk missing out on something. In this sort of situation I think it's also ok to just stop searching after 6 uses and just go ahead with the optimization.
Good point. I have only considered the current loop, but I should also
consider the uses of the replacement.

I was also thinking of adding a use limit, but had no opinion for its
value. 6 looks fine :)
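
As a rough sketch of how I read the heuristic discussed above (plain C,
not LLVM code; the enum, the function name and the way uses are
classified are all hypothetical and only there for illustration):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of a use of the exit value outside the loop. */
enum use_kind {
    USE_OPAQUE,   /* obviously non-optimizable, e.g. argument of an external call */
    USE_FOLDABLE  /* something a later pass might simplify */
};

/* Use limit suggested in this thread. */
#define USE_SEARCH_LIMIT 6

/* Returns true if the exit value should be replaced. */
bool should_replace_exit_value(bool is_true_constant,
                               const enum use_kind *uses, size_t num_uses)
{
    /* Real constants (e.g. 40) cost close to nothing: always replace. */
    if (is_true_constant)
        return true;

    /* Too many uses to inspect: stop searching and go ahead. */
    if (num_uses > USE_SEARCH_LIMIT)
        return true;

    /* Replace as soon as one use might benefit from the exposed value. */
    for (size_t i = 0; i < num_uses; i++)
        if (uses[i] != USE_OPAQUE)
            return true;

    /* All uses are obviously non-optimizable: keep the loop-computed value. */
    return false;
}
```

With the testcase from this thread, the only outside use is the call to
func(), which would be classified as USE_OPAQUE, so the replacement
would be skipped.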

We may also want to take into account the type of the pseudo-constant :
if it is not a native type, we should be even less aggressive.

>
> I would at least gather optimization stats across llvm test-suite before suppressing exit value replacement to make sure we're not preventing other optimization.
I have never done that before, but this is a good time to start :)

Thanks for the comments,

>
> -Andy
>
>

-- 
Arnaud A. de Grandmaison