[LLVMdev] Area for improvement

Mon Feb 21 23:18:10 PST 2005

On Mon, 21 Feb 2005, Jeff Cohen wrote:
>>> trying out the really big array.  You're right, it does clean it up except 
>>> for the multiplication.
>>> 
>>> So LoopStrengthReduce is not ready for prime time and doesn't actually get 
>>> used?
>> 
>> 
>> I don't know what the status of it is.  You could try it out, and see what 
>> it does. :)  I would suggest adding it to gccas immediately after the SCCP 
>> pass.
>
> I know it has a lot of FIXMEs in it.

Okay, fair enough.  I think these are due to unimplemented features.  I 
think that Nate (wisely) started simple, and decided to handle the complex 
cases later.

>>> I might consider whipping it into shape.  Does it still have to handle 
>>> getelementptr in its full generality?
>> 
>> 
>> What do you mean?
>> 
>> -Chris
>
> Consider the bytecode I originally posted for the nested for loop.  If the 
> getelementptrs haven't been replaced with something simpler, LSR will have to 
> take them apart itself.  The multiplication cannot be replaced with addition 
> without the two-dimensional access being replaced with one-dimensional access 
> plus some pointer arithmetic.  I would guess that they will not have been 
> decomposed before LSR is run, correct?

No, this is exactly what LSR should do in this case.

> It also appears the current LSR reduces each GEP in isolation of all the 
> others.  That clearly is bad for this example, as all the GEPs in this loop 
> do the same thing with the induction variable.  Creating a separate pointer 
> for each one only makes things worse.  Ordinarily the addressing is explicit 
> so the redundancy would have been removed and the LSR algorithm doesn't have 
> to worry about it, but here it does.  Bottom line is it has to strength 
> reduce operations that are implicit, so it first has to determine what those 
> operations are and detect common subexpressions among those operations itself 
> (possibly across multiple basic blocks).

Yes, if you mean something like this:

   for (i)
     for (j)
       A[i][j].X = 0;
       A[i][j].Y = 0;

It should be sufficient to reduce this to (just considering the inner 
loop):

   for (i)
     p = &A[i];   // this would be futher reduced later.
     for (j)
       p[j].X = 0;
       p[j].Y = 0;

(this needs to do the auto-cse of p into the two GEPs).  Further strength 
reduction should simplify this to:

   for (i)
     p = &A[i];   // this would be futher reduced later.
     for (j)
       p->X = 0;
       p->Y = 0;
       ++p;

... which is what we want.

> It doesn't appear to handle 
> explicit multiplications at all (probably doesn't add much value to do so).

True, this can be added later.  Other operations, including div and rem, 
can be strength reduced as well.

> It's a challenge :)

:)  I would suggest continuing Nate's approach.  Start with the most 
trivial of simple loops (e.g. for (i) A[i] = 0; ), and work up from there, 
adding complexity as necessary.

The idea should be to always have a pass that works, but continue to add 
new capabilities to it.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.cs.uiuc.edu/