[LLVMdev] Area for improvement
Jeff Cohen
jeffc at jolt-lang.org
Mon Feb 21 23:25:55 PST 2005
Well, what about this?
for (i)
for (j)
A[i][j].X = 0;
if (func(j))
A[i][j].Y = 0;
else
A[i][j].Z = 0;
This is CSE across multiple basic blocks, i.e. GCSE.
Or even:
for (i)
for (j)
A[i][j].X = 0;
if (func(j))
A[i-1][j].Y = 0;
else
A[i+1][j].Z = 0;
Much easier if the GEPs have been decomposed first and the usual global
opts already performed :)
Chris Lattner wrote:
>> Consider the bytecode I originally posted for the nested for loop.
>> If the getelementptrs haven't been replaced with something simpler,
>> LSR will have to take them apart itself. The multiplication cannot
>> be replaced with addition without the two-dimensional access being
>> replaced with one-dimensional access plus some pointer arithmetic. I
>> would guess that they will not have been decomposed before LSR is
>> run, correct?
>
>
> No, this is exactly what LSR should do in this case.
>
>> It also appears the current LSR reduces each GEP in isolation of all
>> the others. That clearly is bad for this example, as all the GEPs in
>> this loop do the same thing with the induction variable. Creating a
>> separate pointer for each one only makes things worse. Ordinarily
>> the addressing is explicit so the redundancy would have been removed
>> and the LSR algorithm doesn't have to worry about it, but here it
>> does. Bottom line is it has to strength reduce operations that are
>> implicit, so it first has to determine what those operations are and
>> detect common subexpressions among those operations itself (possibly
>> across multiple basic blocks).
>
>
> Yes, if you mean something like this:
>
> for (i)
> for (j)
> A[i][j].X = 0;
> A[i][j].Y = 0;
>
> It should be sufficient to reduce this to (just considering the inner
> loop):
>
> for (i)
> p = &A[i]; // this would be futher reduced later.
> for (j)
> p[j].X = 0;
> p[j].Y = 0;
>
> (this needs to do the auto-cse of p into the two GEPs). Further
> strength reduction should simplify this to:
>
> for (i)
> p = &A[i]; // this would be futher reduced later.
> for (j)
> p->X = 0;
> p->Y = 0;
> ++p;
>
> ... which is what we want.
>
>> It doesn't appear to handle explicit multiplications at all (probably
>> doesn't add much value to do so).
>
>
> True, this can be added later. Other operations, including div and
> rem, can be strength reduced as well.
>
>> It's a challenge :)
>
>
> :) I would suggest continuing Nate's approach. Start with the most
> trivial of simple loops (e.g. for (i) A[i] = 0; ), and work up from
> there, adding complexity as necessary.
>
> The idea should be to always have a pass that works, but continue to
> add new capabilities to it.
>
> -Chris
>
More information about the llvm-dev
mailing list