[llvm-dev] [RFC] New pass: LoopExitValues

Mon Sep 14 14:18:48 PDT 2015

On Fri, Sep 11, 2015 at 10:06 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> Hi Steve,
>
> I believe that others have been looking recently at a very similar set of problems. For example, Wei Mi posted this patch:
>
>   http://reviews.llvm.org/D12090
>
> and I wonder if this would also address your use cases. If not, I think we'd like to understand why. Also, if these redundant expressions involve induction variables, then that's something that the IndVarSimplify is already supposed to do, and if it is missing cases, then we should improve that pass (and, thus, folding what you've done into IndVarSimplify might be the right way to go).
>
>  -Hal

I tried the patch on the matrix_mul testcase and saw where the
redundency was removed. My patch in http://reviews.llvm.org/D12090
cannot handle the case. I feel like it is a different problem.

void matrix_mul(unsigned int Size, unsigned int *Dst, unsigned int
*Src, unsigned int Val) {
  for (int Outer = 0; Outer < Size; ++Outer)
    for (int Inner = 0; Inner < Size; ++Inner)
       Dst[Outer * Size + Inner] = Src[Outer * Size + Inner] * Val;
}

"Outer * Size + Inner" is an induction variable in inner loop. Its
SCEV is {{0,+,%Size}<outerloop>,+,1}<innerloop>, so the initial value
of the induction variable for inner loop is {0, +, %Size}<outerloop>,
.i.e, an "lsr.iv.next = lsr.iv + %Size" will be generated at the end
of every iteration of outerloop to prepare the initial value of the
induction variable in the next outerloop iteration.

However, the iteration number of inner loop happens to be "Size" too,
so the exit value of "Outer * Size + Inner" in inner loop happens to
be the same with the initial value of "Outer * Size + Inner" in the
next iteration of outerloop. So the exit value of the induction
variable can be reused and "lsr.iv.next = lsr.iv + %Size" which is
used to compute the initial value of "Outer * Size + Inner" can be
omitted.

If I changed Inner < Size to Inner < Size1, compiler will generate the
same code w/wo the patches.

I tried gcc and gcc also couldn't remove the redundency too. Looks
like the optimization doesn't fit current LSR very well because it
needs to consider the relationship between initial and exit value of
the induction variable in different iterations of outerloop.

This is an interesting work because the code pattern to implement a
multi-dimensional array as a single linear array is common.

Thanks,
Wei.