[LLVMdev] Landing my new development on the trunk ...

Thu Oct 28 10:56:25 PDT 2010

On Thu, Oct 28, 2010 at 9:38 AM, Brian West <bnwest at rice.edu> wrote:
>> 3. LLVM already has a significant amount of infrastructure for loop
>> passes; why does this pass have its own code for finding loops?
>
> I saw the loop infrastructure for CFG loops. This algorithm finds loops in
> the data flow (more precisely: strongly-connected components in the
> SSA-graph), e.g.
>
> bb5:
> %20 = add nsw i32 %i.0, 1
> br label %bb6
>
> bb6:
> %i.0 = phi i32 [ 0, %entry ], [ %20, %bb5 ]
>
> There is a data flow loop between %20 and %i.0. The OSR paper has a nice
> figure showing data flow loops.
>
> Here is small excerpt from the OSR paper:
>
> "OSR is driven by a simple depth-first search of the SSA-graph, using
> Tarjan’s strongly-connected component finder. The SCC-finder lets OSR
> discover induction variables in topological order and process them as they
> are discovered. As the SCCs are discovered, they are processed by a set of
> mutually-recursive routines that test for induction variables and region
> constants, and then reduce the appropriate operations."

Ah, okay, that makes sense.  I didn't really read the algorithm too carefully...

>> 4. What's the use case for this pass?  Can this pass improve
>> performance of generated code beyond the existing -loop-reduce pass?
>> If not, could it usefully act as a faster replacement for -loop-reduce
>> for fast code generation? (Perhaps someone doing fast code generation
>> with LLVM could comment on whether -loop-reduce is useful, and whether
>> the performance is an issue.)
[...]
> For array accesses, OSR and LSR produces similar LLVM IR and I am guessing
> that their execution times should be similar.

For normal compilation, LSR runs rather late (with the code generation
passes), and has some target-specific knowledge.  We really need
benchmarks here, I think.

> Empirically the OSR optimization is compile-time faster than LSR. I have
> also noticed that OSR has more "analysis" requirements: Induction Variable
> User, Natural Loop Information, Canonicalize natural loops, and Scalar
> Evolution Analysis. Both OSR and LSR require the Dominator Tree Construction
> analysis pass.

My primary concern here is that if this code gets committed without
anyone interested in actually using it, it will just end up orphaned,
so there's no point to contributing it.

> I did not mention in the original email (and should have) that OSR needs
> -instcombine to be run after it for cleanup. Also -licm, -reassociate, -gvn
> and -sccp can be enabling optimizations for OSR.

Hmm... perhaps that could be partially fixed using the InstSimplify
infrastructure.

-Eli