[LLVMdev] Landing my new development on the trunk ...

Evan Cheng evan.cheng at apple.com
Fri Oct 29 01:31:51 PDT 2010


Eli is right. We do need to see some benchmark numbers and understand how the pass will fit into the target-independent optimizer. While we encourage contributions, we typically don't commit new passes unless they introduce new functionality that has active clients. It would also help if you provided us with compile-time numbers.

Evan

On Oct 28, 2010, at 10:56 AM, Eli Friedman wrote:

> On Thu, Oct 28, 2010 at 9:38 AM, Brian West <bnwest at rice.edu> wrote:
>>> 3. LLVM already has a significant amount of infrastructure for loop
>>> passes; why does this pass have its own code for finding loops?
>> 
>> I saw the loop infrastructure for CFG loops. This algorithm finds loops in
>> the data flow (more precisely: strongly-connected components in the
>> SSA-graph), e.g.
>> 
>> bb5:
>> %20 = add nsw i32 %i.0, 1
>> br label %bb6
>> 
>> bb6:
>> %i.0 = phi i32 [ 0, %entry ], [ %20, %bb5 ]
>> 
>> There is a data flow loop between %20 and %i.0. The OSR paper has a nice
>> figure showing data flow loops.
>> 
>> Here is small excerpt from the OSR paper:
>> 
>> "OSR is driven by a simple depth-first search of the SSA-graph, using
>> Tarjan’s strongly-connected component finder. The SCC-finder lets OSR
>> discover induction variables in topological order and process them as they
>> are discovered. As the SCCs are discovered, they are processed by a set of
>> mutually-recursive routines that test for induction variables and region
>> constants, and then reduce the appropriate operations."
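[Editorial aside: to make the excerpt concrete, here is a minimal Python sketch of Tarjan's SCC finder run over a toy SSA def-use graph built from the IR above. This is not OSR's actual code; the node names and edge direction are assumptions for illustration.]

```python
# Sketch of Tarjan's strongly-connected component finder, applied to a
# tiny graph modeling the data-flow loop between %i.0 and %20 above.

def tarjan_scc(graph):
    """Return the SCCs of `graph` (dict: node -> list of successors).
    Tarjan's algorithm emits SCCs in reverse topological order."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    sccs = []
    counter = [0]

    def strongconnect(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                strongconnect(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:       # v is the root of an SCC
            scc = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.append(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in graph:
        if v not in index:
            strongconnect(v)
    return sccs

# SSA graph for the example: the phi %i.0 uses %20, and the add %20
# uses %i.0, so the two values form one strongly-connected component.
ssa_graph = {
    "%i.0": ["%20"],
    "%20": ["%i.0"],
}
print(tarjan_scc(ssa_graph))  # a single SCC containing both values
```

On a loop-free def-use graph every SCC is a single node, which is how the search distinguishes candidate induction variables (non-trivial SCCs) from everything else.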
> 
> Ah, okay, that makes sense.  I didn't really read the algorithm too carefully...
> 
>>> 4. What's the use case for this pass?  Can this pass improve
>>> performance of generated code beyond the existing -loop-reduce pass?
>>> If not, could it usefully act as a faster replacement for -loop-reduce
>>> for fast code generation? (Perhaps someone doing fast code generation
>>> with LLVM could comment on whether -loop-reduce is useful, and whether
>>> the performance is an issue.)
> [...]
>> For array accesses, OSR and LSR produce similar LLVM IR, and I am guessing
>> that their execution times should be similar.
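[Editorial aside: for readers unfamiliar with the transformation both passes perform, here is a hedged Python sketch of strength-reducing an array-address computation. The base address and 4-byte stride are illustrative assumptions, not output of either pass.]

```python
# Before: each iteration recomputes the address with a multiply.
def addresses_naive(base, n, stride=4):
    return [base + i * stride for i in range(n)]

# After: the multiply is replaced by an additive induction variable,
# which is the essence of operator strength reduction.
def addresses_reduced(base, n, stride=4):
    out = []
    addr = base            # new induction variable
    for _ in range(n):
        out.append(addr)
        addr += stride     # add replaces multiply
    return out
```

Both functions produce the same address sequence; the reduced form trades a multiply per iteration for an add, which is what makes the transformation profitable on most targets.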
> 
> For normal compilation, LSR runs rather late (with the code generation
> passes), and has some target-specific knowledge.  We really need
> benchmarks here, I think.
> 
>> Empirically, OSR is faster than LSR at compile time. I have also noticed
>> that OSR has more "analysis" requirements: Induction Variable Users, Natural
>> Loop Information, Canonicalize natural loops, and Scalar Evolution Analysis.
>> Both OSR and LSR require the Dominator Tree Construction analysis pass.
> 
> My primary concern here is that if this code gets committed without
> anyone interested in actually using it, it will just end up orphaned,
> so there's no point to contributing it.
> 
>> I did not mention in the original email (and should have) that OSR needs
>> -instcombine to be run after it for cleanup. Also, -licm, -reassociate,
>> -gvn, and -sccp can be enabling optimizations for OSR.
> 
> Hmm... perhaps that could be partially fixed using the InstSimplify
> infrastructure.
> 
> -Eli
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




