[llvm-commits] [llvm] r122801 - /llvm/trunk/lib/Transforms/Scalar/CodeGenPrepare.cpp

Mon Jan 3 21:42:39 PST 2011

On Jan 3, 2011, at 9:29 PM, Jakob Stoklund Olesen wrote:

> On Jan 3, 2011, at 8:43 PM, Cameron Zwarich wrote:
> 
>> Author: zwarich
>> Date: Mon Jan  3 22:43:31 2011
>> New Revision: 122801
>> 
>> URL: http://llvm.org/viewvc/llvm-project?rev=122801&view=rev
>> Log:
>> Avoid finding loop back edges when we are not splitting critical edges in
>> CodeGenPrepare (which is the default behavior).
> 
> Thanks, Cameron.
> 
> I noticed that there are a number of local DenseMap instances as well. It may be worthwhile to promote them to class members to avoid repeated allocations.
> 
>  DenseMap<Value*, Value*> SunkAddrs;
>  DenseMap<BasicBlock*, Instruction*> InsertedTruncs;
> 
> These two are in static functions that would have to be promoted to methods:
> 
>  DenseMap<BasicBlock*, CastInst*> InsertedCasts;
>  DenseMap<BasicBlock*, CmpInst*> InsertedCmps;
> 
> It seems like a good idea to avoid allocating and freeing a DenseMap for every bitcast and cmp instruction. 403.gcc has 157000 of those.

Good idea. I'll test that now. It's a bit annoying that you can't rely on RAII with all of these early returns, but I guess we could have a little .clear() helper RAII object. ;-)
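
A minimal sketch of such a helper, under stated assumptions: the names ClearOnExit, FakePass and optimizeOne are hypothetical, and std::unordered_map stands in for DenseMap so the example compiles without LLVM headers. It only illustrates the scope-guard idea and is not code from CodeGenPrepare.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical scope guard: clears the referenced container when the
// enclosing scope exits, whichever return statement is taken.
template <typename MapT>
class ClearOnExit {
  MapT &Map;

public:
  explicit ClearOnExit(MapT &M) : Map(M) {}
  ~ClearOnExit() { Map.clear(); }

  // Non-copyable, so the container is cleared exactly once per scope.
  ClearOnExit(const ClearOnExit &) = delete;
  ClearOnExit &operator=(const ClearOnExit &) = delete;
};

// Hypothetical stand-in for a pass that keeps the map as a class member
// instead of constructing a fresh local map on every call.
struct FakePass {
  // Stand-in for something like DenseMap<BasicBlock*, CastInst*>.
  std::unordered_map<int, int> InsertedCasts;

  // Returns true if it "changed" something. Note the early return.
  bool optimizeOne(int Key, bool BailEarly) {
    ClearOnExit<std::unordered_map<int, int>> Guard(InsertedCasts);
    assert(InsertedCasts.empty() && "map should be empty on entry");

    InsertedCasts[Key] = Key * 2;
    if (BailEarly)
      return false; // Guard still clears the map here.

    InsertedCasts[Key + 1] = 0;
    return true;
  }
};
```

With a guard like this, every exit path leaves the member map empty for the next call. Whether clear() keeps the underlying storage around for reuse depends on the container, so any saving would need to be measured.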

> Is the fixed-point loop in CodeGenPrepare still necessary when critical edge splitting is disabled?

I've just been running some experiments on this. The fixed-point loop is probably necessary for the 'ext' optimizations, as a lot of 'ext' casts get optimized after other instructions have been sunk into their block. On all of test-suite + SPEC2000 & SPEC2006, there are only 4 no-op copies optimized in a later iteration (these don't really matter, as they will be eliminated by the coalescer later), but there are 15 memory instructions that have their addressing code sunk into their BB in a later iteration. I was thinking of just iterating the ext optimizations afterwards, possibly based on a worklist, but it would be nice to know why these memory instructions have sinkable addressing code after the first iteration.
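
For illustration, a worklist-driven iteration could look roughly like the sketch below. Everything here is an assumption: runWorklist, Users and TryOptimize are hypothetical names, items are plain indices standing in for instructions, and none of this is taken from CodeGenPrepare. The idea is that only items whose inputs changed get revisited, instead of rescanning the whole function until nothing changes.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

// Hypothetical worklist driver.
//   Users[I]       - items that may become optimizable once item I changes.
//   TryOptimize(I) - returns true if item I was changed by this visit.
// Returns the number of TryOptimize calls made. Terminates only if
// TryOptimize eventually stops reporting changes.
std::size_t runWorklist(std::size_t NumItems,
                        const std::vector<std::vector<std::size_t>> &Users,
                        const std::function<bool(std::size_t)> &TryOptimize) {
  assert(Users.size() == NumItems && "one user list per item");

  // Seed the worklist with every item once.
  std::deque<std::size_t> Worklist;
  std::vector<bool> InWorklist(NumItems, true);
  for (std::size_t I = 0; I != NumItems; ++I)
    Worklist.push_back(I);

  std::size_t Visits = 0;
  while (!Worklist.empty()) {
    std::size_t I = Worklist.front();
    Worklist.pop_front();
    InWorklist[I] = false;
    ++Visits;

    if (!TryOptimize(I))
      continue;

    // Item I changed, so requeue only the items that depend on it.
    for (std::size_t U : Users[I]) {
      if (!InWorklist[U]) {
        InWorklist[U] = true;
        Worklist.push_back(U);
      }
    }
  }
  return Visits;
}
```

In this shape, an 'ext' cast would be listed as a user of whatever sinking step can expose it, so it is revisited exactly when that step fires.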

As an aside, I tried adding another pass of CFG optimizations (which is probably not there because it would reverse the critical edge splitting), and it merges a decent number of extra blocks.

Cameron