[llvm-commits] [llvm] r122801 - /llvm/trunk/lib/Transforms/Scalar/CodeGenPrepare.cpp

Mon Jan 3 22:10:19 PST 2011

On Jan 3, 2011, at 9:42 PM, Cameron Zwarich wrote:

> On Jan 3, 2011, at 9:29 PM, Jakob Stoklund Olesen wrote:
>> 
>> It seems like a good idea to avoid allocating and freeing a DenseMap for every bitcast and cmp instruction. 403.gcc has 157000 of those.
> 
> Good idea. I'll test that now. It's a bit annoying that you can't rely on RAII with all of these early returns, but I guess we could have a little .clear() helper RAII object. ;-)

Don't bother. Just clear the maps before using them instead of after. Nobody will notice.

Note that 3 of the maps can share a single DenseMap<BasicBlock*, Instruction*>.
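
A minimal sketch of the clear-before-use idea, for illustration only. It is not the CodeGenPrepare code: Block, Inst, SinkHelper and sinkCopies are hypothetical names, and std::unordered_map stands in for DenseMap<BasicBlock*, Instruction*>. The map is a member that is cleared on entry, so early returns need no cleanup and no map is constructed or destroyed per call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for BasicBlock and Instruction.
struct Block { std::string Name; };
struct Inst { std::string Name; Block *Parent; };

class SinkHelper {
  // One map reused by every call (and shareable between several
  // optimizations that need the same key and value types).
  std::unordered_map<Block *, Inst *> InsertedInBlock;

public:
  // Records one copy of I per foreign user block.
  // Returns true if anything was recorded.
  bool sinkCopies(Inst *I, const std::vector<Inst *> &Users) {
    assert(I && "instruction must not be null");
    InsertedInBlock.clear(); // Clear before use, not after.

    if (Users.empty())
      return false; // Early return: nothing to clean up.

    bool Changed = false;
    for (Inst *U : Users) {
      if (U->Parent == I->Parent)
        continue; // Already in the defining block.
      Inst *&Slot = InsertedInBlock[U->Parent];
      if (!Slot) {
        Slot = I; // A real pass would create a copy here.
        Changed = true;
      }
    }
    return Changed;
  }

  std::size_t cachedEntries() const { return InsertedInBlock.size(); }
};
```

Stale entries from the previous call stay in the map until the next call clears them, which is harmless as long as nothing reads the map in between.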

>> Is the fix-point loop in CodeGenPrepare still necessary when critical edge splitting is disabled?
> 
> I've just been running some experiments on this. The fixed point loop is probably necessary for the 'ext' optimizations, as a lot of 'ext' casts get optimized after other instructions have been sunk into their block. On all of test-suite + SPEC2000 & SPEC2006, there are only 4 noop copies optimized in a later iteration (these don't really matter as they will be eliminated by the coalescer later), but there are 15 memory instructions that have their addressing code sunk into their BB in a later iteration. I was thinking of just iterating the ext optimizations afterwards, possibly based on a worklist, but it would be nice to know why these memory instructions have sinkable addressing code after the first iteration.

It is probably chained bitcast / ext / gep instructions getting lowered one at a time.

If that is the case, you could probably get away with iterating over each basic block separately instead of re-checking the whole function. That assumes that the chains to be lowered were already in the same basic block. I have no idea if that is generally true.

It would be safer and faster to add the operands of lowered instructions to a work list, but that is a bit more work to implement.
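
A rough sketch of the work list approach on a toy model, under loud assumptions: Node, tryLower and lowerWithWorklist are hypothetical names, each node has a single user, and "lowering" just means moving a node into its user's block. The real pass is more involved. The point is only that after the first sweep, the only nodes revisited are operands of nodes that actually changed.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical stand-in for an instruction; not an LLVM type.
struct Node {
  int Block = 0;                // Id of the basic block holding this node.
  Node *User = nullptr;         // Single user, a simplification.
  std::vector<Node *> Operands; // Nodes whose values this node consumes.
  bool Sinkable = true;         // False for, e.g., the memory access itself.
};

// Toy "lowering": move N into its user's block when the two differ.
static bool tryLower(Node *N) {
  assert(N && "null node on work list");
  if (!N->Sinkable || !N->User || N->Block == N->User->Block)
    return false;
  N->Block = N->User->Block;
  return true;
}

// Visits every node once, then revisits only the operands of nodes that
// were lowered. Returns the number of nodes lowered.
unsigned lowerWithWorklist(const std::vector<Node *> &Function) {
  std::deque<Node *> Worklist(Function.begin(), Function.end());
  unsigned NumLowered = 0;
  while (!Worklist.empty()) {
    Node *N = Worklist.front();
    Worklist.pop_front();
    if (!tryLower(N))
      continue;
    ++NumLowered;
    // Lowering N may have made its operands lowerable too.
    for (Node *Op : N->Operands)
      Worklist.push_back(Op);
  }
  return NumLowered;
}
```

For a chain ext -> bitcast -> gep -> load with only the load in another block, this lowers the gep in the initial sweep and then the bitcast and the ext from the work list, without re-scanning the rest of the function.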

> As an aside, I tried adding another pass of CFG optimizations (which is probably not there because it would reverse the critical edge splitting), and it merges a decent number of extra blocks.

That makes sense.

I see critical edge splitting during register allocation in our future ;-)

/jakob
