[LLVMdev] mem2reg optimization
nicolas at capens.net
Mon Oct 6 22:58:33 PDT 2008
Sounds like an excellent generic approach. I'd love to lend a hand to
implement it, but I'm afraid it involves a little more in-depth LLVM
knowledge than what I currently master, so I'll leave it to Dave and you for now.
Talking about performance, for me this optimization was the difference
between mem2reg taking 63% of execution time, and having it totally
'vanish'. It also marked the transition from preferring a custom JIT
compiler to getting really excited about using LLVM instead. :) Previously I
had just assumed LLVM was inherently slow, but now I realize there's still
potential for optimization.
Do you consider it useful if I go ahead and further identify bottlenecks and
report them or would you rather concentrate on correctness bugs for now? Is
there any list of known optimization opportunities?
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Chris Lattner
Sent: Sunday, 05 October, 2008 00:05
To: LLVM Developers Mailing List
Subject: Re: [LLVMdev] mem2reg optimization
On Oct 4, 2008, at 2:51 PM, Chris Lattner wrote:
>>> I like your approach of using the use lists, but I'm not sure the
>>> ordering is guaranteed. If it is, your approach is superior.
>> I got my patch updated to work with TOT. Here it is. Comments?
> Hi Dave,
> Great. I'd like to get this in, but would really like to do a profile
> first. Could you please include a testcase? Even if it is something
> contrived with llvm-gcc, I would really like to understand what is
> going on.
Looking more closely at the approach, I infer that the (logical in
hindsight) issue is the cases where mem2reg scans a BB to see the
relative ordering of two instructions. For example, the single store
case. I could imagine that if you saw lots of single store allocas in
the middle of a large BB, mem2reg would be very, very slow.
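The slow path being described can be sketched in standalone C++ (a toy model for illustration, not mem2reg's actual code; the `Inst` type and function name are assumptions): deciding which of two instructions comes first by walking the block from the top costs O(block size) per query, and doing this once per alloca in the same large block makes the total work quadratic.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for an instruction in a single basic block.
struct Inst { int Id; };

// Naive ordering check: walk the block until we hit A or B.
// Each query is O(n) in the block size; repeated for many
// single-store allocas in one large block, the work is O(n^2).
bool comesBeforeLinear(const std::vector<Inst *> &Block,
                       const Inst *A, const Inst *B) {
    for (const Inst *I : Block) {
        if (I == A) return true;  // A seen first: A comes before B
        if (I == B) return false; // B seen first: A does not
    }
    return false; // neither found; treat as "not before"
}
```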
As far as the approach goes, here's a counterproposal that is a bit
less complex. I suggest adding a single DenseMap<Instruction*,
unsigned> to Mem2reg. If there is an entry in this map for a load/
store instruction, it would store the instruction's index in its basic
block. This means that to check dominance of two instrs in a block,
you just need to compare their two indices (which is fast). Really
small blocks (< 10 instructions?) would never be entered into the map,
to keep the size of the map low. The first time a query is made within a
large block, it would compute the indices of all loads and stores
within the block.
The nice thing about using a single map is that it makes updating the
map really trivial. Just use TheMap.erase(I) when deleting load and
store instructions. The numbering can be purely local to a block, and
the relative ordering of memops doesn't get invalidated when a load or
store is nuked.
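The scheme above might look roughly like this as standalone C++ (a sketch under stated assumptions, not LLVM's actual mem2reg code: the class name, the `comesBefore` query, and the small-block threshold are all illustrative, and a plain `unordered_map` stands in for `DenseMap`):

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for an LLVM instruction in a single basic block.
struct Inst {
    std::string Name;
    bool IsMemOp; // true for loads/stores
};

// Lazily assigns each memory op its index within the block, so
// "does A come before B?" becomes an O(1) index comparison instead
// of a linear scan of the block.
class MemOpOrdering {
    std::unordered_map<const Inst *, unsigned> Indices;
    bool Numbered = false;
    static constexpr size_t SmallBlock = 10; // assumed threshold

public:
    // Returns true if memory op A appears before memory op B in Block.
    bool comesBefore(const std::vector<Inst *> &Block,
                     const Inst *A, const Inst *B) {
        if (Block.size() < SmallBlock) {
            // Small block: a direct scan is cheap; keep the map empty.
            for (const Inst *I : Block) {
                if (I == A) return true;
                if (I == B) return false;
            }
            return false;
        }
        if (!Numbered) {
            // First query in a large block: number all loads/stores once.
            unsigned Idx = 0;
            for (const Inst *I : Block)
                if (I->IsMemOp)
                    Indices[I] = Idx++;
            Numbered = true;
        }
        return Indices.at(A) < Indices.at(B);
    }

    // When a load/store is deleted, just drop its entry; the remaining
    // indices keep their relative order, so no renumbering is needed.
    void eraseInst(const Inst *I) { Indices.erase(I); }
};
```

Note how deletion really is the trivial case: erasing one entry leaves gaps in the numbering, but gaps don't affect `<` comparisons between the surviving entries.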
Does this seem like a reasonable approach? If so, please split the
various parts of mem2reg out into helper methods as a prepatch. This
will make review of the subsequent changes much easier.
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu