[llvm] r179957 - SimplifyCFG: If convert single conditional stores

Wed Apr 24 11:28:26 PDT 2013

On 4/24/13 10:56 AM, Chris Lattner wrote:
> Sorry for chiming in late on this thread.
>
> MHO is that this is still better to do on IR than in codegen.  Of 
> course, I'm also of the opinion that we should take Chandler's 
> improvement as well (properly done, it's been long enough ago that I 
> forget the specific objections) because making the compiler better 
> shouldn't wait for some theoretical improvement that never happens. 
>  More below.
>
> On Apr 21, 2013, at 1:21 AM, Andrew Trick <atrick at apple.com> wrote:
>> This is different than Chandler's case, because we know the MI 
>> "early" if-converter doesn't currently handle Arnold's optimization. 
>> Also, Arnold has not yet proposed any target level heuristics that 
>> attempt to predict cpu behavior, which was the main objection.
>
> This specific transformation (if converting stores) has two goals: 1) 
> improve micro architectural performance characteristics (less branch 
> prediction etc), and 2) enable mid-level optimizations to remove loads 
> and stores.
>
> In my mental cost model, eliminating loads and stores is always 
> goodness: it is generally always good for performance, and it unblocks 
> other secondary optimizations.  They are a great canonicalization.
>
> Because this is a canonicalization of this sort, it seems clearly good 
> to do on IR, and early.  Doing something like this at the codegen 
> level specifically for micro-architectural reasons could also make 
> sense, but I don't see that eliminating the usefulness of doing it 
> early as well.
Introducing a "select" at the IR level does not necessarily mean that 
CodeGen will lower the "select" to a predicated instruction like cmov.
cmov is not necessarily inexpensive; for example, on Pentium 4, the 
latency of cmov is about 10+ cycles.
If the compiler blindly converts a well-predicted branch to cmov on 
this platform, it will only see degradation.
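
To make the distinction concrete, here is a minimal C sketch (the 
function names are hypothetical, chosen only for illustration). The 
ternary form is roughly what an IR-level "select" expresses; whether the 
backend turns it into cmov, a branch, or something else is decided by 
the target's lowering, not by the source shape.

```c
#include <assert.h>

/* Branchy form: the value is chosen by control flow. */
static int pick_branch(int cond, int a, int b)
{
    if (cond)
        return a;
    return b;
}

/* Straight-line form: roughly what an IR-level "select" expresses.
   How this is lowered (cmov, branch, ...) is target-dependent. */
static int pick_select(int cond, int a, int b)
{
    return cond ? a : b;
}

/* Both forms must agree for every input; only their cost may differ. */
static void check_pick_forms_agree(void)
{
    for (int cond = 0; cond <= 1; ++cond)
        assert(pick_branch(cond, 10, 20) == pick_select(cond, 10, 20));
}
```

The two functions are semantically identical, so the choice between 
them is purely a cost question.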

That said, I think it makes some sense to force if-conversion at the IR 
level if the algorithm relies on straight-line code.

>
>> That said, we should have a reason to if-convert before lowering 
>> other than optimizing for a machine's cpu pipeline.
>>
>> Are we all convinced that if-converting a single store is the proper 
>> canonical form?
>
> I am, at least in this specific benchmark's case.  You *can't* legally 
> do the if conversion if you are introducing a memory access that 
> otherwise would not have happened.  Doing this can have lots of semantic 
> effects.  In order for this to be *legal* at all (ignoring 
> profitability) you have to prove that a subsequent store is happening 
> to the memory location.
>
> In this case, the *profitability* comes down to being able to 
> obviously, locally, eliminate a load from the address.  It's true that 
> GVN/PRE can eliminate part of the load in principle, but in practice 
> this doesn't happen.
GVN gets rid of all the loads in the cases this if-cvt is trying to catch.
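
For reference, a C sketch of the kind of case under discussion, 
assuming the conditional store is preceded by an unconditional store to 
the same address. The function names are hypothetical and this is not 
the code from r179957; it only illustrates the shape of the code before 
and after if-converting the store.

```c
#include <assert.h>

/* Before: the second store is conditional and the value is reloaded. */
static int store_branchy(int *p, int cond, int a, int b)
{
    *p = a;          /* unconditional store to the location */
    if (cond)
        *p = b;      /* single conditional store */
    return *p;       /* load from the same address */
}

/* After: one unconditional store of a selected value. No new memory
   access is introduced, because *p was already written on every path.
   The stored value is known, so no load remains. */
static int store_if_converted(int *p, int cond, int a, int b)
{
    int v = cond ? b : a;
    *p = v;
    return v;
}
```

In both forms the location ends up holding the same value and the same 
value is returned; what differs is the control flow and whether a load 
is left behind.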