[LLVMdev] generate llvm.assume calls in GVN?

Fri Jan 16 10:39:20 PST 2015

The general problem of exploiting path sensitive facts for optimization 
is a very open one.  I've been thinking about this myself a bit 
recently, but haven't come up with a particularly workable approach 
yet.  The major problem is finding the right trade off between compile 
time and optimization quality.

One simplification I think I've talked myself into is completely 
ignoring merging of conditions (i.e. dataflow) and simply rely on 
dominating conditions.  At least so far, this seems to handle most of 
the cases I care about.  (I'm mostly looking at cases involving either 
null checks or range checks.)

I've more or less convinced myself that any approach taken would have to 
be lazy, and make aggressive use of analysis caching.  The only 
exception to this might be a early dominance based early propagation 
restricted to specific narrow cases.  (Either an extension to EarlyCSE, 
or something similar.)

In an ideal world, the analysis results would feed all other 
optimization passes (in particular, InstSimplify and InstCombine). Not 
sure if this is doable for version one though.

It's also worth mentioning that a between LVI, GVN, & LoopUnswitch, we 
actually do get a lot of conditions inside loops. In particular, any 
check inside a loop *header* block with an immediately dominating check 
outside the loop is generally taken care of.

Philip

On 01/16/2015 08:46 AM, Sanjay Patel wrote:
> Thanks, LLVM Ol' Timers! Your ghost stories are scary enough that I'll 
> tread quietly past predsimplify's grave.
>
> I was peeking into this because - and I'm happy to be corrected if 
> I've misdiagnosed these - we've seen a few VRP-related bugs cropping 
> up lately:
> 1. 
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/079038.html 
> (remove alignment checks)
> 2. 
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-December/079345.html 
> (remove tag bit tests)
> 3. http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-January/080314.html 
> (remove FP range checks, patch submitted)
> 4. http://llvm.org/bugs/show_bug.cgi?id=17683 (range check to improve 
> division)
> 5. http://llvm.org/bugs/show_bug.cgi?id=17713 (propagate FP 
> equalities, patch committed)
>
>
> On Thu, Jan 15, 2015 at 3:42 PM, Nick Lewycky <nlewycky at google.com 
> <mailto:nlewycky at google.com>> wrote:
>
>     On 15 January 2015 at 10:49, Chandler Carruth
>     <chandlerc at google.com <mailto:chandlerc at google.com>> wrote:
>
>         On Thu, Jan 15, 2015 at 10:30 AM, Sanjay Patel
>         <spatel at rotateright.com <mailto:spatel at rotateright.com>> wrote:
>
>             Would it be wrong to generate the llvm.assume IR suggested
>             below? in GVN?
>
>
>         I think so... Because:
>
>                 One very small tweak you could make would be to add an
>                 llvm.assume inside the if case of the if condition. 
>                 (Yes, that is as silly as it sounds.)  I tested that,
>                 and it did optimize as expected.  It's essentially
>                 working around a deficiency in the optimizer around
>                 path constraints.
>
>
>         Once upon a time, there was an optimization for this called
>         predsimplify. I believe Nick can tell you a long, glorious,
>         tragic tale about its life and ultimate death.
>
>
>     Okay, gather 'round the fire. ;-)
>
>     Predsimplify was the first optimization pass I tried to write, to
>     fix PR807. The basis was that it would propagate properties
>     (icmp's between two SSA values) down paths in the domtree. In its
>     various rewrites it knew all sorts of tricks. For instance it
>     could do:
>
>       a = b + c
>       if b == 5:
>         if c == 3:
>           print a   # replace 'a' with 8
>
>     which requires that it solve a bidirectional dataflow problem. It
>     also knew that a < b implied b > 0, and a < b < c implies that c >
>     1. Some of the things predsimplify used to do are now implemented
>     inside gvn's processEquality. Predsimplify also had an odd trick:
>
>       a = phi [0, a.i]
>       b = phi [10, x]
>       if a == 1:
>         # here we know that b == x. (x could still equal 10 though.)
>
>     which I haven't seen replicated elsewhere in llvm.
>
>     I went through many iterations of the implementation trying to
>     minimize the compile time impact. So why predsimplify die?
>
>     The first largest problem is that it rarely fired. It turns out
>     that people don't write code which needs this sort of optimization
>     in C very often, or at least in the llvm test suite. Instcombine
>     already does a ton of optimization that overlapped with
>     predsimplify in its demanded bits analysis, and that does a much
>     better job of getting the cases that show up in real-world code.
>     The cost of the analysis that predsimplify was doing, eagerly
>     building up its value graph was too expensive to fit in llvm,
>     especially when instcombine's demanded bits analysis tended to get
>     the cases already. If your IR looks different, it might be worth
>     adding to the pass pipeline for you.
>
>     I think a future VRP (value range propagation) pass would need to
>     be a superset of simplify-demanded-bits, so that we could remove
>     the cost of running simplify-demanded-bits to help pay for the
>     VRP. I've been thinking about the data structure we could use for
>     that, but haven't worked out all the operations yet.
>
>     Nick
>
>         In particular:
>
>
>                 define void @example_tweaked(i64* %x0) #0 {
>                   %1 = load i64* %x0, align 8
>                   %2 = and i64 %1, 3
>                   %3 = icmp eq i64 %2, 2
>                   br i1 %3, label %4, label %8
>                 ; <label>:4 ; preds = %0
>                   call void @llvm.assume(i1 %3)
>
>
>         We don't need the assume here. If we want to do
>         predicate-based simplification, we can just have <whatever>
>         look at dominating branch conditions, much like it looks at
>         dominating assume calls. Worst case is that it requires more
>         fancy-ness in the assumption tracking infrastructure to
>         actually funnel queries through to different ways of saying
>         "this value is true at this point".
>
>         However, that is the core essence of predsimplify.
>         Historically, we have found a really terrifying amount of
>         complexity and compile time hit for relatively small gains in
>         terms of generated code quality. =/ Maybe we just need to
>         handle the "obvious" cases much like the very limited calls to
>         assumption tracker do, but I think this path needs to be trod
>         with great care.
>
>
>                   %5 = and i64 %1, -4 ; this should be optimized away
>                   %6 = or i64 %5, 2 ; this should be optimized away
>                   %7 = add nsw i64 %6, 12
>                   store i64 %7, i64* %x0, align 8
>                   ret void
>                 ; <label>:8 ; preds = %0
>                   tail call void @go_error() #2
>                   unreachable
>                 }
>
>                 declare void @llvm.assume(i1)
>
>
>>
>>                 Thanks for any ideas,
>>                 /Lars
>>
>>                 /***************************************************/
>>
>>                 /*  The two LSB of x0 are 'tag bits'  */
>>                 /*  that we want to manipulate.     */
>>                 extern long x0;
>>
>>                 void go_error(void) __attribute__ ((noreturn));
>>
>>                 void example_not_optimized(void)
>>                 {
>>                   if((x0 & 3) == 2) {
>>                     // Here the tag bits are removed and added
>>                     // with bitwise 'and' and 'or'.
>>                     x0 = ((x0 & ~3) | 2) + 12;
>>                   } else {
>>                 go_error();
>>                   }
>>                 }
>>
>>                 /*
>>                 define void @example_not_optimized() #0 {
>>                   %1 = load i64* @x0, align 8, !tbaa !1
>>                   %2 = and i64 %1, 3
>>                   %3 = icmp eq i64 %2, 2
>>                   br i1 %3, label %4, label %8
>>
>>                 ; <label>:4           ; preds = %0
>>                   %5 = and i64 %1, -4  ; this should be optimized away
>>                   %6 = or i64 %5, 2              ; this should be
>>                 optimized away
>>                   %7 = add nsw i64 %6, 12
>>                   store i64 %7, i64* @x0, align 8, !tbaa !1
>>                   ret void
>>
>>                 ; <label>:8           ; preds = %0
>>                   tail call void @go_error() #2
>>                 unreachable
>>                 }
>>                 */
>>
>>
>>                 void example_optimized(void)
>>                 {
>>                   if((x0 & 3) == 2) {
>>                     // Here the tag bits are removed and added
>>                     // with subtraction and addition.
>>                     x0 = (x0 - (x0 & 3) + 2) + 12;
>>                   } else {
>>                 go_error();
>>                   }
>>                 }
>>
>>                 /*
>>                 define void @example_optimized() #0 {
>>                   %1 = load i64* @x0, align 8, !tbaa !1
>>                   %2 = and i64 %1, 3
>>                   %3 = icmp eq i64 %2, 2
>>                   br i1 %3, label %4, label %6
>>
>>                 ; <label>:4           ; preds = %0
>>                   %5 = add i64 %1, 12
>>                   store i64 %5, i64* @x0, align 8, !tbaa !1
>>                   ret void
>>
>>                 ; <label>:6           ; preds = %0
>>                   tail call void @go_error() #2
>>                 unreachable
>>                 }
>>
>>                  */
>>
>>
>>
>>                 _______________________________________________
>>                 LLVM Developers mailing list
>>                 LLVMdev at cs.uiuc.edu  <mailto:LLVMdev at cs.uiuc.edu>          http://llvm.cs.uiuc.edu
>>                 http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
>                 _______________________________________________
>                 LLVM Developers mailing list
>                 LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
>                 http://llvm.cs.uiuc.edu
>                 http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
>
>             _______________________________________________
>             LLVM Developers mailing list
>             LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
>             http://llvm.cs.uiuc.edu
>             http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
>
>         _______________________________________________
>         LLVM Developers mailing list
>         LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
>         http://llvm.cs.uiuc.edu
>         http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150116/f685c5c3/attachment.html>