[LLVMdev] generate llvm.assume calls in GVN?

Fri Jan 16 08:46:30 PST 2015

Thanks, LLVM Ol' Timers! Your ghost stories are scary enough that I'll
tread quietly past predsimplify's grave.

I was peeking into this because - and I'm happy to be corrected if I've
misdiagnosed these - we've seen a few VRP-related bugs cropping up lately:
1. http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/079038.html
(remove alignment checks)
2. http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-December/079345.html
(remove tag bit tests)
3. http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-January/080314.html
(remove FP range checks, patch submitted)
4. http://llvm.org/bugs/show_bug.cgi?id=17683 (range check to improve
division)
5. http://llvm.org/bugs/show_bug.cgi?id=17713 (propagate FP equalities,
patch committed)

On Thu, Jan 15, 2015 at 3:42 PM, Nick Lewycky <nlewycky at google.com> wrote:

> On 15 January 2015 at 10:49, Chandler Carruth <chandlerc at google.com>
> wrote:
>
>> On Thu, Jan 15, 2015 at 10:30 AM, Sanjay Patel <spatel at rotateright.com>
>> wrote:
>>
>>> Would it be wrong to generate the llvm.assume IR suggested below? in GVN?
>>>
>>
>> I think so... Because:
>>
>> One very small tweak you could make would be to add an llvm.assume inside
>>>> the if case of the if condition.  (Yes, that is as silly as it sounds.)  I
>>>> tested that, and it did optimize as expected.  It's essentially working
>>>> around a deficiency in the optimizer around path constraints.
>>>>
>>>
>> Once upon a time, there was an optimization for this called predsimplify.
>> I believe Nick can tell you a long, glorious, tragic tale about its life
>> and ultimate death.
>>
>
> Okay, gather 'round the fire. ;-)
>
> Predsimplify was the first optimization pass I tried to write, to fix
> PR807. The basis was that it would propagate properties (icmp's between two
> SSA values) down paths in the domtree. In its various rewrites it knew all
> sorts of tricks. For instance it could do:
>
>   a = b + c
>   if b == 5:
>     if c == 3:
>       print a   # replace 'a' with 8
>
> which requires that it solve a bidirectional dataflow problem. It also
> knew that a < b implied b > 0, and a < b < c implies that c > 1. Some of
> the things predsimplify used to do are now implemented inside gvn's
> processEquality. Predsimplify also had an odd trick:
>
>   a = phi [0, a.i]
>   b = phi [10, x]
>   if a == 1:
>     # here we know that b == x. (x could still equal 10 though.)
>
> which I haven't seen replicated elsewhere in llvm.
>
> I went through many iterations of the implementation trying to minimize
> the compile time impact. So why predsimplify die?
>
> The first largest problem is that it rarely fired. It turns out that
> people don't write code which needs this sort of optimization in C very
> often, or at least in the llvm test suite. Instcombine already does a ton
> of optimization that overlapped with predsimplify in its demanded bits
> analysis, and that does a much better job of getting the cases that show up
> in real-world code. The cost of the analysis that predsimplify was doing,
> eagerly building up its value graph was too expensive to fit in llvm,
> especially when instcombine's demanded bits analysis tended to get the
> cases already. If your IR looks different, it might be worth adding to the
> pass pipeline for you.
>
> I think a future VRP (value range propagation) pass would need to be a
> superset of simplify-demanded-bits, so that we could remove the cost of
> running simplify-demanded-bits to help pay for the VRP. I've been thinking
> about the data structure we could use for that, but haven't worked out all
> the operations yet.
>
> Nick
>
> In particular:
>>
>>
>>>> define void @example_tweaked(i64* %x0) #0 {
>>>>   %1 = load i64* %x0, align 8
>>>>   %2 = and i64 %1, 3
>>>>   %3 = icmp eq i64 %2, 2
>>>>   br i1 %3, label %4, label %8
>>>> ; <label>:4                                       ; preds = %0
>>>>   call void @llvm.assume(i1 %3)
>>>>
>>>
>> We don't need the assume here. If we want to do predicate-based
>> simplification, we can just have <whatever> look at dominating branch
>> conditions, much like it looks at dominating assume calls. Worst case is
>> that it requires more fancy-ness in the assumption tracking infrastructure
>> to actually funnel queries through to different ways of saying "this value
>> is true at this point".
>>
>> However, that is the core essence of predsimplify. Historically, we have
>> found a really terrifying amount of complexity and compile time hit for
>> relatively small gains in terms of generated code quality. =/ Maybe we just
>> need to handle the "obvious" cases much like the very limited calls to
>> assumption tracker do, but I think this path needs to be trod with great
>> care.
>>
>>
>>>>   %5 = and i64 %1, -4                ; this should be optimized away
>>>>   %6 = or i64 %5, 2                  ; this should be optimized away
>>>>   %7 = add nsw i64 %6, 12
>>>>   store i64 %7, i64* %x0, align 8
>>>>   ret void
>>>> ; <label>:8                                       ; preds = %0
>>>>   tail call void @go_error() #2
>>>>   unreachable
>>>> }
>>>>
>>>> declare void @llvm.assume(i1)
>>>>
>>>>
>>>>
>>>>  Thanks for any ideas,
>>>> /Lars
>>>>
>>>> /***************************************************/
>>>>
>>>>  /*  The two LSB of x0 are 'tag bits'  */
>>>> /*  that we want to manipulate.       */
>>>> extern long x0;
>>>>
>>>>  void go_error(void) __attribute__ ((noreturn));
>>>>
>>>>  void example_not_optimized(void)
>>>> {
>>>>   if((x0 & 3) == 2) {
>>>>     // Here the tag bits are removed and added
>>>>     // with bitwise 'and' and 'or'.
>>>>     x0 = ((x0 & ~3) | 2) + 12;
>>>>   } else {
>>>>     go_error();
>>>>   }
>>>> }
>>>>
>>>>  /*
>>>> define void @example_not_optimized() #0 {
>>>>   %1 = load i64* @x0, align 8, !tbaa !1
>>>>   %2 = and i64 %1, 3
>>>>   %3 = icmp eq i64 %2, 2
>>>>   br i1 %3, label %4, label %8
>>>>
>>>>  ; <label>:4                                       ; preds = %0
>>>>   %5 = and i64 %1, -4                ; this should be optimized away
>>>>   %6 = or i64 %5, 2                  ; this should be optimized away
>>>>   %7 = add nsw i64 %6, 12
>>>>   store i64 %7, i64* @x0, align 8, !tbaa !1
>>>>   ret void
>>>>
>>>>  ; <label>:8                                       ; preds = %0
>>>>   tail call void @go_error() #2
>>>>   unreachable
>>>> }
>>>> */
>>>>
>>>>
>>>>  void example_optimized(void)
>>>> {
>>>>   if((x0 & 3) == 2) {
>>>>     // Here the tag bits are removed and added
>>>>     // with subtraction and addition.
>>>>     x0 = (x0 - (x0 & 3) + 2) + 12;
>>>>   } else {
>>>>     go_error();
>>>>   }
>>>> }
>>>>
>>>>  /*
>>>> define void @example_optimized() #0 {
>>>>   %1 = load i64* @x0, align 8, !tbaa !1
>>>>   %2 = and i64 %1, 3
>>>>   %3 = icmp eq i64 %2, 2
>>>>   br i1 %3, label %4, label %6
>>>>
>>>>  ; <label>:4                                       ; preds = %0
>>>>   %5 = add i64 %1, 12
>>>>   store i64 %5, i64* @x0, align 8, !tbaa !1
>>>>   ret void
>>>>
>>>>  ; <label>:6                                       ; preds = %0
>>>>   tail call void @go_error() #2
>>>>   unreachable
>>>> }
>>>>
>>>>   */
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing listLLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.eduhttp://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150116/6d8e2d0d/attachment.html>