[LLVMdev] generate llvm.assume calls in GVN?

Nick Lewycky nlewycky at google.com
Thu Jan 15 14:42:02 PST 2015


On 15 January 2015 at 10:49, Chandler Carruth <chandlerc at google.com> wrote:

> On Thu, Jan 15, 2015 at 10:30 AM, Sanjay Patel <spatel at rotateright.com>
> wrote:
>
>> Would it be wrong to generate the llvm.assume IR suggested below? in GVN?
>>
>
> I think so... Because:
>
> One very small tweak you could make would be to add an llvm.assume inside
>>> the if case of the if condition.  (Yes, that is as silly as it sounds.)  I
>>> tested that, and it did optimize as expected.  It's essentially working
>>> around a deficiency in the optimizer around path constraints.
>>>
>>
> Once upon a time, there was an optimization for this called predsimplify.
> I believe Nick can tell you a long, glorious, tragic tale about its life
> and ultimate death.
>

Okay, gather 'round the fire. ;-)

Predsimplify was the first optimization pass I tried to write, to fix
PR807. The basis was that it would propagate properties (icmp's between two
SSA values) down paths in the domtree. In its various rewrites it knew all
sorts of tricks. For instance it could do:

  a = b + c
  if b == 5:
    if c == 3:
      print a   # replace 'a' with 8

which requires that it solve a bidirectional dataflow problem. It also knew
that a < b implied b > 0, and a < b < c implies that c > 1. Some of the
things predsimplify used to do are now implemented inside gvn's
processEquality. Predsimplify also had an odd trick:

  a = phi [0, a.i]
  b = phi [10, x]
  if a == 1:
    # here we know that b == x. (x could still equal 10 though.)

which I haven't seen replicated elsewhere in llvm.

I went through many iterations of the implementation trying to minimize the
compile time impact. So why predsimplify die?

The first largest problem is that it rarely fired. It turns out that people
don't write code which needs this sort of optimization in C very often, or
at least in the llvm test suite. Instcombine already does a ton of
optimization that overlapped with predsimplify in its demanded bits
analysis, and that does a much better job of getting the cases that show up
in real-world code. The cost of the analysis that predsimplify was doing,
eagerly building up its value graph was too expensive to fit in llvm,
especially when instcombine's demanded bits analysis tended to get the
cases already. If your IR looks different, it might be worth adding to the
pass pipeline for you.

I think a future VRP (value range propagation) pass would need to be a
superset of simplify-demanded-bits, so that we could remove the cost of
running simplify-demanded-bits to help pay for the VRP. I've been thinking
about the data structure we could use for that, but haven't worked out all
the operations yet.

Nick

In particular:
>
>
>>> define void @example_tweaked(i64* %x0) #0 {
>>>   %1 = load i64* %x0, align 8
>>>   %2 = and i64 %1, 3
>>>   %3 = icmp eq i64 %2, 2
>>>   br i1 %3, label %4, label %8
>>> ; <label>:4                                       ; preds = %0
>>>   call void @llvm.assume(i1 %3)
>>>
>>
> We don't need the assume here. If we want to do predicate-based
> simplification, we can just have <whatever> look at dominating branch
> conditions, much like it looks at dominating assume calls. Worst case is
> that it requires more fancy-ness in the assumption tracking infrastructure
> to actually funnel queries through to different ways of saying "this value
> is true at this point".
>
> However, that is the core essence of predsimplify. Historically, we have
> found a really terrifying amount of complexity and compile time hit for
> relatively small gains in terms of generated code quality. =/ Maybe we just
> need to handle the "obvious" cases much like the very limited calls to
> assumption tracker do, but I think this path needs to be trod with great
> care.
>
>
>>>   %5 = and i64 %1, -4                ; this should be optimized away
>>>   %6 = or i64 %5, 2                  ; this should be optimized away
>>>   %7 = add nsw i64 %6, 12
>>>   store i64 %7, i64* %x0, align 8
>>>   ret void
>>> ; <label>:8                                       ; preds = %0
>>>   tail call void @go_error() #2
>>>   unreachable
>>> }
>>>
>>> declare void @llvm.assume(i1)
>>>
>>>
>>>
>>>  Thanks for any ideas,
>>> /Lars
>>>
>>> /***************************************************/
>>>
>>>  /*  The two LSB of x0 are 'tag bits'  */
>>> /*  that we want to manipulate.       */
>>> extern long x0;
>>>
>>>  void go_error(void) __attribute__ ((noreturn));
>>>
>>>  void example_not_optimized(void)
>>> {
>>>   if((x0 & 3) == 2) {
>>>     // Here the tag bits are removed and added
>>>     // with bitwise 'and' and 'or'.
>>>     x0 = ((x0 & ~3) | 2) + 12;
>>>   } else {
>>>     go_error();
>>>   }
>>> }
>>>
>>>  /*
>>> define void @example_not_optimized() #0 {
>>>   %1 = load i64* @x0, align 8, !tbaa !1
>>>   %2 = and i64 %1, 3
>>>   %3 = icmp eq i64 %2, 2
>>>   br i1 %3, label %4, label %8
>>>
>>>  ; <label>:4                                       ; preds = %0
>>>   %5 = and i64 %1, -4                ; this should be optimized away
>>>   %6 = or i64 %5, 2                  ; this should be optimized away
>>>   %7 = add nsw i64 %6, 12
>>>   store i64 %7, i64* @x0, align 8, !tbaa !1
>>>   ret void
>>>
>>>  ; <label>:8                                       ; preds = %0
>>>   tail call void @go_error() #2
>>>   unreachable
>>> }
>>> */
>>>
>>>
>>>  void example_optimized(void)
>>> {
>>>   if((x0 & 3) == 2) {
>>>     // Here the tag bits are removed and added
>>>     // with subtraction and addition.
>>>     x0 = (x0 - (x0 & 3) + 2) + 12;
>>>   } else {
>>>     go_error();
>>>   }
>>> }
>>>
>>>  /*
>>> define void @example_optimized() #0 {
>>>   %1 = load i64* @x0, align 8, !tbaa !1
>>>   %2 = and i64 %1, 3
>>>   %3 = icmp eq i64 %2, 2
>>>   br i1 %3, label %4, label %6
>>>
>>>  ; <label>:4                                       ; preds = %0
>>>   %5 = add i64 %1, 12
>>>   store i64 %5, i64* @x0, align 8, !tbaa !1
>>>   ret void
>>>
>>>  ; <label>:6                                       ; preds = %0
>>>   tail call void @go_error() #2
>>>   unreachable
>>> }
>>>
>>>   */
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing listLLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.eduhttp://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150115/6d396d58/attachment.html>


More information about the llvm-dev mailing list