[LLVMdev] 'loop invariant code motion' and 'Reassociate Expression'

Shuxin Yang shuxin.llvm at gmail.com
Tue Apr 23 10:37:53 PDT 2013


As far as I can understand of the code, the Reassociate tries to achieve 
this result by its "ranking" mechanism.

If it dose not, it is not hard to achieve this result, just restructure 
the expression in a way such that
the earlier definition of the sub-expression is permute earlier in the 
resulting expr.

e.g.
    outer-loop1
        x=
        outer-loop2
           y =

           inner-loop3
               z =
               t = z * y * z

     the RHS is better restructured into "x * y * z", such that x * y 
can moved as early as possible.
Restructuring expr this expr is also able to reduce the critical path 
even if no sub-expr is
moved out of loop.

    By no means can "canonicalization" can achieve this result, because 
as this transformation need
some context. (It seems the name "cannonicalization" is badly abused in 
LLVM; it could refers
to this kind of xform. IMO, cannonicalization is sort of blind but 
consistent restructuring)

    BTW, I did similar stuff before in other compiler, I didn't see any %.

On 4/23/13 6:48 AM, Jeroen Dobbelaere wrote:
> Hi,
>
> I am investigating a performance degradation between llvm-3.1 and llvm-3.2
> (Note: current top-of-tree shows a similar degradation)
>
> One issue I see is the following:
> - 'loop invariant code motion' seems to be depending on the result of the 'reassociate expression' pass:
>
> In the samples below I observer the following behavior:
>
> Both start with the same expression:
>    %add = add i32 %total.0, %next.0
>    %add1 = add i32 %add, 1
>
> -LLVM-3.1 converts this into:
> --after Reassociate expressions:
>    %add = add i32 %next.0, 1
>    %add1 = add i32 %add, %total.0
> -- after Canonicalize natural loops:
>    %add = add i32 %next.0.ph, 1
>    %add1 = add i32 %add, %total.0
> -- and during 'loop invariant code motion':
> LICM hoisting to while.cond.outer:   %add = add i32 %next.0.ph, 1
>
> - LLVM-3.2 converts it into:  (Notice the reversion of %total.0 and %next.0)
> --after Reassociate expressions:
>    %add = add i32 %total.0, 1
>    %add1 = add i32 %add, %next.0
> -- after Canonicalize natural loops:
>    %add = add i32 %total.0, 1
>    %add1 = add i32 %add, %next.0.ph
> -- and during 'loop invariant code motion':
> -> %next.0.ph and '1' are loop invariant, but because those are spreaded, they are not moved any more.
>
>
> Is there a way that 'Reassociate expressions' of 'LICM' can rearrange the expressions (likely after 'Canonicalize natural loops'),
> and group loop-invariant parts of the expression ?
> Or is there another pass that I can/should enable to have this effect ?
>
> Greetings,
>   
> Jeroen Dobbelaere
>
> ---
>
> llvm-3.1
>
>
> while.cond:                                       ; preds = %sw.bb, %sw.bb13, %sw.bb18, %sw.bb23, %if.end, %entry
>    %next.0 = phi i32 [ 0, %entry ], [ %next.0, %if.end ], [ 8, %sw.bb23 ], [ 8, %sw.bb18 ], [ 8, %sw.bb13 ], [ 4, %sw.bb ]
>    %total.0 = phi i32 [ 0, %entry ], [ %total.1, %if.end ], [ %total.1, %sw.bb23 ], [ %total.1, %sw.bb18 ], [ %total.1, %sw.bb13 ], [ %total.1, %sw.bb ]
>    %buf.0 = phi i8* [ null, %entry ], [ %buf.0, %if.end ], [ %4, %sw.bb23 ], [ %3, %sw.bb18 ], [ %2, %sw.bb13 ], [ %1, %sw.bb ]
>    %seed.addr.0 = phi i16 [ %seed, %entry ], [ %inc9, %if.end ], [ %inc9, %sw.bb23 ], [ %inc9, %sw.bb18 ], [ %inc9, %sw.bb13 ], [ %inc9, %sw.bb ]
>    %add = add i32 %total.0, %next.0
>    %add1 = add i32 %add, 1
>    %cmp = icmp ult i32 %add1, %dec
>    br i1 %cmp, label %while.body, label %while.cond29
>
> *** IR Dump After Reassociate expressions ***
> define void @foo(i32 %size, i16 signext %seed, i8* nocapture %p) nounwind {
> entry:
>    %dec = add i32 %size, -1
>    br label %while.cond
>
> while.cond:                                       ; preds = %sw.bb, %sw.bb13, %sw.bb18, %sw.bb23, %if.end, %entry
>    %next.0 = phi i32 [ 0, %entry ], [ %next.0, %if.end ], [ 8, %sw.bb23 ], [ 8, %sw.bb18 ], [ 8, %sw.bb13 ], [ 4, %sw.bb ]
>    %total.0 = phi i32 [ 0, %entry ], [ %total.1, %if.end ], [ %total.1, %sw.bb23 ], [ %total.1, %sw.bb18 ], [ %total.1, %sw.bb13 ], [ %total.1, %sw.bb ]
>    %buf.0 = phi i8* [ null, %entry ], [ %buf.0, %if.end ], [ %4, %sw.bb23 ], [ %3, %sw.bb18 ], [ %2, %sw.bb13 ], [ %1, %sw.bb ]
>    %seed.addr.0 = phi i16 [ %seed, %entry ], [ %inc9, %if.end ], [ %inc9, %sw.bb23 ], [ %inc9, %sw.bb18 ], [ %inc9, %sw.bb13 ], [ %inc9, %sw.bb ]
>    %add = add i32 %next.0, 1
>    %add1 = add i32 %add, %total.0
>    %cmp = icmp ult i32 %add1, %dec
>    br i1 %cmp, label %while.body, label %while.cond29
>
>
> *** IR Dump After Canonicalize natural loops ***
> while.cond:                                       ; preds = %while.cond.outer, %if.end
>    %total.0 = phi i32 [ %total.1, %if.end ], [ %total.0.ph, %while.cond.outer ]
>    %seed.addr.0 = phi i16 [ %inc9, %if.end ], [ %seed.addr.0.ph, %while.cond.outer ]
>    %add = add i32 %next.0.ph, 1
>    %add1 = add i32 %add, %total.0
>    %cmp = icmp ult i32 %add1, %dec
>    br i1 %cmp, label %while.body, label %while.cond29.preheader
> [...]
>
> for.cond.for.end_crit_edge:                       ; preds = %for.body
>    %split = phi i32 [ %inc, %for.body ]
>    br label %for.end
> LICM hoisting to while.cond.outer:   %add = add i32 %next.0.ph, 1
> LICM hoisting to while.cond.outer:   %cmp2 = icmp eq i32 %next.0.ph, 0
> LICM hoisting to while.cond.outer:   %cmp343 = icmp ult i32 0, %next.0.ph
> LICM hoisting to while.cond.outer:   %add740 = or i32 %next.0.ph, 1
> *** IR Dump After Loop Invariant Code Motion ***
> while.cond:                                       ; preds = %while.cond.outer, %if.end
>    %total.0 = phi i32 [ %total.1, %if.end ], [ %total.0.ph, %while.cond.outer ]
>    %seed.addr.0 = phi i16 [ %inc9, %if.end ], [ %seed.addr.0.ph, %while.cond.outer ]
>    %add1 = add i32 %add, %total.0
>    %cmp = icmp ult i32 %add1, %dec
>    br i1 %cmp, label %while.body, label %while.cond29.preheader
> [...]
>
> -------
>
> llvm-3.2
>
>
> while.cond:                                       ; preds = %sw.bb, %sw.bb13, %sw.bb18, %sw.bb23, %if.end, %entry
>    %seed.addr.0 = phi i16 [ %seed, %entry ], [ %inc9, %if.end ], [ %inc9, %sw.bb23 ], [ %inc9, %sw.bb18 ], [ %inc9, %sw.bb13 ], [ %inc9, %sw.bb ]
>    %total.0 = phi i32 [ 0, %entry ], [ %total.1, %if.end ], [ %total.1, %sw.bb23 ], [ %total.1, %sw.bb18 ], [ %total.1, %sw.bb13 ], [ %total.1, %sw.bb ]
>    %next.0 = phi i32 [ 0, %entry ], [ %next.0, %if.end ], [ 8, %sw.bb23 ], [ 8, %sw.bb18 ], [ 8, %sw.bb13 ], [ 4, %sw.bb ]
>    %buf.0 = phi i8* [ null, %entry ], [ %buf.0, %if.end ], [ %4, %sw.bb23 ], [ %3, %sw.bb18 ], [ %2, %sw.bb13 ], [ %1, %sw.bb ]
>    %add = add i32 %total.0, %next.0
>    %add1 = add i32 %add, 1
>    %cmp = icmp ult i32 %add1, %dec
>    br i1 %cmp, label %while.body, label %while.cond29
>
> *** IR Dump After Reassociate expressions ***
> define void @foo(i32 %size, i16 signext %seed, i8* nocapture %p) nounwind {
> entry:
>    %dec = add i32 %size, -1
>    br label %while.cond
>
> while.cond:                                       ; preds = %sw.bb, %sw.bb13, %sw.bb18, %sw.bb23, %if.end, %entry
>    %seed.addr.0 = phi i16 [ %seed, %entry ], [ %inc9, %if.end ], [ %inc9, %sw.bb23 ], [ %inc9, %sw.bb18 ], [ %inc9, %sw.bb13 ], [ %inc9, %sw.bb ]
>    %total.0 = phi i32 [ 0, %entry ], [ %total.1, %if.end ], [ %total.1, %sw.bb23 ], [ %total.1, %sw.bb18 ], [ %total.1, %sw.bb13 ], [ %total.1, %sw.bb ]
>    %next.0 = phi i32 [ 0, %entry ], [ %next.0, %if.end ], [ 8, %sw.bb23 ], [ 8, %sw.bb18 ], [ 8, %sw.bb13 ], [ 4, %sw.bb ]
>    %buf.0 = phi i8* [ null, %entry ], [ %buf.0, %if.end ], [ %4, %sw.bb23 ], [ %3, %sw.bb18 ], [ %2, %sw.bb13 ], [ %1, %sw.bb ]
>    %add = add i32 %total.0, 1
>    %add1 = add i32 %add, %next.0
>    %cmp = icmp ult i32 %add1, %dec
>    br i1 %cmp, label %while.body, label %while.cond29
>
>
> *** IR Dump After Canonicalize natural loops ***
> while.cond:                                       ; preds = %while.cond.outer, %if.end
>    %seed.addr.0 = phi i16 [ %inc9, %if.end ], [ %seed.addr.0.ph, %while.cond.outer ]
>    %total.0 = phi i32 [ %total.1, %if.end ], [ %total.0.ph, %while.cond.outer ]
>    %add = add i32 %total.0, 1
>    %add1 = add i32 %add, %next.0.ph
>    %cmp = icmp ult i32 %add1, %dec
>    br i1 %cmp, label %while.body, label %while.cond29.preheader
> [.....]
> for.cond.for.end_crit_edge:                       ; preds = %for.body
>    %split = phi i32 [ %inc, %for.body ]
>    br label %for.end
> LICM hoisting to while.cond.outer:   %cmp2 = icmp eq i32 %next.0.ph, 0
> LICM hoisting to while.cond.outer:   %cmp366 = icmp ult i32 0, %next.0.ph
> LICM hoisting to while.cond.outer:   %add763 = or i32 %next.0.ph, 1
> *** IR Dump After Loop Invariant Code Motion ***
> while.cond:                                       ; preds = %while.cond.outer, %if.end
>    %seed.addr.0 = phi i16 [ %inc9, %if.end ], [ %seed.addr.0.ph, %while.cond.outer ]
>    %total.0 = phi i32 [ %total.1, %if.end ], [ %total.0.ph, %while.cond.outer ]
>    %add = add i32 %total.0, 1
>    %add1 = add i32 %add, %next.0.ph
>    %cmp = icmp ult i32 %add1, %dec
>    br i1 %cmp, label %while.body, label %while.cond29.preheader
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




More information about the llvm-dev mailing list