[LLVMbugs] [Bug 866] NEW: LSR should use expressions computed inside the loop to materialize values computed outside the loop

bugzilla-daemon at cs.uiuc.edu bugzilla-daemon at cs.uiuc.edu
Wed Aug 2 23:41:16 PDT 2006


http://llvm.org/bugs/show_bug.cgi?id=866

           Summary: LSR should use expressions computed inside the loop to
                    materialize values computed outside the loop
           Product: libraries
           Version: 1.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Scalar Optimizations
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: sabre at nondot.org


This is a special case of Bug 863. Consider this:

int %foo(int* %a, int %t, int %C) {
entry:
        br label %cond_true

cond_true:              ; preds = %cond_true, %entry
        %x.0.0 = phi int [ 0, %entry ], [ %tmp9, %cond_true ]           ; <int> [#uses=3]
        %t_addr.0.0 = phi int [ %t, %entry ], [ %tmp7, %cond_true ]             ; <int> [#uses=1]
        %tmp2 = shl int %x.0.0, ubyte 2               ; <int*> [#uses=1]
        %tmp3 = cast int %tmp2 to int         ; <int> [#uses=1]
        %tmp5 = add int %t_addr.0.0, %t_addr.0.0             ; <int> [#uses=1]
        %tmp7 = add int %tmp5, %tmp3            ; <int> [#uses=2]
        %tmp9 = add int %x.0.0, 1               ; <int> [#uses=2]
        %tmp = setgt int %C, 39              ; <bool> [#uses=1]
        br bool %tmp, label %bb12, label %cond_true

bb12:           ; preds = %cond_true
        ret int %tmp7
}

There are two uses of tmp7: the ret and the PHI.  Because of the phi, the value of tmp7 has to be 
computed inside the loop for the next iteration.  However, LSR doesn't keep track of this and ends up 
recomputing the value outside the loop, producing this code:

int %foo(int* %a, int %t, int %C) {
entry:
        br label %cond_true

cond_true:              ; preds = %cond_true, %entry
        %iv. = phi uint [ 0, %entry ], [ %iv..inc, %cond_true ]         ; <uint> [#uses=2]
        %t_addr.0.0 = phi int [ %t, %entry ], [ %tmp.1, %cond_true ]            ; <int> [#uses=1]
        %t_addr.0.0 = cast int %t_addr.0.0 to uint              ; <uint> [#uses=2]
        %tmp = setgt int %C, 39         ; <bool> [#uses=1]
        %iv..inc = add uint %iv., 4             ; <uint> [#uses=2]
        %tmp. = mul uint %t_addr.0.0, 2         ; <uint> [#uses=1]
        %tmp.1 = add uint %iv., %tmp.           ; <uint> [#uses=1]
        %tmp.1 = cast uint %tmp.1 to int                ; <int> [#uses=1]
        br bool %tmp, label %bb12, label %cond_true

bb12:           ; preds = %cond_true
***    %tmp.2 = mul uint %t_addr.0.0, 2                ; <uint> [#uses=1]
***    %tmp.3 = add uint %iv..inc, %tmp.2              ; <uint> [#uses=1]
***    %tmp.4 = add uint %tmp.3, 4294967292            ; <uint> [#uses=1]
***    %tmp.4 = cast uint %tmp.4 to int                ; <int> [#uses=1]
        ret int %tmp.4
}

Note that LSR is tricky: because it is going to reinsert the entire expression it uses the post-
incremented ivar value to avoid coallescing issues.  It would have been much better for it to just realize 
that it could "ret int %tmp %tmp.1" instead of all the ***'d code.  Because it uses the the preinc value 
inside the loop and the postinc value outside the loop, there is no way normal CSE will catch it, and we 
don't.  From this, we generate this nasty code:

_foo:
        li r2, 0
LBB1_1: ;cond_true
        mr r3, r4
        slwi r4, r3, 1
        cmpwi cr0, r5, 40
        add r4, r2, r4
        addi r2, r2, 4
        blt cr0, LBB1_1 ;cond_true
LBB1_2: ;bb12
***    slwi r3, r3, 1
***    add r2, r2, r3
***    addi r3, r2, -4
        blr

doh. mr r3, r2 would have been much faster, and would have probably eliminated a copy in the loop.

-Chris



------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.



More information about the llvm-bugs mailing list