[LLVMbugs] [Bug 9490] New: llc requires a 64-bit induction var to optimize "fourinarow"

Tue Mar 15 16:58:47 PDT 2011

http://llvm.org/bugs/show_bug.cgi?id=9490

           Summary: llc requires a 64-bit induction var to optimize
                    "fourinarow"
           Product: libraries
           Version: trunk
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Common Code Generator Code
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: atrick at apple.com
                CC: llvmbugs at cs.uiuc.edu

The attached patch by Arnaud Allard de Grandmaison prevents induction
variables from being widened to 64 bits on a 32-bit architecture.

This results in a slowdown for some ARM benchmarks. For example, we
have a 25% regression on:
MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow

The culprit is the function "value". The following induction variable
is no longer promoted to 64 bits once the "native-type" patch is applied.

--- trunk
INDVARS: New CanIV:   %indvar = phi i64 [ %indvar.next, %B33 ], [ 0, %B15 ]
INDVARS: Rewriting loop exit condition to:
      LHS:  %indvar = phi i64 [ %indvar.next, %B33 ], [ 0, %B15 ]
       op:    !=
      RHS:    21
INDVARS: Rewrote IV '{0,+,1}<nuw><nsw><%B16>'   %0 = zext i32 %i.0 to i64
   into =   %V24 = phi i64 [ %indvar.next, %B33 ], [ 0, %B15 ]
INDVARS: Rewrote IV '{0,+,1}<nuw><nsw><%B16>'   %0 = zext i32 %i.0 to i64
   into =   %V20 = phi i64 [ %indvar.next, %B33 ], [ 0, %B15 ]

--- native-type patch
INDVARS: New CanIV:   %i.0 = phi i32 [ 0, %B15 ], [ %V34, %B33 ]
INDVARS: Rewriting loop exit condition to:
      LHS:  %i.0 = phi i32 [ 0, %B15 ], [ %V34, %B33 ]
       op:    !=
      RHS:    21
INDVARS: Rewrote IV '{0,+,1}<nuw><nsw><%B16>'   %0 = zext i32 %i.0 to i64
   into =   %V24 = phi i64 [ %indvar.next, %B33 ], [ 0, %B15 ]
INDVARS: Rewrote IV '{0,+,1}<nuw><nsw><%B16>'   %0 = zext i32 %i.0 to i64
   into =   %V20 = phi i64 [ %indvar.next, %B33 ], [ 0, %B15 ]
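
To make the difference between the two logs concrete, here is a rough C
sketch of the loop shape involved. It is NOT the source of "value" from
fourinarow; the function names and the loop body are hypothetical and only
assume a 32-bit counter whose value is zero-extended to 64 bits on every
iteration (the "zext i32 %i.0 to i64" in the logs). The second function is
the same loop with the counter widened by hand, which is roughly what trunk
does when it creates the i64 canonical IV.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in, NOT the benchmark's code: a 32-bit counter
   whose value is zero-extended to 64 bits on every iteration. */
uint64_t low_bits_narrow_iv(uint32_t n)
{
    assert(n <= 64);                          /* keep the shift amount below 64 */
    uint64_t mask = 0;
    for (uint32_t i = 0; i != n; ++i)
        mask |= (uint64_t)1 << (uint64_t)i;   /* zext i32 -> i64 */
    return mask;
}

/* The same loop with the counter widened by hand: the IV is 64-bit,
   so no per-iteration extension is needed. */
uint64_t low_bits_wide_iv(uint32_t n)
{
    assert(n <= 64);
    uint64_t mask = 0;
    for (uint64_t i = 0; i != (uint64_t)n; ++i)
        mask |= (uint64_t)1 << i;
    return mask;
}
```

Both functions return the same value for any n up to 64; only the type of
the counter differs. Whether this particular shape reproduces the ARM
regression is untested.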

---
I'm still trying to understand why the code we generate is slower
when we don't promote the IV to 64 bits. My initial guess is that LSR
does not handle loops as well without a canonical IV, but I need to
investigate further.
