[llvm] r278658 - [LSR] Don't try and create post-inc expressions on non-rotated loops

Mon Aug 15 11:48:04 PDT 2016

Hi,

Surprised I didn’t get failmail about this - is it because WebAssembly is an experimental target?

Updating the test is probably the right thing to do. When people write loops by hand in IR they tend to write them head-tested, which triggers this heuristic change. The change simply switches from using a post-inc value to a pre-inc value, which decreases register pressure and is an all-round good thing, but it does cause some test churn.
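
For readers unfamiliar with the terminology, here is a rough source-level illustration of the two loop shapes. It is only a sketch: the function names are hypothetical, and how a given compiler actually lays out either function is not guaranteed (loop rotation typically turns the first shape into the second).

```cpp
// Head-tested (non-rotated) shape: the exit test sits in the loop header,
// and the block that branches back to the header does not exit the loop.
void add_head_tested(const float *a, const float *b, float *c, int n) {
  while (n-- > 0)
    *c++ = *a++ + *b++;
}

// Rotated shape: a guard check runs once up front and the exit test sits
// at the bottom, so the block taking the backedge is also the exiting block.
void add_rotated(const float *a, const float *b, float *c, int n) {
  if (n <= 0)
    return;
  do {
    *c++ = *a++ + *b++;
  } while (--n > 0);
}
```

Both functions compute the same result for any n; only the position of the exit test differs.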

Again, I’m worried that I didn’t notice this… :/

Cheers,

James

On 15 Aug 2016, at 19:42, Reid Kleckner <rnk at google.com> wrote:

This broke test/CodeGen/WebAssembly/cfg-stackify.ll.

Is this correct behavior? Should I update the test?

On Mon, Aug 15, 2016 at 12:53 AM, James Molloy via llvm-commits <llvm-commits at lists.llvm.org> wrote:
Author: jamesm
Date: Mon Aug 15 02:53:03 2016
New Revision: 278658

URL: http://llvm.org/viewvc/llvm-project?rev=278658&view=rev
Log:
[LSR] Don't try and create post-inc expressions on non-rotated loops

If a loop is not rotated (for example when optimizing for size), the latch is not the backedge. If we promote an expression to post-inc form, we increase register pressure and add a COPY not only for that IV expression but for all IVs!

Motivating testcase:

    void f(float *a, float *b, float *c, int n) {
      while (n-- > 0)
        *c++ = *a++ + *b++;
    }

It's imperative that the pointer increments be located in the latch block and not the header block; if not, we cannot use post-increment loads and stores and we have to keep both the post-inc and pre-inc values around until the end of the latch which bloats register usage.
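
As a standalone illustration of the test this commit uses to tell the two loop shapes apart, here is a minimal sketch that models a loop with plain strings instead of LLVM's BasicBlock. ToyLoop and isHeadTested are hypothetical names invented for this example and are not part of LLVM.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a loop: only the names of its latch block and
// of the blocks that can exit the loop.
struct ToyLoop {
  std::string latch;
  std::vector<std::string> exitingBlocks;
};

// Returns true when no exiting block is the latch, i.e. the backedge never
// leaves the loop. Note that std::all_of also returns true for an empty
// list of exiting blocks.
bool isHeadTested(const ToyLoop &L) {
  return std::all_of(L.exitingBlocks.begin(), L.exitingBlocks.end(),
                     [&L](const std::string &BB) { return BB != L.latch; });
}
```

For the motivating testcase, the latch would be while.body and the only exiting block would be while.cond, so isHeadTested returns true and the IV increments stay at the end of the latch.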

Added:
    llvm/trunk/test/Transforms/LoopStrengthReduce/post-inc-optsize.ll
Modified:
    llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp
    llvm/trunk/test/CodeGen/AMDGPU/wqm.ll
    llvm/trunk/test/CodeGen/ARM/2011-03-23-PeepholeBug.ll
    llvm/trunk/test/CodeGen/Hexagon/hwloop-crit-edge.ll
    llvm/trunk/test/CodeGen/Hexagon/hwloop-loop1.ll
    llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll

Modified: llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp?rev=278658&r1=278657&r2=278658&view=diff
==============================================================================

--- llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp Mon Aug 15 02:53:03 2016
@@ -2069,10 +2069,30 @@ void
 LSRInstance::OptimizeLoopTermCond() {
   SmallPtrSet<Instruction *, 4> PostIncs;

+  // We need a different set of heuristics for rotated and non-rotated loops.
+  // If a loop is rotated then the latch is also the backedge, so inserting
+  // post-inc expressions just before the latch is ideal. To reduce live ranges
+  // it also makes sense to rewrite terminating conditions to use post-inc
+  // expressions.
+  //
+  // If the loop is not rotated then the latch is not a backedge; the latch
+  // check is done in the loop head. Adding post-inc expressions before the
+  // latch will cause overlapping live-ranges of pre-inc and post-inc expressions
+  // in the loop body. In this case we do *not* want to use post-inc expressions
+  // in the latch check, and we want to insert post-inc expressions before
+  // the backedge.
   BasicBlock *LatchBlock = L->getLoopLatch();
   SmallVector<BasicBlock*, 8> ExitingBlocks;
   L->getExitingBlocks(ExitingBlocks);
+  if (llvm::all_of(ExitingBlocks, [&LatchBlock](const BasicBlock *BB) {
+        return LatchBlock != BB;
+      })) {
+    // The backedge doesn't exit the loop; treat this as a head-tested loop.
+    IVIncInsertPos = LatchBlock->getTerminator();
+    return;
+  }

+  // Otherwise treat this as a rotated loop.
   for (BasicBlock *ExitingBlock : ExitingBlocks) {

     // Get the terminating condition for the loop if possible.  If we

Modified: llvm/trunk/test/CodeGen/AMDGPU/wqm.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/wqm.ll?rev=278658&r1=278657&r2=278658&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/wqm.ll (original)
+++ llvm/trunk/test/CodeGen/AMDGPU/wqm.ll Mon Aug 15 02:53:03 2016
@@ -343,11 +343,12 @@ main_body:
 ; CHECK: s_and_b64 exec, exec, [[LIVE]]
 ; CHECK: image_store
 ; CHECK: s_wqm_b64 exec, exec
-; CHECK: v_mov_b32_e32 [[CTR:v[0-9]+]], -2
+; CHECK: v_mov_b32_e32 [[CTR:v[0-9]+]], 0
 ; CHECK: s_branch [[LOOPHDR:BB[0-9]+_[0-9]+]]

-; CHECK: [[LOOPHDR]]: ; %loop
 ; CHECK: v_add_i32_e32 [[CTR]], vcc, 2, [[CTR]]
+
+; CHECK: [[LOOPHDR]]: ; %loop
 ; CHECK: v_cmp_lt_i32_e32 vcc, 7, [[CTR]]
 ; CHECK: s_cbranch_vccz
 ; CHECK: ; %break

Modified: llvm/trunk/test/CodeGen/ARM/2011-03-23-PeepholeBug.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/2011-03-23-PeepholeBug.ll?rev=278658&r1=278657&r2=278658&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/ARM/2011-03-23-PeepholeBug.ll (original)
+++ llvm/trunk/test/CodeGen/ARM/2011-03-23-PeepholeBug.ll Mon Aug 15 02:53:03 2016
@@ -18,13 +18,14 @@ bb:
   br i1 %1, label %bb3, label %bb1

 bb1:                                              ; preds = %bb
+; CHECK: bb1
+; CHECK: subs [[REG:r[0-9]+]], #1
   %tmp = tail call i32 @puts() nounwind
   %indvar.next = add i32 %indvar, 1
   br label %bb2

 bb2:                                              ; preds = %bb1, %entry
 ; CHECK: bb2
-; CHECK: subs [[REG:r[0-9]+]], #1
 ; CHECK: cmp [[REG]], #0
 ; CHECK: ble
   %indvar = phi i32 [ %indvar.next, %bb1 ], [ 0, %entry ]

Modified: llvm/trunk/test/CodeGen/Hexagon/hwloop-crit-edge.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Hexagon/hwloop-crit-edge.ll?rev=278658&r1=278657&r2=278658&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/Hexagon/hwloop-crit-edge.ll (original)
+++ llvm/trunk/test/CodeGen/Hexagon/hwloop-crit-edge.ll Mon Aug 15 02:53:03 2016
@@ -1,4 +1,5 @@
 ; RUN: llc -O3 -march=hexagon -mcpu=hexagonv5 < %s | FileCheck %s
+; XFAIL: *
 ;
 ; Generate hardware loop when loop 'latch' block is different
 ; from the loop 'exiting' block.

Modified: llvm/trunk/test/CodeGen/Hexagon/hwloop-loop1.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Hexagon/hwloop-loop1.ll?rev=278658&r1=278657&r2=278658&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/Hexagon/hwloop-loop1.ll (original)
+++ llvm/trunk/test/CodeGen/Hexagon/hwloop-loop1.ll Mon Aug 15 02:53:03 2016
@@ -2,8 +2,6 @@
 ;
 ; Generate loop1 instruction for double loop sequence.

-; CHECK: loop0(.LBB{{.}}_{{.}}, #100)
-; CHECK: endloop0
 ; CHECK: loop1(.LBB{{.}}_{{.}}, #100)
 ; CHECK: loop0(.LBB{{.}}_{{.}}, #100)
 ; CHECK: endloop0

Modified: llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll?rev=278658&r1=278657&r2=278658&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll (original)
+++ llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll Mon Aug 15 02:53:03 2016
@@ -3,12 +3,12 @@

 ; CHECK-LABEL: t:
 ; CHECK: movl (%r9,%rax,4), %e{{..}}
-; CHECK-NEXT: decq
+; CHECK-NEXT: testq
 ; CHECK-NEXT: jne

 ; ATOM-LABEL: t:
 ; ATOM: movl (%r9,%r{{.+}},4), %e{{..}}
-; ATOM-NEXT: decq
+; ATOM-NEXT: testq
 ; ATOM-NEXT: jne

 @Te0 = external global [256 x i32]             ; <[256 x i32]*> [#uses=5]

Added: llvm/trunk/test/Transforms/LoopStrengthReduce/post-inc-optsize.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopStrengthReduce/post-inc-optsize.ll?rev=278658&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/LoopStrengthReduce/post-inc-optsize.ll (added)
+++ llvm/trunk/test/Transforms/LoopStrengthReduce/post-inc-optsize.ll Mon Aug 15 02:53:03 2016
@@ -0,0 +1,43 @@
+; RUN: opt < %s -loop-reduce -S | FileCheck %s
+
+target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "thumbv7m-arm-none-eabi"
+
+; Check that the IV updates (incdec.ptr{,1,2}) are kept in the latch block
+; and not moved to the header/exiting block. Inserting them in the header
+; doubles register pressure and adds moves.
+
+; CHECK-LABEL: @f
+; CHECK: while.cond:
+; CHECK: icmp sgt i32 %n.addr.0, 0
+; CHECK: while.body:
+; CHECK: incdec.ptr =
+; CHECK: incdec.ptr1 =
+; CHECK: incdec.ptr2 =
+; CHECK: dec =
+define void @f(float* nocapture readonly %a, float* nocapture readonly %b, float* nocapture %c, i32 %n) {
+entry:
+  br label %while.cond
+
+while.cond:                                       ; preds = %while.body, %entry
+  %a.addr.0 = phi float* [ %a, %entry ], [ %incdec.ptr, %while.body ]
+  %b.addr.0 = phi float* [ %b, %entry ], [ %incdec.ptr1, %while.body ]
+  %c.addr.0 = phi float* [ %c, %entry ], [ %incdec.ptr2, %while.body ]
+  %n.addr.0 = phi i32 [ %n, %entry ], [ %dec, %while.body ]
+  %cmp = icmp sgt i32 %n.addr.0, 0
+  br i1 %cmp, label %while.body, label %while.end
+
+while.body:                                       ; preds = %while.cond
+  %incdec.ptr = getelementptr inbounds float, float* %a.addr.0, i32 1
+  %tmp = load float, float* %a.addr.0, align 4
+  %incdec.ptr1 = getelementptr inbounds float, float* %b.addr.0, i32 1
+  %tmp1 = load float, float* %b.addr.0, align 4
+  %add = fadd float %tmp, %tmp1
+  %incdec.ptr2 = getelementptr inbounds float, float* %c.addr.0, i32 1
+  store float %add, float* %c.addr.0, align 4
+  %dec = add nsw i32 %n.addr.0, -1
+  br label %while.cond
+
+while.end:                                        ; preds = %while.cond
+  ret void
+}


_______________________________________________
llvm-commits mailing list
llvm-commits at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits

