[PATCH] D87769: [ARM][MVE] tail-predication: predicate new checks on force-enabled option
Sjoerd Meijer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 16 08:44:05 PDT 2020
SjoerdMeijer created this revision.
SjoerdMeijer added reviewers: dmgreen, samparker, samtebbs.
Herald added subscribers: danielkiss, hiraditya, kristof.beyls.
Herald added a project: LLVM.
SjoerdMeijer requested review of this revision.
rG635b87511ec3 <https://reviews.llvm.org/rG635b87511ec3d6d2fa8f65a3ed1876f01367584e> added sanity checks on the second argument of get.active.lane.mask, the loop trip count/element count. Like the existing overflow checks, skip these new checks when tail-predication is force-enabled.
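To illustrate the control flow this change is after, here is a minimal standalone sketch. It is not the code in MVETailPredication.cpp: the `ElementCount` struct, its fields, and `elementCountIsAcceptable` are hypothetical names invented for this example, and the body of each check is reduced to a placeholder. It assumes that the constant element-count comparison still runs when tail-predication is forced, and only the runtime-value smoke tests are gated on the option.

```cpp
#include <cstdint>

// Hypothetical summary of what the pass knows about the second argument of
// get.active.lane.mask. None of these names exist in LLVM.
struct ElementCount {
  bool IsConstant;              // element count is a compile-time constant
  uint64_t Value;               // meaningful only when IsConstant is true
  bool PassesRuntimeSmokeTests; // placeholder for the runtime-value checks
};

// Returns true if tail-predication may proceed for this element count.
bool elementCountIsAcceptable(const ElementCount &EC,
                              uint64_t ExpectedTripCount,
                              bool ForceTailPredication) {
  if (EC.IsConstant) {
    // Constant case: compare against the expected trip count. In this
    // sketch the comparison is not affected by the force option.
    if (EC.Value != ExpectedTripCount)
      return false;
  } else if (!ForceTailPredication) {
    // Runtime case: run the smoke tests only when tail-predication has
    // not been force-enabled.
    if (!EC.PassesRuntimeSmokeTests)
      return false;
  }
  return true;
}
```

With the force option set, a runtime element count that would fail the smoke tests is accepted anyway, which matches the stated intent of treating these checks like the overflow checks.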
https://reviews.llvm.org/D87769
Files:
llvm/lib/Target/ARM/MVETailPredication.cpp
llvm/test/CodeGen/Thumb2/LowOverheadLoops/cond-vector-reduce-mve-codegen.ll
llvm/test/CodeGen/Thumb2/LowOverheadLoops/iv-two-vcmp.mir
llvm/test/CodeGen/Thumb2/LowOverheadLoops/iv-vcmp.mir
llvm/test/CodeGen/Thumb2/mve-pred-vctpvpsel.ll
Index: llvm/test/CodeGen/Thumb2/mve-pred-vctpvpsel.ll
===================================================================
--- llvm/test/CodeGen/Thumb2/mve-pred-vctpvpsel.ll
+++ llvm/test/CodeGen/Thumb2/mve-pred-vctpvpsel.ll
@@ -19,8 +19,7 @@
; CHECK-NEXT: .LBB0_1: @ %do.body
; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
; CHECK-NEXT: vldrw.u32 q4, [r0], #16
-; CHECK-NEXT: vcmp.f32 ge, q1, q4
-; CHECK-NEXT: vpstt
+; CHECK-NEXT: vptt.f32 ge, q1, q4
; CHECK-NEXT: vmovt q1, q4
; CHECK-NEXT: vmovt q0, q2
; CHECK-NEXT: vadd.i32 q2, q2, q3
Index: llvm/test/CodeGen/Thumb2/LowOverheadLoops/iv-vcmp.mir
===================================================================
--- llvm/test/CodeGen/Thumb2/LowOverheadLoops/iv-vcmp.mir
+++ llvm/test/CodeGen/Thumb2/LowOverheadLoops/iv-vcmp.mir
@@ -110,8 +110,7 @@
; CHECK: bb.2.vector.body:
; CHECK: successors: %bb.2(0x7c000000), %bb.3(0x04000000)
; CHECK: liveins: $lr, $q0, $q1, $q2, $r0, $r1
- ; CHECK: renamable $vpr = MVE_VCMPu32 renamable $q1, renamable $q0, 8, 0, killed $noreg
- ; CHECK: MVE_VPST 4, implicit $vpr
+ ; CHECK: MVE_VPTv4u32 4, renamable $q1, renamable $q0, 8, implicit-def $vpr
; CHECK: renamable $r1, renamable $q3 = MVE_VLDRWU32_post killed renamable $r1, 16, 1, renamable $vpr :: (load 16 from %ir.lsr.iv35, align 4)
; CHECK: renamable $r0 = MVE_VSTRWU32_post killed renamable $q3, killed renamable $r0, 16, 1, killed renamable $vpr :: (store 16 into %ir.lsr.iv12, align 4)
; CHECK: renamable $q0 = MVE_VADDi32 killed renamable $q0, renamable $q2, 0, $noreg, undef renamable $q0
Index: llvm/test/CodeGen/Thumb2/LowOverheadLoops/iv-two-vcmp.mir
===================================================================
--- llvm/test/CodeGen/Thumb2/LowOverheadLoops/iv-two-vcmp.mir
+++ llvm/test/CodeGen/Thumb2/LowOverheadLoops/iv-two-vcmp.mir
@@ -118,8 +118,7 @@
; CHECK: bb.2.vector.body:
; CHECK: successors: %bb.2(0x7c000000), %bb.3(0x04000000)
; CHECK: liveins: $lr, $q0, $q1, $q2, $q3, $r0, $r1
- ; CHECK: renamable $vpr = MVE_VCMPu32 renamable $q1, renamable $q0, 8, 0, killed $noreg
- ; CHECK: MVE_VPST 2, implicit $vpr
+ ; CHECK: MVE_VPTv4u32 2, renamable $q1, renamable $q0, 8, implicit-def $vpr
; CHECK: renamable $vpr = MVE_VCMPu32 renamable $q0, renamable $q2, 2, 1, killed renamable $vpr
; CHECK: renamable $r1, renamable $q4 = MVE_VLDRWU32_post killed renamable $r1, 16, 1, renamable $vpr :: (load 16 from %ir.lsr.iv35, align 4)
; CHECK: renamable $r0 = MVE_VSTRWU32_post killed renamable $q4, killed renamable $r0, 16, 1, killed renamable $vpr :: (store 16 into %ir.lsr.iv12, align 4)
Index: llvm/test/CodeGen/Thumb2/LowOverheadLoops/cond-vector-reduce-mve-codegen.ll
===================================================================
--- llvm/test/CodeGen/Thumb2/LowOverheadLoops/cond-vector-reduce-mve-codegen.ll
+++ llvm/test/CodeGen/Thumb2/LowOverheadLoops/cond-vector-reduce-mve-codegen.ll
@@ -408,8 +408,7 @@
; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
; CHECK-NEXT: adds r3, #4
; CHECK-NEXT: vldrw.u32 q0, [r1], #16
-; CHECK-NEXT: vcmp.i32 ne, q0, zr
-; CHECK-NEXT: vpst
+; CHECK-NEXT: vpt.i32 ne, q0, zr
; CHECK-NEXT: vldrwt.u32 q1, [r0]
; CHECK-NEXT: vmul.i32 q0, q1, q0
; CHECK-NEXT: vpst
Index: llvm/lib/Target/ARM/MVETailPredication.cpp
===================================================================
--- llvm/lib/Target/ARM/MVETailPredication.cpp
+++ llvm/lib/Target/ARM/MVETailPredication.cpp
@@ -411,7 +411,7 @@
<< TC2 << " from get.active.lane.mask\n");
return false;
}
- } else {
+ } else if (!ForceTailPredication) {
// Smoke tests if the element count is a runtime value. I.e., this isn't
// fully generic because that would require a full SCEV visitor here. It
// would require extracting the variable from the elementcount SCEV
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D87769.292233.patch
Type: text/x-patch
Size: 3923 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200916/c69a320a/attachment.bin>