[PATCH] An infinite loop bug in DAG Combine about keeping transfering between any_extend and sign_extend
Hao Liu
Hao.Liu at arm.com
Wed Apr 16 02:25:19 PDT 2014
Hi t.p.northover,
Hi Tim and other reviewers,
A simple test case can cause the DAG combiner keeps runing endlessly. To reproduce by using the test case in the patch:
$llc -march=arm64 -debug-only=dagcombine < test.ll
...
Replacing.2 0x3d08ad8: v4i32 = sign_extend 0x3d089d0 [ORD=3] [ID=-3]
With: 0x3d08190: v4i32 = any_extend 0x3d089d0 [ORD=3]
Replacing.3 0x3d08190: v4i32 = any_extend 0x3d089d0 [ORD=3]
With: 0x3d08ad8: v4i32 = sign_extend 0x3d089d0 [ORD=3]
Replacing.2 0x3d08ad8: v4i32 = sign_extend 0x3d089d0 [ORD=3]
With: 0x3d08190: v4i32 = any_extend 0x3d089d0 [ORD=3]
...
It tries to transfer sign_extend to any_extend, after that it will transfer back from any_extend to sign_extend ...
To fix this, one side of transformation should be stopped. As I think it makes no sense to transfer from any_extend to sign_extend ( it replace undefined bits with signed bits), this patch will stop transfering any_extend to sign_extend.
A regression test vselect.ll in ARM64 backend will fail. With this patch, it will generate 3 shift instructions other than 1 SSHLL instruction. To let it pass, I remove a CHECK conditon in vselect.ll. This can be improved in ARM64 backend in the future, so I also add a FIXME in vselect.ll.
Review please.
Thanks,
-Hao
http://reviews.llvm.org/D3397
Files:
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
test/CodeGen/ARM64/2014-04-16-AnInfiniteLoopInDAGCombine.ll
test/CodeGen/ARM64/vselect.ll
Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp
===================================================================
--- lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -5478,7 +5478,10 @@
}
if (N0.getOpcode() == ISD::SETCC) {
- // aext(setcc) -> sext_in_reg(vsetcc) for vectors.
+ // For vectors:
+ // aext(setcc) -> vsetcc
+ // aext(setcc) -> truncate(vsetcc)
+ // aext(setcc) -> aext(vsetcc)
// Only do this before legalize for now.
if (VT.isVector() && !LegalOperations) {
EVT N0VT = N0.getOperand(0).getValueType();
@@ -5493,19 +5496,14 @@
cast<CondCodeSDNode>(N0.getOperand(2))->get());
// If the desired elements are smaller or larger than the source
// elements we can use a matching integer vector type and then
- // truncate/sign extend
+ // truncate/any extend
else {
- EVT MatchingElementType =
- EVT::getIntegerVT(*DAG.getContext(),
- N0VT.getScalarType().getSizeInBits());
- EVT MatchingVectorType =
- EVT::getVectorVT(*DAG.getContext(), MatchingElementType,
- N0VT.getVectorNumElements());
+ EVT MatchingVectorType = N0VT.changeVectorElementTypeToInteger();
SDValue VsetCC =
DAG.getSetCC(SDLoc(N), MatchingVectorType, N0.getOperand(0),
N0.getOperand(1),
cast<CondCodeSDNode>(N0.getOperand(2))->get());
- return DAG.getSExtOrTrunc(VsetCC, SDLoc(N), VT);
+ return DAG.getAnyExtOrTrunc(VsetCC, SDLoc(N), VT);
}
}
Index: test/CodeGen/ARM64/2014-04-16-AnInfiniteLoopInDAGCombine.ll
===================================================================
--- /dev/null
+++ test/CodeGen/ARM64/2014-04-16-AnInfiniteLoopInDAGCombine.ll
@@ -0,0 +1,23 @@
+; RUN: llc < %s -march=arm64
+
+; This test case tests an infinite loop bug in DAG combiner.
+; It just tries to do the following replacing endlessly:
+; (1) Replacing.3 0x2c509f0: v4i32 = any_extend 0x2c4cd08 [ORD=4]
+; With: 0x2c4d128: v4i32 = sign_extend 0x2c4cd08 [ORD=4]
+;
+; (2) Replacing.2 0x2c4d128: v4i32 = sign_extend 0x2c4cd08 [ORD=4]
+; With: 0x2c509f0: v4i32 = any_extend 0x2c4cd08 [ORD=4]
+; As we think the (2) optimization from SIGN_EXTEND to ANY_EXTEND is
+; an optimization to replace unused bits with undefined bits, we remove
+; the (1) optimization (It doesn't make sense to replace undefined bits
+; with signed bits).
+
+define <4 x i32> @infiniteLoop(<4 x i32> %in0, <4 x i16> %in1) {
+entry:
+ %cmp.i = icmp sge <4 x i16> %in1, <i16 32767, i16 32767, i16 -1, i16 -32768>
+ %sext.i = sext <4 x i1> %cmp.i to <4 x i32>
+ %mul.i = mul <4 x i32> %in0, %sext.i
+ %sext = shl <4 x i32> %mul.i, <i32 16, i32 16, i32 16, i32 16>
+ %vmovl.i.i = ashr <4 x i32> %sext, <i32 16, i32 16, i32 16, i32 16>
+ ret <4 x i32> %vmovl.i.i
+}
\ No newline at end of file
Index: test/CodeGen/ARM64/vselect.ll
===================================================================
--- test/CodeGen/ARM64/vselect.ll
+++ test/CodeGen/ARM64/vselect.ll
@@ -2,7 +2,14 @@
;CHECK: @func63
;CHECK: cmeq.4h v0, v0, v1
-;CHECK: sshll.4s v0, v0, #0
+
+;FIXME: currently, it will generate 3 instructions:
+; ushll.4s v0, v0, #0
+; shl.4s v0, v0, #31
+; sshr.4s v0, v0, #31
+;But these instrucitons can be optimized into 1 instruction:
+; sshll.4s v0, v0, #0
+
;CHECK: bsl.16b v0, v2, v3
;CHECK: str q0, [x0]
;CHECK: ret
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D3397.1.patch
Type: text/x-patch
Size: 3522 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140416/a21fea3b/attachment.bin>
More information about the llvm-commits
mailing list