[llvm] r235243 - [AArch64] Don't force MVT::Untyped when selecting LD1LANEpost.
Ahmed Bougacha
ahmed.bougacha at gmail.com
Fri Apr 17 16:43:33 PDT 2015
Author: ab
Date: Fri Apr 17 18:43:33 2015
New Revision: 235243
URL: http://llvm.org/viewvc/llvm-project?rev=235243&view=rev
Log:
[AArch64] Don't force MVT::Untyped when selecting LD1LANEpost.
The result is either an Untyped reg sequence (on ldN with N > 1) or just
the input vector type (on ld1). Don't force Untyped; instead, just use the
type of the reg sequence.
This mirrors the behavior of createTuple, which feeds the LD1*_POST.
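For reference, a minimal sketch of that tuple-building behavior (paraphrased
from createTuple/createQTuple in AArch64ISelDAGToDAG.cpp; the helper name and
the pre-built REG_SEQUENCE operand list here are simplifications for
illustration, not the actual implementation):

  // Sketch only: a one-register "tuple" is just the register itself, so its
  // value type is the input vector type (e.g. v4i16/v8i16); for N > 1
  // registers an MVT::Untyped REG_SEQUENCE is built instead. The post-inc
  // load's first result type should follow this, i.e. RegSeq->getValueType(0),
  // rather than a hardcoded MVT::Untyped.
  static SDValue createTupleSketch(SelectionDAG &DAG, ArrayRef<SDValue> Regs,
                                   ArrayRef<SDValue> RegSeqOps) {
    if (Regs.size() == 1)
      return Regs[0];                         // type: the vector type itself
    SDLoc DL(Regs[0]);
    // RegSeqOps is assumed to already hold the register-class ID and the
    // (register, subregister-index) pairs.
    SDNode *Seq = DAG.getMachineNode(TargetOpcode::REG_SEQUENCE, DL,
                                     MVT::Untyped, RegSeqOps);
    return SDValue(Seq, 0);                   // type: MVT::Untyped
  }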
The narrow code path wasn't actually covered by tests, because V64
insert_vector_elt nodes are usually widened to V128 before the LD1LANEpost
combine has a chance to run.
The only case where the combine does run on V64 vectors is when the vector
ops legalizer has already run. So, tickle the code with a ctpop.
Fixes PR23265.
Modified:
llvm/trunk/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
llvm/trunk/test/CodeGen/AArch64/arm64-indexed-vector-ldst.ll
Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp?rev=235243&r1=235242&r2=235243&view=diff
==============================================================================
--- llvm/trunk/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp (original)
+++ llvm/trunk/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp Fri Apr 17 18:43:33 2015
@@ -1234,7 +1234,7 @@ SDNode *AArch64DAGToDAGISel::SelectPostL
   SDValue RegSeq = createQTuple(Regs);
   const EVT ResTys[] = {MVT::i64, // Type of the write back register
-                        MVT::Untyped, MVT::Other};
+                        RegSeq->getValueType(0), MVT::Other};
   unsigned LaneNo =
       cast<ConstantSDNode>(N->getOperand(NumVecs + 1))->getZExtValue();
Modified: llvm/trunk/test/CodeGen/AArch64/arm64-indexed-vector-ldst.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-indexed-vector-ldst.ll?rev=235243&r1=235242&r2=235243&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AArch64/arm64-indexed-vector-ldst.ll (original)
+++ llvm/trunk/test/CodeGen/AArch64/arm64-indexed-vector-ldst.ll Fri Apr 17 18:43:33 2015
@@ -6193,3 +6193,25 @@ define <4 x float> @test_v4f32_post_reg_
   store float* %tmp3, float** %ptr
   ret <4 x float> %tmp2
 }
+
+; Make sure that we test the narrow V64 code path.
+; The tests above don't, because there, 64-bit insert_vector_elt nodes will be
+; widened to 128-bit before the LD1LANEpost combine has the chance to run,
+; making it avoid narrow vector types.
+; One way to trick that combine into running early is to force the vector ops
+; legalizer to run. We achieve that using the ctpop.
+; PR23265
+define <4 x i16> @test_v4i16_post_reg_ld1lane_forced_narrow(i16* %bar, i16** %ptr, i64 %inc, <4 x i16> %A, <2 x i32>* %d) {
+; CHECK-LABEL: test_v4i16_post_reg_ld1lane_forced_narrow:
+; CHECK: ld1.h { v0 }[1], [x0], x{{[0-9]+}}
+ %tmp1 = load i16, i16* %bar
+ %tmp2 = insertelement <4 x i16> %A, i16 %tmp1, i32 1
+ %tmp3 = getelementptr i16, i16* %bar, i64 %inc
+ store i16* %tmp3, i16** %ptr
+ %dl = load <2 x i32>, <2 x i32>* %d
+ %dr = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> %dl)
+ store <2 x i32> %dr, <2 x i32>* %d
+ ret <4 x i16> %tmp2
+}
+
+declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32>)