[llvm] r326949 - [AArch64] add missing pattern for insert_subvector undef
Sebastian Pop via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 7 14:07:14 PST 2018
Author: spop
Date: Wed Mar 7 14:07:13 2018
New Revision: 326949
URL: http://llvm.org/viewvc/llvm-project?rev=326949&view=rev
Log:
[AArch64] add missing pattern for insert_subvector undef
The attached testcase started failing after the patch that defined
isExtractSubvectorCheap, with the following pattern mismatch:
ISEL: Starting pattern match
Initial Opcode index to 85068
Match failed at index 85076
LLVM ERROR: Cannot select: t47: v8i16 = insert_subvector undef:v8i16, t43, Constant:i64<0>
The matcher code generated from this pattern in llvm/lib/Target/AArch64/AArch64InstrInfo.td:
def : Pat<(insert_subvector undef, (v4i16 FPR64:$src), (i32 0)),
(INSERT_SUBREG (v8i16 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
is in ninja/lib/Target/AArch64/AArch64GenDAGISel.inc.
At the index where the match failed, the generated matcher table contains:
/* 85076*/ OPC_CheckChild2Type, MVT::i32,
The match failed on the type of operand 2: the pattern requires an i32
index, but the node carries Constant:i64<0>.
Adding a second pattern with an i64 index fixes the selection failure:
def : Pat<(insert_subvector undef, (v4i16 FPR64:$src), (i64 0)),
(INSERT_SUBREG (v8i16 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
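
The mismatch described above can be illustrated with a small toy model. This is
only a sketch under stated assumptions, not LLVM's matcher: the names Pattern,
Node, matches and select are hypothetical, and the subvector operand t43 is
assumed to be v4i16 (inferred from the v8i16 result type in the error message).

```python
# Toy model of the index-type check that failed; NOT LLVM's actual matcher.
# All names here (Pattern, Node, matches, select) are hypothetical.
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pattern:
    subvec_type: str  # type of the inserted 64-bit subvector, e.g. "v4i16"
    index_type: str   # type the pattern requires for the constant index


@dataclass(frozen=True)
class Node:
    subvec_type: str
    index_type: str
    index_value: int


def matches(pattern: Pattern, node: Node) -> bool:
    # Mirrors the idea behind OPC_CheckChild2Type: the index operand's type
    # must equal the type written in the pattern, and the index must be 0.
    return (
        node.subvec_type == pattern.subvec_type
        and node.index_type == pattern.index_type
        and node.index_value == 0
    )


def select(patterns: List[Pattern], node: Node) -> Optional[Pattern]:
    # Return the first matching pattern, or None (the "Cannot select" case).
    for pattern in patterns:
        if matches(pattern, node):
            return pattern
    return None


# Node from the error message, assuming t43 is a v4i16 value.
failing_node = Node(subvec_type="v4i16", index_type="i64", index_value=0)

before = [Pattern("v4i16", "i32")]            # only the i32-index pattern
after = before + [Pattern("v4i16", "i64")]    # i64-index pattern added

print(select(before, failing_node))
print(select(after, failing_node))
```

With only the i32 pattern the toy selector finds nothing for the i64-indexed
node; once the i64 variant is present, the same node is matched.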
Added:
llvm/trunk/test/CodeGen/AArch64/aarch64-insert-subvector-undef.ll
Modified:
llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td
Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td?rev=326949&r1=326948&r2=326949&view=diff
==============================================================================
--- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td (original)
+++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td Wed Mar 7 14:07:13 2018
@@ -6183,20 +6183,25 @@ def : Pat<(v1i64 (extract_subvector (v2i
// A 64-bit subvector insert to the first 128-bit vector position
// is a subregister copy that needs no instruction.
-def : Pat<(insert_subvector undef, (v1i64 FPR64:$src), (i32 0)),
- (INSERT_SUBREG (v2i64 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
-def : Pat<(insert_subvector undef, (v1f64 FPR64:$src), (i32 0)),
- (INSERT_SUBREG (v2f64 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
-def : Pat<(insert_subvector undef, (v2i32 FPR64:$src), (i32 0)),
- (INSERT_SUBREG (v4i32 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
-def : Pat<(insert_subvector undef, (v2f32 FPR64:$src), (i32 0)),
- (INSERT_SUBREG (v4f32 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
-def : Pat<(insert_subvector undef, (v4i16 FPR64:$src), (i32 0)),
- (INSERT_SUBREG (v8i16 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
-def : Pat<(insert_subvector undef, (v4f16 FPR64:$src), (i32 0)),
- (INSERT_SUBREG (v8f16 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
-def : Pat<(insert_subvector undef, (v8i8 FPR64:$src), (i32 0)),
- (INSERT_SUBREG (v16i8 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
+multiclass InsertSubvectorUndef<ValueType Ty> {
+ def : Pat<(insert_subvector undef, (v1i64 FPR64:$src), (Ty 0)),
+ (INSERT_SUBREG (v2i64 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
+ def : Pat<(insert_subvector undef, (v1f64 FPR64:$src), (Ty 0)),
+ (INSERT_SUBREG (v2f64 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
+ def : Pat<(insert_subvector undef, (v2i32 FPR64:$src), (Ty 0)),
+ (INSERT_SUBREG (v4i32 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
+ def : Pat<(insert_subvector undef, (v2f32 FPR64:$src), (Ty 0)),
+ (INSERT_SUBREG (v4f32 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
+ def : Pat<(insert_subvector undef, (v4i16 FPR64:$src), (Ty 0)),
+ (INSERT_SUBREG (v8i16 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
+ def : Pat<(insert_subvector undef, (v4f16 FPR64:$src), (Ty 0)),
+ (INSERT_SUBREG (v8f16 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
+ def : Pat<(insert_subvector undef, (v8i8 FPR64:$src), (Ty 0)),
+ (INSERT_SUBREG (v16i8 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
+}
+
+defm : InsertSubvectorUndef<i32>;
+defm : InsertSubvectorUndef<i64>;
// Use pair-wise add instructions when summing up the lanes for v2f64, v2i64
// or v2f32.
Added: llvm/trunk/test/CodeGen/AArch64/aarch64-insert-subvector-undef.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/aarch64-insert-subvector-undef.ll?rev=326949&view=auto
==============================================================================
--- llvm/trunk/test/CodeGen/AArch64/aarch64-insert-subvector-undef.ll (added)
+++ llvm/trunk/test/CodeGen/AArch64/aarch64-insert-subvector-undef.ll Wed Mar 7 14:07:13 2018
@@ -0,0 +1,21 @@
+; RUN: llc -mtriple=aarch64-none-linux-gnu -mattr=+neon < %s
+
+; Check that this does not ICE.
+
+@d = common dso_local local_unnamed_addr global <4 x i16> zeroinitializer, align 8
+
+define <8 x i16> @c(i32 %e) {
+entry:
+ %0 = load <4 x i16>, <4 x i16>* @d, align 8
+ %vminv = tail call i32 @llvm.aarch64.neon.uminv.i32.v4i16(<4 x i16> %0)
+ %1 = trunc i32 %vminv to i16
+ %vecinit3 = insertelement <4 x i16> <i16 undef, i16 undef, i16 0, i16 0>, i16 %1, i32 1
+ %call = tail call <8 x i16> @c(i32 0) #3
+ %vgetq_lane = extractelement <8 x i16> %call, i32 0
+ %vset_lane = insertelement <4 x i16> %vecinit3, i16 %vgetq_lane, i32 0
+ %call4 = tail call i32 bitcast (i32 (...)* @k to i32 (<4 x i16>)*)(<4 x i16> %vset_lane) #3
+ ret <8 x i16> undef
+}
+
+declare i32 @llvm.aarch64.neon.uminv.i32.v4i16(<4 x i16>)
+declare i32 @k(...)