[PATCH] D98714: [SLP] Add insertelement instructions to vectorizable tree

Wed Mar 17 09:17:58 PDT 2021

RKSimon added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:1578
     /// (either with vector instruction or with scatter/gather
     /// intrinsics for store/load)?
+    enum EntryState { Vectorize, ScatterVectorize, NeedToGather, MayBeRemoved };
----------------
Update the comment above so that it also covers the new MayBeRemoved state.

================
Comment at: llvm/test/Transforms/SLPVectorizer/AMDGPU/bswap-inseltpoison.ll:10

 ; GFX8: call <2 x i16> @llvm.bswap.v2i16(
 define <2 x i16> @bswap_v2i16(<2 x i16> %arg) {
----------------
Regenerate and commit these files in trunk first. You will need to remove these manually added CHECK lines; otherwise FileCheck will fail.

================
Comment at: llvm/test/Transforms/SLPVectorizer/AMDGPU/bswap.ll:10

 ; GFX8: call <2 x i16> @llvm.bswap.v2i16(
 define <2 x i16> @bswap_v2i16(<2 x i16> %arg) {
----------------
Regenerate and commit these files in trunk first. You will need to remove these manually added CHECK lines; otherwise FileCheck will fail.

================
Comment at: llvm/test/Transforms/SLPVectorizer/AMDGPU/round-inseltpoison.ll:10

 ; GFX8: call <2 x half> @llvm.round.v2f16(
 define <2 x half> @round_v2f16(<2 x half> %arg) {
----------------
Regenerate and commit these files in trunk first. You will need to remove these manually added CHECK lines; otherwise FileCheck will fail.

================
Comment at: llvm/test/Transforms/SLPVectorizer/AMDGPU/round.ll:10

 ; GFX8: call <2 x half> @llvm.round.v2f16(
 define <2 x half> @round_v2f16(<2 x half> %arg) {
----------------
Regenerate and commit these files in trunk first. You will need to remove these manually added CHECK lines; otherwise FileCheck will fail.

================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/hadd.ll:346
+; SSE-NEXT:    [[RV11:%.*]] = shufflevector <16 x i16> [[RV7]], <16 x i16> [[TMP10]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 16, i32 17, i32 18, i32 19, i32 undef, i32 undef, i32 undef, i32 undef>
+; SSE-NEXT:    [[RV15:%.*]] = shufflevector <16 x i16> [[RV11]], <16 x i16> [[TMP14]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 16, i32 17, i32 18, i32 19>
 ; SSE-NEXT:    ret <16 x i16> [[RV15]]
----------------
Regression: this is going to cause problems for HADD matching. Ideally it would use <8 x i16>.
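
To make the two shuffle masks in the CHECK lines above easier to read, here is a minimal Python sketch that models shufflevector lane selection. It is an illustration only, not LLVM code: the `shufflevector` helper, the `UNDEF` marker, and the string lane labels are all assumptions introduced for this example. Under that model, only lanes 0-3 of TMP10 and TMP14 end up in the result, even though every operand is 16 lanes wide.

```python
# Illustrative model of shufflevector lane selection; hypothetical helper,
# not part of LLVM or of this patch.
UNDEF = None  # stands in for an "undef" mask element


def shufflevector(a, b, mask):
    """Pick lanes from the concatenation a ++ b; UNDEF mask entries stay UNDEF."""
    assert len(a) == len(b)
    joined = list(a) + list(b)
    return [UNDEF if i is UNDEF else joined[i] for i in mask]


# Placeholder lane labels for the three <16 x i16> operands in the CHECK lines.
rv7 = [f"rv7[{i}]" for i in range(16)]
tmp10 = [f"tmp10[{i}]" for i in range(16)]
tmp14 = [f"tmp14[{i}]" for i in range(16)]

# Masks copied from the RV11 and RV15 CHECK lines.
mask_rv11 = list(range(8)) + [16, 17, 18, 19] + [UNDEF] * 4
mask_rv15 = list(range(12)) + [16, 17, 18, 19]

rv11 = shufflevector(rv7, tmp10, mask_rv11)
rv15 = shufflevector(rv11, tmp14, mask_rv15)

# Lanes 0-7 come from RV7, 8-11 from TMP10[0..3], 12-15 from TMP14[0..3].
print(rv15)
```

The upper twelve lanes of TMP10 and TMP14 are never read by these two shuffles.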

================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/hsub-inseltpoison.ll:346
+; SSE-NEXT:    [[RV11:%.*]] = shufflevector <16 x i16> [[RV7]], <16 x i16> [[TMP10]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 16, i32 17, i32 18, i32 19, i32 undef, i32 undef, i32 undef, i32 undef>
+; SSE-NEXT:    [[RV15:%.*]] = shufflevector <16 x i16> [[RV11]], <16 x i16> [[TMP14]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 16, i32 17, i32 18, i32 19>
 ; SSE-NEXT:    ret <16 x i16> [[RV15]]
----------------
Regression (same CHECK lines as in hadd.ll)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98714/new/

https://reviews.llvm.org/D98714