[PATCH] D9804: Optimize scattered vector insert/extract pattern

Tue Sep 22 16:06:59 PDT 2015

mzolotukhin added a subscriber: mzolotukhin.
mzolotukhin added a comment.

Hi Lawrence,

I haven't looked into this patch in details, but I have a couple of suggestions that would help further review:

1. upload the patch with full context
2. separate independent parts into different patches (e.g. adding ADCE pass after SLP is totally independent on the new stuff you implemented in SLP)
3. Please describe the case you're working on in IR terms, not asm. SLP operates on IR level, so it's easier to grasp what transformation you're seeking if we use IR.

Thanks,
Michael

================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:4346
@@ +4345,3 @@
+  unsigned Count = 0;
+  DenseMap<Instruction *, bool> Visited;
+  InsertElems.clear();
----------------
`SmallPtrSet` could be used here instead.

================
Comment at: test/Transforms/SLPVectorizer/AArch64/combine-insertelement.ll:1
@@ +1,2 @@
+; RUN: opt -S -slp-vectorizer -slp-vectorize-gather -adce  %s | FileCheck %s
+target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
----------------
The new functionality in SLP should be tested independently on other passes. If you're also interested in outcome of subsequent ADCE, then you might want to add *another* test for ADCE (the output of SLP would be the input for ADCE).

================
Comment at: test/Transforms/SLPVectorizer/AArch64/combine-insertelement.ll:45-47
@@ +44,5 @@
+
+define <8 x i16>  @test2(i8* %arrayidx1, i8* %arrayidx2, i8* %arrayidx3,
+; CHECK-LABEL: @test2
+  i8* %arrayidx4, i8* %arrayidx5, i8* %arrayidx6, i8* %arrayidx7, i8* %arrayidx8, <8 x i16> %x)  {
+; CHECK:      %9 = insertelement <8 x i8> undef, i8 %1, i32 0
----------------
I believe that wrapped line would be a syntax error.

================
Comment at: test/Transforms/SLPVectorizer/AArch64/combine-insertelement.ll:86-87
@@ +85,4 @@
+
+; CHECK-LABEL: @test3
+  i8* %arrayidx4, i8* %arrayidx5, i8* %arrayidx6, i8* %arrayidx7, i8* %arrayidx8)  {
+; CHECK:      %9 = insertelement <8 x i8> undef, i8 %1, i32 0
----------------
Some line is missing here.

================
Comment at: test/Transforms/SLPVectorizer/AArch64/combine-insertelement.ll:165
@@ +164,3 @@
+
+; Kept here for future extension
+define <8 x i16>  @test5(i8* %arrayidx1, i8* %arrayidx2)  {
----------------
No reason to add this now - when in future you submit another patch with the extension, you'll be asked to add a testcase.

Repository:
  rL LLVM

http://reviews.llvm.org/D9804