[PATCH] D9804: Optimize scattered vector insert/extract pattern

Fri Sep 25 14:40:05 PDT 2015

hulx2000 marked 3 inline comments as done.
hulx2000 added a comment.

Just saw comments from Hal and Nadav.

For Hal's comments:

1. If the original ext has more than one use, it can't be deleted after my transformation, so the transformation may not gain anything. That's why I check hasOneUse() on it.
2. I agree. This transformation is designed for AArch64, so I could make it AArch64-specific.
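
To illustrate the hasOneUse() reasoning in point 1, here is a minimal sketch that uses a toy value type instead of the real LLVM classes. `ToyValue` and `extractDiesAfterRewrite` are hypothetical names invented for this illustration; they are not part of the patch or of the LLVM API.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for an IR value (hypothetical, NOT llvm::Value).
// It only records which other values consume it.
struct ToyValue {
    std::vector<const ToyValue*> users;

    // Mirrors the idea of a one-use check: exactly one consumer.
    bool hasOneUse() const { return users.size() == 1; }
};

// Hypothetical helper: if the extract feeds only the instruction being
// rewritten, the extract has no remaining users after the rewrite and can
// be deleted. With any additional user it must stay, so the rewrite adds
// work without removing the original extract.
bool extractDiesAfterRewrite(const ToyValue& ext) {
    return ext.hasOneUse();
}
```

With a single consumer the extract disappears after the rewrite; with two consumers it has to be kept, which is the case the check is meant to filter out.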

For Nadav's comment "We are already doing these kind of optimizations in SelectionDAG. The SLPVectorizer is not the right place for this kind of transformation": do you mean I shouldn't do my transformation in SLPVectorizer? At least in our case, SelectionDAG is unable to catch the pattern, and that caused a performance loss.

I will address the rest of the coding comments in another patch update.

Thanks

================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:68
@@ +67,3 @@
+SLPScatter("slp-vectorize-scatter", cl::init(false), cl::Hidden,
+           cl::desc("Attempt to vectorize extract vector sequence"));
+
----------------
nadav wrote:
> Why do we need two flags for insert and extract?  Do you feel like this feature is experimental?
> 
> Did you run some performance measurements on the llvm test suite?  Are you seeing any wins?
I can remove those two flags.

I measured our internal benchmark and did see wins. I will also run performance measurements on the LLVM test suite.

Repository:
  rL LLVM

http://reviews.llvm.org/D9804