[PATCH] D98714: [SLP] Add insertelement instructions to vectorizable tree

Anton Afanasyev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 10 12:12:00 PDT 2021


anton-afanasyev marked 4 inline comments as done.
anton-afanasyev added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4215-4217
+    // No extract cost for vector "scalar"
+    if (isa<FixedVectorType>(EU.Scalar->getType()))
+      continue;
----------------
ABataev wrote:
> I don't think this is correct. I think you need to use code from my patch D101555 for better cost estimation here.
Why? We don't generate any extract instructions for external using of vector, so no need to cost it.

I think you mixed two different cases up.
The patch D101555 you referenced is about cost estimation when insert is _user_, but here is the case when insert is _used_. We do not really need to "extract" it, since its user uses vector value rather than scalar one.

Also I don't think we need to use code from D101555 in this patch, since it does the same by the other way. The main idea of this patch is to unify the way we process inserts (the only vector tree node for now) vs ordinary tree nodes. The inserts are tree node now, and they are sorted by index, so no need to shuffle them. We could use `ReorderIndicies` if needed, but no need, since operands are sorted as well.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:5102
     Value *Lane = Builder.getInt32(ExternalUse.Lane);
+    auto extractAndExtendIfNeeded = [&](Value *Vec) {
+      if (Scalar->getType() != Vec->getType()) {
----------------
ABataev wrote:
> `ExtractAndExtendIfNeeded`
Sure, done.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:2883
+
+      if (IsConsecutive) {
+        TreeEntry *TE = newTreeEntry(VL, Bundle /*vectorized*/, S, UserTreeIdx);
----------------
ABataev wrote:
> anton-afanasyev wrote:
> > ABataev wrote:
> > > If not consecutive, we can include the cost of single permute and still vectorize it, no?
> > Inserts here are coming from `findBuildAggregate()`, being already sorted by index and therefore consecutive, the only exclusion is that they can have gaps, so we check it here. I don't think we should process this rare case within this patch.
> Can we check for the gaps during the analysis of the build vector sequence instead?
I've decided to check it here since we can get other sources of incoming insert bundle in future.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98714/new/

https://reviews.llvm.org/D98714



More information about the llvm-commits mailing list