[PATCH] D28959: [SLP] Improve horizontal vectorization for non-power-of-2 number of instructions.
Alexey Bataev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 23 03:29:11 PST 2017
ABataev marked 2 inline comments as done.
ABataev added inline comments.
================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:4275
Builder.setFastMathFlags(Unsafe);
- unsigned i = 0;
+ unsigned I = 0;
----------------
mkuper wrote:
> As long as you're changing this, maybe name this "Idx" (or leave it as "i" - this is against convention in the strict sense, but that's one of the common exceptions). But I'm also ok with "I" if you think it's better.
>
> Also, this is exactly an example of what I was talking about on the other patch - you could commit the name change separately, as a preparatory patch, and then this patch would only include functional changes, which makes it much easier to review.
Reverted it back. clang-tidy made this change.
================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:4291
// Estimate cost.
- int Cost = V.getTreeCost() + getReductionCost(TTI, ReducedVals[i]);
+ int Cost = V.getTreeCost() + getReductionCost(TTI, VL[0]);
if (Cost >= -SLPCostThreshold)
----------------
mkuper wrote:
> Can you explain why you changed ReducedVals[i] to VL[0] here and below?
> I'm not sure whether this is a correct change or not, but it seems orthogonal to this patch.
Because `ReducedVals[i]` actually is the same as `VL[0]`. Take a look at line 4278. `VL` ArrayRef starts from `ReducedVals[i]`. I had to make this change because changed the logic of updating of `i` counter. But anyway, reworked this part a little bit.
================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:4310
VectorizedTree = ReducedSubTree;
+ ReduxWidth = std::max<unsigned>(4, PowerOf2Floor(NumReducedVals - I));
}
----------------
mkuper wrote:
> This whole ReduxWidth stuff is fishy.
> We use it internally in matchAssociativeReduction(), and then overwrite it just before calling tryToReduce(). Why does tryToReduce use this member variable, instead of having a local (and not having this silliness overwriting it just before calling tryToReduce())?
>
> If this makes sense, then changing this can be another preparatory NFC patch before this one. If not, then why not?
Ok, I'll try to prepare the NFC patch for this member
================
Comment at: test/Transforms/SLPVectorizer/X86/horizontal-list.ll:572
}
----------------
mkuper wrote:
> Could you add a test for something like width 31, so that we get 16 + 8 + 4 + 3?
Ok, will do
https://reviews.llvm.org/D28959
More information about the llvm-commits
mailing list