[PATCH] D26083: [LV] Scalarize operands of predicated instructions
Gil Rapaport via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 7 08:14:45 PST 2016
gilr added inline comments.
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4678
isa<DbgInfoIntrinsic>(&I)) &&
- Legal->isScalarAfterVectorization(&I)) {
- scalarizeInstruction(&I);
+ (Legal->isScalarAfterVectorization(&I) ||
+ Cost->isProfitableToScalarize(&I, VF))) {
----------------
mkuper wrote:
> We have 5 places that call isScalarAfterVectorization().
> Is this the only call site that cares about this? (If all of them should care, perhaps wrap in a helper function?)
I think Michael's comment raises a more general question: should it matter why we scalarize?
Put differently, once an instruction is marked for scalarization by legal or cost, shouldn't we treat it uniformly in the code?
[if so, then the question "will the instruction be scalarized" would only be well-defined in the context of some VF]
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:6583
+ DenseMap<Instruction *, unsigned> ScalarCosts;
+ if (computePredInstDiscount(&I, ScalarCosts, VF) >= 0)
+ InstsToScalarizeVF.insert(ScalarCosts.begin(), ScalarCosts.end());
----------------
mkuper wrote:
> Just trying to understand if I got this right.
> Let's say we have:
>
> ```
> %a = add i32, ...
> %r = udiv i32 %x, %a
> %s = udiv i32 %y, %a
> %t = udiv i32 %z, %a
> ```
> In the same block.
> Will we do the right thing evaluating the cost of scalarizing the add in a separate context for each udiv?
Actually in such a case it seems the addition gets scalarized but cannot be sunk down because of the multiple uses.
I also wonder if we calculate the cost correctly when one predicated instruction feeds another.
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:6605
+ Worklist.push_back(PredInst);
+ while (!Worklist.empty()) {
+ Instruction *I = Worklist.pop_back_val();
----------------
The following loop causes the patch to assert
void foo(int* restrict a, int b, int* restrict c) {
for (int i = 0; i < 10000; ++i) {
if (a[i] > 777) {
int t2 = c[i];
int t3 = t2 / b;
a[i] += t3;
}
}
}
while trying to scalarize a uniform value (the predicated load).
Got this example to work by skipping I if any of its operands isUniformAfterVectorization().
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:6624
+ if (Legal->isScalarWithPredication(I) && !I->getType()->isVoidTy())
+ ScalarCost += getScalarizationOverhead(ToVectorTy(I->getType(), VF), true,
+ false, TTI);
----------------
Wasn't the cost of insert-element accounted for by VectorCost?
https://reviews.llvm.org/D26083
More information about the llvm-commits
mailing list