[PATCH] D59995: [LV] Exclude loop-invariant inputs from scalar cost computation.

Tue May 14 15:07:17 PDT 2019

Ayal added a comment.

In D59995#1493241 <https://reviews.llvm.org/D59995#1493241>, @fhahn wrote:

> In D59995#1455952 <https://reviews.llvm.org/D59995#1455952>, @Ayal wrote:
>
> > The culprit here is the assumption made by `TTI.getOperandsScalarizationOverhead(Operands, VF)` that all its Operands will be vectorized according to VF, and would thus require extraction to feed a scalarized/replicated user. But any such Operand might not get vectorized, and possibly must not get vectorized, e.g., due to an incompatible type as demonstrated by PR41294 and the testcase. In some cases an Operand will obviously not be vectorized, such as if it's loop-invariant or live-in. More generally, LV uses the following:
> >
> >   auto needsExtract = [&](Instruction *I) -> bool {
> >     return TheLoop->contains(I) && !isScalarAfterVectorization(I, VF);
> >   };
> >   
> >
> > which would require passing not only `TheLoop` into `getScalarizationOverhead(I, VF, TTI)` but also the CM --- better turn this static function into a method of CM?
>
>
> Done.

Very good, thanks for the refactoring.

> 
> 
>> Note that there's also `CM.isProfitableToScalarize(I, VF))`, but it relies on having computed costs, so difficult to use when computing the cost (of a User). Skipping it would only affect accuracy of resulting cost, considering Operands that can be vectorized but will not be due to profitability.
>> 
>> Fixing `getVectorCallCost` deserves another testcase.
> 
> Added a test case with call. But thinking about it again, it does not really test the issue. Not sure it is actually possible to test getVectorCallCost, as there are no vector call functions that take struct values?

Good point. Does the test case pass w/o this patch?

>> Seems like `getVectorIntrinsicCost` also requires fixing?
> 
> Yep, I'll update it in a second. In practice, I don't think there are any intrinsics that take struct types, but maybe in the future it might become a problem.
> 
>> Would be good to hoist such invariant instructions
>> 
>>   %a = extractvalue { i64, i64 } %sv, 0
>>   %b = extractvalue { i64, i64 } %sv, 1
>>    
>> 
>> out of the loop before LV, or at-least have LV recognize them as uniform.
> 
> Yep, I can look into that as a follow up. LICM should hoist those things, but I think in general we cannot guarantee that all loop-invariant instructions are hoisted out before LoopVectorize (the test case came from a fuzzer). Do you think we should try to hoist them in LV? I would assume a later run of LICM will hoist them.

The thought was to check LICM, and check LV's uniform analysis.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1335

+  /// Returns true if an \p V that cannot be scalarized must be extracted
+  /// from a vector. If \p AssumeNotScalarAfterVec is true, skip
----------------
This is admittedly part of the original comment, but while we're here - "cannot be scalarized" >> "is expected to be vectorized" - every V can be scalarized. Furthermore, if we're unsure whether V will be vectorized or not it's safer to assume that it will be scalarized rather than vectorized, as the latter might not be possible. So if Scalars.find(VF) isn't set yet, which will cause isScalarAfterVectorization() to assert, better have needsExtract() return false.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5696
+    // this function is called for them via setCostBasedWideningDecision, before
+    // scalars have been collected.
+    Cost += TTI.getOperandsScalarizationOverhead(
----------------
Yes, there's a phase ordering issue here, and also one regarding isProfitableToScalarize() as mentioned above (which would be good to note somewhere). But the instruction(s) that may trigger an assert, being loads or stores, are the operands of I rather than I itself, right? In any case, suggest to have FilterOperands()/needsExtract() understand if they can use isScalarAfterVectorization() or else be conservative, as noted above. Perhaps make Filter[Extracting]Operands() a method, to be reused.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59995/new/

https://reviews.llvm.org/D59995