[PATCH] D59995: [LV] Exclude loop-invariant inputs from scalar cost computation.

Tue May 7 04:08:17 PDT 2019

fhahn marked 3 inline comments as done.
fhahn added a comment.

In D59995#1455952 <https://reviews.llvm.org/D59995#1455952>, @Ayal wrote:

> The culprit here is the assumption made by `TTI.getOperandsScalarizationOverhead(Operands, VF)` that all its Operands will be vectorized according to VF, and would thus require extraction to feed a scalarized/replicated user. But any such Operand might not get vectorized, and possibly must not get vectorized, e.g., due to an incompatible type as demonstrated by PR41294 and the testcase. In some cases an Operand will obviously not be vectorized, such as if it's loop-invariant or live-in. More generally, LV uses the following:
>
>   auto needsExtract = [&](Instruction *I) -> bool {
>     return TheLoop->contains(I) && !isScalarAfterVectorization(I, VF);
>   };
>   
>
> which would require passing not only `TheLoop` into `getScalarizationOverhead(I, VF, TTI)` but also the CM --- better turn this static function into a method of CM?

Done.

> Note that there's also `CM.isProfitableToScalarize(I, VF))`, but it relies on having computed costs, so difficult to use when computing the cost (of a User). Skipping it would only affect accuracy of resulting cost, considering Operands that can be vectorized but will not be due to profitability.
> 
> Fixing `getVectorCallCost` deserves another testcase.

Added a test case with call. But thinking about it again, it does not really test the issue. Not sure it is actually possible to test getVectorCallCost, as there are no vector call functions that take struct values?

> Seems like `getVectorIntrinsicCost` also requires fixing?

Yep, I'll update it in a second. In practice, I don't think there are any intrinsics that take struct types, but maybe in the future it might become a problem.

> Would be good to hoist such invariant instructions
> 
>   %a = extractvalue { i64, i64 } %sv, 0
>   %b = extractvalue { i64, i64 } %sv, 1
>    
> 
> out of the loop before LV, or at-least have LV recognize them as uniform.

Yep, I can look into that as a follow up. LICM should hoist those things, but I think in general we cannot guarantee that all loop-invariant instructions are hoisted out before LoopVectorize (the test case came from a fuzzer). Do you think we should try to hoist them in LV? I would assume a later run of LICM will hoist them.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59995/new/

https://reviews.llvm.org/D59995