[PATCH] D50665: [LV][LAA] Vectorize loop invariant values stored into loop invariant address

Thu Aug 23 12:08:33 PDT 2018

Ayal added inline comments.

================
Comment at: lib/Analysis/LoopAccessAnalysis.cpp:1871
+      return false;
+    return TheLoop->hasLoopInvariantOperands(cast<Instruction>(StoreVal));
+  };
----------------
anna wrote:
> Ayal wrote:
> > anna wrote:
> > > Ayal wrote:
> > > > Again, something LICM may have missed?
> > > yes, LICM misses this as well - see added test case in `inv_val_store_to_inv_address_conditional_inv`. 
> > Ahh, but a phi of invariant values is invariant iff the compares that decide which predecessor will reach the phi, are also invariant. In `inv_val_store_to_inv_address_conditional_inv` this holds because there `%cmp` determines which predecessor it'll be, and `%cmp` is invariant. In general Divergence Analysis is designed to provide this answer, as in D50433's `isUniform()`.
> yes, that's right. Note that this patch handles phis of invariant values based on either an invariant condition or a variant condition (see `inv_val_store_to_inv_address_conditional_diff_values_ic` where the phi result is based on a varying condition). 
> 
> The improved codegen and cost model handling is for predicated stores, where the block containing the invariant store is to be predicated. Today, we just handle this as a "predicated store" cost and generate the code gen accordingly. 
So the suggested `isLoopInvariantStoreValue` name is incorrect, as the store value may be variant. What's special about these variant values - why not handle any store value?

Yes, conditional stores to an invariant address will end up scalarized and predicated, i.e., with a branch-and-store per lane, which is quite inefficient. A masked scatter may work better there, until optimized by a single branch-and-store if any lane is masked-on (invariant stored value) or single branch-and-store of last masked-on lane (in/variant stored value).

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:5766
+  bool isLoopInvariantValueStored =
+      TheLoop->isLoopInvariant(SI->getValueOperand());
   return TTI.getAddressComputationCost(ValTy) +
----------------
Ayal wrote:
> Should be consistent and use the same `isLoopInvariantStoreValue()` noted above.
This is still inconsistent with the OR-operands-are-invariant above.

Repository:
  rL LLVM

https://reviews.llvm.org/D50665