[PATCH] D50665: [LV][LAA] Vectorize loop invariant values stored into loop invariant address

Wed Sep 12 05:27:11 PDT 2018

Ayal added a comment.

Best allow only a single store to an invariant address for now; until we're sure the last one to store is always identified correctly.

================
Comment at: include/llvm/Analysis/LoopAccessAnalysis.h:568
+  /// Checks existence of stores to invariant address inside loop.
+  /// If such stores exist, checks if those are stores of variant values.
+  bool hasVariantStoreToLoopInvariantAddress() const {
----------------
```
/// Checks existence of stores to invariant address inside loop.
/// If such stores exist, checks if those are stores of variant values.
```
can be updated and simplified into something like

```
/// If the loop has any store of a variant value to an invariant address, then return true, else return false.
```

================
Comment at: lib/Analysis/LoopAccessAnalysis.cpp:1869
+    HasVariantStoreToLoopInvariantAddress |=
+        (isUniform(Ptr) && !TheLoop->isLoopInvariant(ST->getValueOperand()));
+
----------------
How about `isUniform(Ptr) && !isUniform(ST->getValueOperand())` ?
Relying more consistently on SCEV to determine invariance of both address and stored value. Is there a reason for treating stored value more conservatively, checking its invariance by asking if it's outside the loop?

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:780
     ORE->emit(createMissedAnalysis("CantVectorizeStoreToLoopInvariantAddress")
               << "write to a loop invariant address could not be vectorized");
     LLVM_DEBUG(dbgs() << "LV: We don't allow storing to uniform addresses\n");
----------------
update the message as well: "write **of variant value** to a loop invariant address ..."

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:5890
+  bool isLoopInvariantStoreValue =
+      TheLoop->isLoopInvariant(SI->getValueOperand());
   return TTI.getAddressComputationCost(ValTy) +
----------------
Complementing the consistent use of isUniform rather than isLoopInvariant:
`bool isLoopInvariantStoreValue = Legal->isUniform(SI->getValueOperand());` ?
, similar to the way the address is checked to be uniform before calling this method below.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:6008
+        // TODO: Replicated loads and stores should be removed instead of
+        // relying on instcombine to remove the redundant loads and stores.
+        // Load: Scalar load + broadcast
----------------
Comment can be simplified to something like
        // TODO: Avoid replicating loads and stores instead of
        // relying on instcombine to remove them.

================
Comment at: test/Transforms/LoopVectorize/X86/invariant-store-vectorization.ll:16
+; CHECK:         store i32 %ntrunc, i32* %a
+; CHECK-NOT:     store i32  %ntrunc, i32* %a
+; CHECK:         %index.next = add i64 %index, 64
----------------
Have one space instead of two between i32 and %ntrunc on the check-not'd store. Easier to see that this checks for a single copy of the store, i.e., that instcombine eliminated all redundant copies. May want to comment what this test is designed to check.

================
Comment at: test/Transforms/LoopVectorize/X86/invariant-store-vectorization.ll:131
+}
+define void @inv_val_load_to_inv_address_conditional(i32* %a, i64 %n, i32* %b, i32 %k) {
+; CHECK-LABEL: @inv_val_load_to_inv_address_conditional(
----------------
inv_val_load_to?

================
Comment at: test/Transforms/LoopVectorize/invariant-store-vectorization.ll:11
+
+; all tests that check whether it is legal to vectorize the stores to invariant
+; address.
----------------
"that check whether" >> "check that"
("whether" usually comes with an "or not")

================
Comment at: test/Transforms/LoopVectorize/invariant-store-vectorization.ll:80
+; Both of these tests below are handled as predicated stores and have the cost model
+; as identifying these as predicated stores.
+
----------------
"as identifying these" >> "identify them"
Do we check what the cost model identifies?

================
Comment at: test/Transforms/LoopVectorize/invariant-store-vectorization.ll:132
+; For this case, we still vectorize, by generating predicated stores for the if
+; and else cases.
+; TODO: Code gen can be improved by select(extract(vec_cmp(b[i], k), VF - 1) == 1, a = ntrunc, a = k)
----------------
Hmm, multiple stores to the same invariant address did not trigger LAI memory dependence checks(?)
This may generate wrong code if the conditional scalarized stores are emitted in the wrong order, or if a pair of masked scatters are used.

================
Comment at: test/Transforms/LoopVectorize/invariant-store-vectorization.ll:151
+; CHECK-LABEL: pred.store.continue14:
+; CHECK:         [[EE1:%[a-zA-Z0-9.]+]] = extractelement <4 x i1> [[CMP]], i32 0
+define void @inv_val_store_to_inv_address_conditional_diff_values(i32* %a, i64 %n, i32* %b, i32 %k) {
----------------
good to continue CHECKing that EE1 is used in a branch that guards a store of %ntrunc to %a.

================
Comment at: test/Transforms/LoopVectorize/invariant-store-vectorization.ll:182
+
+; Instcombine'd version of above test. Now the store is no longer predicated.
+; TODO: We should be able to vectorize this loop.
----------------
Now the store/s is/are no longer of invariant value/s.

================
Comment at: test/Transforms/LoopVectorize/invariant-store-vectorization.ll:183
+; Instcombine'd version of above test. Now the store is no longer predicated.
+; TODO: We should be able to vectorize this loop.
+; CHECK-LABEL: inv_val_store_to_inv_address_conditional_diff_values_ic
----------------
.. once we support vectorizing stores of variant values to invariant addresses

================
Comment at: test/Transforms/LoopVectorize/invariant-store-vectorization.ll:219
+; is the latch block (which doesn't need to be predicated).
+; TODO: We should vectorize this loop.
+; CHECK-LABEL: inv_val_store_to_inv_address_conditional_inv
----------------
.. efficiently once divergence analysis identifies storeval as uniform

================
Comment at: test/Transforms/LoopVectorize/invariant-store-vectorization.ll:252
+
+; TODO: This loop can be vectorized even though it's a variant value being
+; stored into invariant address.
----------------
"even though it's" >> "once we support"

================
Comment at: test/Transforms/LoopVectorize/pr31190.ll:33
+; We can vectorize this loop because we are storing an invariant value into an
+; invariant address.
 ; CHECK-LABEL: @test
----------------
CHECK vectorized code emitted, or debug info stating it can be vectorized?

Repository:
  rL LLVM

https://reviews.llvm.org/D50665