[PATCH] D66688: [LoopVectorize] Leverage speculation safety to avoid masked.loads

Sun Sep 8 11:22:40 PDT 2019

Ayal added a comment.

This LGTM, just some minor nits.

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:20
 #include "llvm/Analysis/VectorUtils.h"
+#include "llvm/Analysis/ValueTracking.h"
 #include "llvm/IR/IntrinsicInst.h"
----------------
Ayal wrote:
> nit: lex ordering (the above admittedly sets an unordered example..)
ValueTracking.h < VectorUtils.h

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:918

+/// Return true if we can that the given load would access only dereferenceable
+/// memory, and be properly aligned on every iteration.  (i.e. does not require
----------------
/// Return true if we can [prove] that the given

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:928
+  auto *AddRec = dyn_cast<SCEVAddRecExpr>(SE.getSCEV(Ptr));
+  if (!AddRec || AddRec->getLoop() != L || !AddRec->isAffine())
+    return false;
----------------
What about loads from uniform addresses? If desired their unmasking can be delayed to a later patch, worth leaving a TODO.

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:935
+  APInt LoadSize(DL.getIndexTypeSizeInBits(Ptr->getType()),
+                   DL.getTypeStoreSize(LI->getType()));
+  // TODO: generalize to access patterns which don't touch every byte
----------------
indent

"LoadSize"? This refers to the memory size of a single element.

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:936
+                   DL.getTypeStoreSize(LI->getType()));
+  // TODO: generalize to access patterns which don't touch every byte
+  if (StepC != LoadSize)
----------------
don't touch every byte >> have gaps

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:942
+  if (!TC)
+    return false;
+
----------------
It would be great to handle symbolic trip counts known to be sufficiently small.

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:945
+  APInt Size(DL.getIndexTypeSizeInBits(Ptr->getType()),
+             TC*DL.getTypeStoreSize(LI->getType()));
+
----------------
"Size"? This refers to the entire memory interval accessed by all iterations of the loop.

Set Size = LoadSize * TC?

Handling gaps would imply (only?) setting Size accordingly, right? That may be useful for unmasking gathers / interleave-groups.

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:948
+  auto *StartS = dyn_cast<SCEVUnknown>(AddRec->getStart());
+  if (!StartS || !SE.isLoopInvariant(StartS, L))
+    return false;
----------------
StartS must be LoopInvariant, iiuc. ("All operands of an AddRec are required to be loop invariant.")

This method is currently employed only by the innermost loop vectorizer; if/when considered by outer loop vectorization, there will be more than a single loop to look after; i.e., StartS may also be an AddRec to check.

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:957
+    Align = DL.getABITypeAlignment(LI->getType());
+  // For the moment, restrict ourselves to the case where the access size is a
+  // multiple of the requested alignment and the base is aligned.  
----------------
For the moment - as in TODO?

the access size - of a single element

Do we (need to) check that "the base is aligned"?

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:968
+/// must be proven.
+/// TODO: Move to ValueTracking.h/cpp in a separate change
+static bool mustSuppressSpeculatation(const LoadInst &LI) {
----------------
Can first refactor ValueTracking.h/cpp's isSafeToSpeculativelyExecute().

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:969
+/// TODO: Move to ValueTracking.h/cpp in a separate change
+static bool mustSuppressSpeculatation(const LoadInst &LI) {
+  if (!LI.isUnordered())
----------------
mustSuppressSpecula[ta]tion

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:1007

-    for (Instruction &I : *BB)
-      if (auto *Ptr = getLoadStorePointerOperand(&I))
-        SafePointes.insert(Ptr);
+    // For a block which requires predication, a address may be safe to access
+    // in the loop w/o predication if we can prove dereferenceability facts
----------------
a[n] address

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:1009
+    // in the loop w/o predication if we can prove dereferenceability facts
+    // sufficient to ensure it'll never fault within the loop. For the moment,
+    // we restrict this to loads; stores are more complicated due to
----------------
For the moment - as in TODO?

================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:1020
+        continue;
+      SafePointes.insert(LI->getPointerOperand());
+    }
----------------
If DL is left to be computed by callee rather than caller, can simplify into:

```
  if (LI && !mustSuppressSpeculation(*LI) &&
      isDereferenceableAndAlignedInLoop(LI, TheLoop, SE, *DT))
    SafePointes.insert(LI->getPointerOperand());
```

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66688/new/

https://reviews.llvm.org/D66688