[PATCH] D21363: Strided Memory Access Vectorization

Ashutosh Nema via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 20 04:50:10 PDT 2016


ashutosh.nema added a comment.

Thanks Elena for looking into this RFC.


================
Comment at: lib/Target/X86/X86ISelLowering.h:686
@@ +685,3 @@
+    /// Returns maximum supported Interleave factor.
+    unsigned getMaxSupportedInterleaveFactor() const override { return 4; }
+
----------------
delena wrote:
> Why 4?
This may not be required; ideally we should not impose any limit, and the choice should be driven by costing.
Will check and remove this.

================
Comment at: lib/Target/X86/X86TargetTransformInfo.cpp:50
@@ +49,3 @@
+/// Offset 0, 3, 6, 9 required to fill vector register.
+/// So 2 vector load will be requied.
+/// NOTE: It assumes all iteration for a given stride holds common memory
----------------
delena wrote:
> But it depends on element size..
I did not completely understand this comment.

Are you pointing to the case where the vectorizer can go above the target-supported width?
For example:
double foo(double *A, double *B, int n) {
  double sum = 0;
#pragma clang loop vectorize_width(16)
  for (int i = 0; i < n; ++i)
    sum += A[i] + 5;
  return sum;
}
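
Alternatively, if the point is that the number of vector loads changes with the element size, here is a minimal sketch of that dependence. It is only an illustration under stated assumptions: the helper name vectorLoadsForStride, its parameters, and the greedy counting are hypothetical and are not taken from the patch. It assumes each vector load can start at any element offset and covers regBits / elemBits consecutive elements.

```cpp
#include <cassert>

// Hypothetical helper (not from the patch): counts how many vector loads are
// needed to fetch `vf` elements at offsets 0, stride, 2*stride, ...
// Assumption: a load may start at any element offset and covers
// regBits / elemBits consecutive elements.
unsigned vectorLoadsForStride(unsigned stride, unsigned vf,
                              unsigned regBits, unsigned elemBits) {
  assert(elemBits != 0 && elemBits <= regBits && "element must fit in register");
  unsigned elemsPerReg = regBits / elemBits;
  unsigned loads = 0;
  unsigned i = 0; // index of the next strided element still to be fetched
  while (i < vf) {
    unsigned start = i * stride; // begin a new load at the next needed offset
    ++loads;
    // Skip every needed element that this load already covers.
    while (i < vf && i * stride < start + elemsPerReg)
      ++i;
  }
  return loads;
}
```

With stride 3 and 32-bit elements in a 128-bit register (offsets 0, 3, 6, 9), this counts 2 loads, matching the quoted comment. With 64-bit elements and the same stride and register width, each load covers only 2 elements, so fetching offsets 0, 3, 6, 9 takes 4 loads under the same assumptions.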

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:1004
@@ +1003,3 @@
+/// i.e. arr[i*X], arr[(i+1) * X], arr[(i*X)+1] (where X is a constant stride)
+void StrideAccessInfo::analyzeStride(const Instruction *I) {
+  const LoadInst *Ld = dyn_cast<const LoadInst>(I);
----------------
delena wrote:
> getPtrStride() does something similar
Will check it.
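
For reference, a minimal sketch of the kind of loop the quoted comment describes, i.e. an access of the form arr[i*X] with a constant stride X. The function name sumStrided and the value X = 3 are illustrative assumptions, not code from the patch.

```cpp
#include <cstddef>

// Hypothetical example (not from the patch): sums arr[i * X] for a constant
// stride X. The caller must ensure arr holds at least (n - 1) * X + 1 elements.
long sumStrided(const int *arr, std::size_t n) {
  constexpr std::size_t X = 3; // assumed constant stride
  long sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += arr[i * X]; // reads element offsets 0, 3, 6, 9, ...
  return sum;
}
```

For n = 4 the loop reads element offsets 0, 3, 6 and 9, which is the pattern mentioned in the X86TargetTransformInfo.cpp comment above.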


Repository:
  rL LLVM

http://reviews.llvm.org/D21363




