[PATCH] D21363: Strided Memory Access Vectorization
Ashutosh Nema via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 20 04:50:10 PDT 2016
ashutosh.nema added a comment.
Thanks Elena for looking into this RFC.
================
Comment at: lib/Target/X86/X86ISelLowering.h:686
@@ +685,3 @@
+ /// Returns maximum supported Interleave factor.
+ unsigned getMaxSupportedInterleaveFactor() const override { return 4; }
+
----------------
delena wrote:
> Why 4?
This may not be required. Ideally we should not impose any limit; things should be driven by costing.
Will check and remove this.
================
Comment at: lib/Target/X86/X86TargetTransformInfo.cpp:50
@@ +49,3 @@
+/// Offset 0, 3, 6, 9 required to fill vector register.
+/// So 2 vector load will be requied.
+/// NOTE: It assumes all iteration for a given stride holds common memory
----------------
delena wrote:
> But it depends on element size..
I did not understand this comment completely.
Are you pointing to cases where the vectorizer can go above the target-supported width?
E.g.
double foo(double *A, double *B, int n) {
  double sum = 0;
#pragma clang loop vectorize_width(16)
  for (int i = 0; i < n; ++i)
    sum += A[i] + 5;
  return sum;
}
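If the concern is instead that the "2 vector loads" figure in the quoted comment only holds for one element size, a rough sketch of the arithmetic may help. This is not the cost model from the patch. The helper name numVectorLoads and its parameters are hypothetical, and it assumes the accesses start at offset 0 and that each load fills one full vector register.

```cpp
#include <cassert>

// Hypothetical helper, not taken from the patch: number of full-width vector
// loads needed to cover VF strided elements when the first access is at
// element offset 0.
unsigned numVectorLoads(unsigned VF, unsigned Stride, unsigned ElemBits,
                        unsigned RegBits) {
  assert(VF > 0 && Stride > 0 && ElemBits > 0 && RegBits >= ElemBits);
  // How many elements fit in one vector register.
  unsigned ElemsPerReg = RegBits / ElemBits;
  // Elements spanned from the first to the last access, inclusive.
  // For offsets 0, 3, 6, 9 this is 10.
  unsigned Span = (VF - 1) * Stride + 1;
  // Round up to whole registers.
  return (Span + ElemsPerReg - 1) / ElemsPerReg;
}
```

Under these assumptions, offsets 0, 3, 6, 9 need 2 loads with 32-bit elements in a 256-bit register, but 3 loads with 64-bit elements in the same register, so the count does change with element size.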
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:1004
@@ +1003,3 @@
+/// i.e. arr[i*X], arr[(i+1) * X], arr[(i*X)+1] (where X is a constant stride)
+void StrideAccessInfo::analyzeStride(const Instruction *I) {
+ const LoadInst *Ld = dyn_cast<const LoadInst>(I);
----------------
delena wrote:
> getPtrStride() does something similar
Will check it.
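For reference, the kind of source loop that produces the access patterns named in the quoted comment could look like the sketch below. The function name sumStrided and the stride value 3 are made up for illustration and do not come from the patch.

```cpp
#include <cstddef>
#include <vector>

// Assumed constant stride, standing in for X in the quoted comment.
constexpr std::size_t X = 3;

// Reads arr[i*X] and arr[(i*X)+1] on each iteration.
// Requires arr.size() >= (n - 1) * X + 2 when n > 0.
long sumStrided(const std::vector<int> &arr, std::size_t n) {
  long sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += arr[i * X] + arr[i * X + 1];
  return sum;
}
```

Both loads advance by X elements per iteration, so they share one constant stride and differ only in their starting offset.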
Repository:
rL LLVM
http://reviews.llvm.org/D21363