[PATCH] D37648: [SLPVectorizer] Fix PR21780 Expansion of 256 bit vector loads fails to fold into shuffles

Philip Reames via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 26 15:31:23 PDT 2017


reames added inline comments.


================
Comment at: test/Transforms/SLPVectorizer/X86/pr21780.ll:22
+;
+  call void @llvm.phantom.mem.p0f64.p0f64(double* %ptr, double* null, i64 3)
+  %arrayidx2 = getelementptr inbounds double, double* %ptr, i64 2
----------------
I think something might be missing here.  You're forming a 4-wide load, but you've only proven dereferenceability for offsets 0, 1, and 3 (i.e. not 2).  How do we know it's safe to dereference offset 2, which lies between elements 1 and 3?
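
To make the concern concrete, here is a minimal sketch of the coverage condition I have in mind. It assumes we can summarize what has been proven as a set of element offsets from the base pointer. The helper name `wideLoadIsCovered` and its parameters are hypothetical, for illustration only; this is not an existing LLVM API and not how the patch is structured.

```cpp
#include <cstdint>
#include <set>

// Hypothetical helper (not an LLVM API): returns true only if every element
// offset read by a `width`-element load starting at element `first` is in
// the set of offsets already proven dereferenceable.
bool wideLoadIsCovered(const std::set<int64_t> &provenOffsets,
                       int64_t first, int64_t width) {
  for (int64_t i = first; i < first + width; ++i) {
    // A single unproven offset inside the range makes the wide load unsafe.
    if (provenOffsets.count(i) == 0)
      return false;
  }
  return true;
}
```

With proven offsets {0, 1, 3}, a 4-wide load starting at offset 0 fails this check because offset 2 is missing; it would pass only once offset 2 is also shown to be dereferenceable.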


https://reviews.llvm.org/D37648




