[PATCH] D37648: [SLPVectorizer] Fix PR21780 Expansion of 256 bit vector loads fails to fold into shuffles
Philip Reames via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 26 15:31:23 PDT 2017
reames added inline comments.
================
Comment at: test/Transforms/SLPVectorizer/X86/pr21780.ll:22
+;
+ call void @llvm.phantom.mem.p0f64.p0f64(double* %ptr, double* null, i64 3)
+ %arrayidx2 = getelementptr inbounds double, double* %ptr, i64 2
----------------
I think something might be missing here. You're forming a 4x wide load, but you've only proven dereferenceability for offsets 0, 1, and 3 (i.e. not 2). How do we know it's safe to dereference the element at offset 2, which sits between elements 1 and 3?
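
To make the concern concrete, here is a minimal Python sketch of the coverage check implied above. It is not LLVM code: the helper names `missing_offsets` and `wide_load_is_safe` are hypothetical, and it assumes dereferenceability is tracked per element offset from `%ptr`.

```python
def missing_offsets(proven, start, width):
    """Return the element offsets in [start, start + width) that are
    not in the set of offsets proven dereferenceable."""
    proven = set(proven)
    return [i for i in range(start, start + width) if i not in proven]


def wide_load_is_safe(proven, start, width):
    """A contiguous wide load is only safe (under this simplified model)
    if every element it touches has been proven dereferenceable."""
    return not missing_offsets(proven, start, width)


# Offsets proven dereferenceable in the test, per the comment above.
proven_offsets = [0, 1, 3]

print(missing_offsets(proven_offsets, 0, 4))    # [2]
print(wide_load_is_safe(proven_offsets, 0, 4))  # False
```

Under this model the 4-wide load starting at offset 0 is rejected because offset 2 is uncovered; it would be accepted only if something else proves that element dereferenceable.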
https://reviews.llvm.org/D37648