[PATCH] D26905: [SLP] Vectorize loads of consecutive memory accesses, accessed in non-consecutive (jumbled) way.
Michael Kuperstein via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 27 10:53:38 PST 2017
mkuper accepted this revision.
mkuper added a comment.
This revision is now accepted and ready to land.
LGTM, Thanks!
================
Comment at: test/Transforms/SLPVectorizer/X86/store-jumbled.ll:14
+; CHECK-NEXT: [[TMP2:%.*]] = load <4 x i32>, <4 x i32>* [[TMP1]], align 4
+; CHECK-NEXT: [[TMP3:%.*]] = shufflevector <4 x i32> [[TMP2]], <4 x i32> undef, <4 x i32> <i32 1, i32 3, i32 0, i32 2>
; CHECK-NEXT: [[INN_ADDR:%.*]] = getelementptr inbounds i32, i32* [[INN:%.*]], i64 0
----------------
Ok, so this is pretty much what I thought will happen. We shuffle both loads the same way, and then multiply, instead of multiplying and then shuffling. But this is probably fine - I hope InstCombine will pick up on this and combine it to a mul followed by a shuffle, if the masks match.
https://reviews.llvm.org/D26905
More information about the llvm-commits
mailing list