[PATCH] D34601: [X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess.

Tue Jun 27 03:18:55 PDT 2017

dorit added a comment.

Just a few minor comments. Looks good to me otherwise, but should give @RKSimon a chance to see if he's happy with your responses (and maybe also another day to give @Farhana/@DavidKreitzer a chance to comment).

================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:105

   // Currently, lowering is supported for 4-element vectors of 64 bits on AVX.
   uint64_t ExpectedShuffleVecSize;
----------------
Should this comment be updated?

================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:180
+//  shufle(vec1,vec2,{NumElement/2,NumElement/2+1,NumElement/2+2...,NumElement-1,
+//                    NumElement-NumElement/2,NumElement-NumElement/2+1...,2*NumElement-1})
+//  = concat(high_half(VEC1),high_half(VEC2))
----------------
80-column overflow...

(and while you're fixing this, maybe use the same capitalization in naming of arguments (i.e. -  either "vec1" or "VEC1" form but not both)). 

================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:419
+// 1. stride4 x 64bit elements with vector factor 4 on AVX
+// 2. stride4 x 8bit elements with vector factor 32 on AVX2
 bool X86TargetLowering::lowerInterleavedStore(StoreInst *SI,
----------------
Have you tried enabling this also for AVX? (I understand if not, because with the current cost numbers that TTI returns for interleaved accesses on AVX we'll probably determine it's not worth vectorizing... so that may need to come along with an update of getInterleavedMemoryOpCost -- maybe at least a TODO comment is needed).

https://reviews.llvm.org/D34601