[PATCH] D34601: [X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess.

Sun Jun 25 02:15:27 PDT 2017

RKSimon added a comment.

Would it be beneficial to work on a more general solution for the 128-bit subvector issue? Won't 16x16 and 8x32 (as well as all the 512-bit equivalents) still suffer?
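
For illustration only, here is a scalar sketch of what I mean by the 128-bit subvector issue. It assumes the problem is that 256-bit byte unpacks (e.g. VPUNPCKLBW on ymm registers) interleave within each 128-bit lane independently, so the result is not a full-width interleave. The type and function names are hypothetical and not part of the patch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec32 = std::array<std::uint8_t, 32>;

// Scalar model of a 256-bit VPUNPCKLBW (assumed behaviour, not patch code):
// within each 128-bit lane, interleave the low 8 bytes of a and b.
Vec32 unpackLowBytesPerLane(const Vec32 &a, const Vec32 &b) {
  Vec32 r{};
  for (std::size_t lane = 0; lane < 2; ++lane) {
    const std::size_t base = lane * 16; // first byte of this 128-bit lane
    for (std::size_t i = 0; i < 8; ++i) {
      r[base + 2 * i] = a[base + i];
      r[base + 2 * i + 1] = b[base + i];
    }
  }
  return r;
}
```

Note that the upper lane of the result starts from element 16 of the inputs, not element 8, so an extra cross-lane step is needed to get contiguous output. The same lane structure applies to the wider types, which is why I suspect 16x16, 8x32 and the 512-bit cases would hit it too.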

================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:367
   //   3. Concatenate the contiguous-vectors back into a wide vector.
   Value *WideVec = concatenateVectors(Builder, TransposedVectors);

----------------
Indentation looks off here; please run clang-format.

================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:381
+        VectorType::get(Type::getInt16Ty(Shuffles[0]->getContext()), 16)
+            ->getPointerTo();
+    Value *VecBasePtr = Builder.CreateBitCast(VecInst, VecTran);
----------------
If you pull Type::getInt16Ty(Shuffles[0]->getContext()) out into a local variable, you should be able to tidy all this up.

================
Comment at: test/CodeGen/X86/x86-interleaved-access.ll:143
+
+define void @interleaved_store_vf32_i8_stride4(<32 x i8> %x1, <32 x i8> %x2, <32 x i8> %x3, <32 x i8> %x4, <128 x i8>* %p) {
+; AVX2-LABEL: interleaved_store_vf32_i8_stride4:
----------------
Add this test to trunk with the current codegen first, so that this patch shows only the codegen diff.
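
As a reference for what the checks should end up storing, here is a minimal scalar sketch of a stride-4 interleaved store of four 32 x i8 vectors. It assumes the test body (not quoted here) uses the usual interleave mask, i.e. element i of input j lands at byte 4*i + j. The names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec32 = std::array<std::uint8_t, 32>;
using Vec128 = std::array<std::uint8_t, 128>;

// Scalar model of a factor-4 interleaved store (assumed semantics):
// out = x1[0], x2[0], x3[0], x4[0], x1[1], x2[1], ...
Vec128 interleaveStride4(const Vec32 &x1, const Vec32 &x2,
                         const Vec32 &x3, const Vec32 &x4) {
  Vec128 out{};
  for (std::size_t i = 0; i < 32; ++i) {
    out[4 * i + 0] = x1[i];
    out[4 * i + 1] = x2[i];
    out[4 * i + 2] = x3[i];
    out[4 * i + 3] = x4[i];
  }
  return out;
}
```

Whatever shuffle sequence the lowering picks, the 128 bytes written to %p should match this layout.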

================
Comment at: test/Transforms/InterleavedAccess/X86/interleavedStore.ll:1
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -mtriple=x86_64-pc-linux -mattr=+avx -mattr=+avx2 -interleaved-access -S | FileCheck %s
----------------
Add this file to trunk with the current codegen first, so that this patch shows only the diff in the checks.

================
Comment at: test/Transforms/InterleavedAccess/X86/interleavedStore.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -mtriple=x86_64-pc-linux -mattr=+avx -mattr=+avx2 -interleaved-access -S | FileCheck %s
+
----------------
You only need -mattr=+avx2; it implies -mattr=+avx.

https://reviews.llvm.org/D34601