[PATCH] D37687: [X86][LLVM]Expanding Supports lowerInterleaved{store|load}() in X86InterleavedAccess (VF64 stride 3)
Zvi Rackover via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 28 08:54:29 PDT 2017
zvi added inline comments.
================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:83
unsigned NumSubVecElems);
+ void concatSubVector(Value **Vec, ArrayRef<Instruction *> InVec,
+ unsigned VecElems);
----------------
Make these methods static const or even better move them out of the class
================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:186
// [0,1...,VF/2-1,VF/2+VF,VF/2+VF+1,...,2VF-1]
- if (DL.getTypeSizeInBits(VecTy) == 768) {
+ unsigned VecLength = DL.getTypeSizeInBits(VecTy);
+ if (VecLength == 768 || VecLength == 1536) {
----------------
To be consistent with the existing code, maybe 'VecWidth' is a better name.
================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:461
+ unsigned VecElems) {
+ SmallVector<uint32_t, 64> Concat;
+ if (VecElems == 16) {
----------------
Probably more efficient to use a static array and pass a partial chunk as needed:
```
static uint32_t Concat[] = {0, 1, ... 63};
makeArrayRef(Concat, 32);
```
Same for reorder below
================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:479
+
+ for (unsigned i = 32; i < VecElems; ++i)
+ Concat.push_back(i);
----------------
If we reach this point, can VecElems be other than 64?
================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:593
// shuffles.
static void genShuffleBland(MVT VT, SmallVectorImpl<uint32_t> &Mask,
SmallVectorImpl<uint32_t> &Out, int LowOffset,
----------------
When you get a chance, please make change Mask to an ArrayRef in a separate NFC commit
================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:625
+void X86InterleavedAccessGroup::reorderSubVector(
+ MVT VT, SmallVectorImpl<Value *> &TransposedMatrix, Value **Vec,
+ SmallVectorImpl<uint32_t> &VPShuf, unsigned VecElems, unsigned Stride) {
----------------
VPShuf and Vec should be ArrayRef
================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:653
+ TransposedMatrix[i] =
+ VecElems == 32
+ ? Temp[i]
----------------
Wouldn't this be better?
if (VecElems == 32) {
std::copy(Temp, Temp+3, TransposedMatrix);
return;
}
and handle VecElems > 32 here
https://reviews.llvm.org/D37687
More information about the llvm-commits
mailing list