[PATCH] D36036: Supported interleaved byte load-pattern of stride:4 VF(8, 16, 32).
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 1 12:01:42 PDT 2017
RKSimon added a comment.
Please commit the tests with the current codegen first, so that this patch shows only the codegen diffs.
================
Comment at: lib/Analysis/VectorUtils.cpp:581
+ VectorType *VecTy = cast<VectorType>(V->getType());
+ assert(NumElements <= VecTy->getNumElements() && "Too many elements!");
+
----------------
Should this assert that (BeginIndex + NumElements) <= VecTy->getNumElements()?
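
To illustrate why the stricter assert matters, here is a minimal standalone sketch. It does not use LLVM types; `subrangeFits` and `countOnlyFits` are hypothetical helpers introduced only for this example, and `VecLen` stands in for `VecTy->getNumElements()`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the check currently in the patch:
// it looks only at the element count and ignores the start index.
bool countOnlyFits(std::size_t VecLen, std::size_t NumElements) {
  return NumElements <= VecLen;
}

// Hypothetical model of the suggested check: the whole subrange
// [BeginIndex, BeginIndex + NumElements) must lie inside the vector.
// Written with a subtraction so the sum cannot wrap around.
bool subrangeFits(std::size_t VecLen, std::size_t BeginIndex,
                  std::size_t NumElements) {
  return BeginIndex <= VecLen && NumElements <= VecLen - BeginIndex;
}
```

With an 8-element vector, a request for 8 elements starting at index 4 passes the count-only check but is rejected by the subrange check.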
================
Comment at: lib/Target/X86/X86ISelLowering.h:1453
void createUnpackShuffleMask(MVT VT, SmallVectorImpl<T> &Mask, bool Lo,
- bool Unary) {
- assert(Mask.empty() && "Expected an empty shuffle mask vector");
+ bool Unary, unsigned VecLen = 128,
+ unsigned NumEltsToUnpack = 1) {
----------------
Rename VecLen to VecLenInBits to make the unit clearer.
================
Comment at: lib/Target/X86/X86ISelLowering.h:1458
+
+ for (int i = 0; i < NumElts / NumEltsToUnpack; ++i) {
unsigned LaneStart = (i / NumEltsInLane) * NumEltsInLane;
----------------
```
for (int i = 0, e = NumElts / NumEltsToUnpack; i < e; ++i) {
```
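
For context, this form computes the loop bound once and stores it in an `int`, so `i < e` compares two values of the same type. A minimal standalone sketch follows; `groupStarts` is a hypothetical helper, not the function from the patch, and it assumes `NumElts` is an `int` and `NumEltsToUnpack` is `unsigned`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper that only demonstrates the suggested loop form:
// it returns the first element index of each group of NumEltsToUnpack.
std::vector<int> groupStarts(int NumElts, unsigned NumEltsToUnpack) {
  assert(NumEltsToUnpack != 0 && "Group size must be nonzero");
  std::vector<int> Starts;
  // e is evaluated once, before the first iteration, and has type int.
  for (int i = 0, e = NumElts / NumEltsToUnpack; i < e; ++i)
    Starts.push_back(i * static_cast<int>(NumEltsToUnpack));
  return Starts;
}
```

For example, `groupStarts(16, 4)` yields the indices 0, 4, 8 and 12.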
https://reviews.llvm.org/D36036