[PATCH] D35638: A fix for bug33826
David Kreitzer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 19 12:30:35 PDT 2017
DavidKreitzer added inline comments.
================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:108
+ // If load size is less than Factor * ShuffleVecSize, transpose will not be
+ // not be profitable.
+ if (DL.getTypeSizeInBits(Inst->getType()) < Factor * ShuffleVecSize)
----------------
It is not just a question of profitability. If lowerIntoOptimizedSequence were called for a load instruction that is too small, it would make an incorrect transformation, because the decompose function would generate Factor loads, each of size ShuffleVecSize.
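To make the correctness concern concrete, here is a minimal standalone sketch of the arithmetic. It does not use any LLVM API; the helper names are hypothetical, and the sizes in the usage note (Factor of 4, 256-bit shuffle vectors) are illustrative assumptions rather than values taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM): total number of bits that would be
// read if the access were decomposed into Factor loads, each of size
// ShuffleVecSizeInBits.
std::uint64_t bitsReadByDecomposedLoads(unsigned Factor,
                                        std::uint64_t ShuffleVecSizeInBits) {
  assert(Factor > 0 && "interleave factor must be positive");
  return Factor * ShuffleVecSizeInBits;
}

// Hypothetical helper: returns true when the decomposed loads would read more
// bits than the original load did, i.e. when the transformation would be
// incorrect rather than merely unprofitable.
bool decomposedLoadsOverread(std::uint64_t LoadSizeInBits, unsigned Factor,
                             std::uint64_t ShuffleVecSizeInBits) {
  return bitsReadByDecomposedLoads(Factor, ShuffleVecSizeInBits) >
         LoadSizeInBits;
}
```

With these assumed numbers, a 512-bit load decomposed into 4 loads of 256 bits would read 1024 bits, twice what the original load accessed, so decomposedLoadsOverread(512, 4, 256) is true, while a 1024-bit load is covered exactly.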
I would also recommend a slightly different fix. Rather than checking the expected shuffle size for both the load and the store, I would check the "expected wide vector size". For loads, that means checking the type of the load. For stores, it means checking the type of the shuffle.
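A rough sketch of the suggested check, written as a standalone model rather than against the real LLVM classes. The AccessKind enum, the InterleavedAccess struct, and both function names are hypothetical, and comparing for exact equality with the expected size is an assumption about how the check would be phrased.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the two kinds of interleaved access.
enum class AccessKind { Load, Store };

// Hypothetical stand-in for the information the real pass would query from
// the instruction and its shuffles.
struct InterleavedAccess {
  AccessKind Kind;
  std::uint64_t LoadTypeSizeInBits;      // size of the load's type (loads)
  std::uint64_t StoreShuffleSizeInBits;  // size of the shuffle's type (stores)
};

// Pick the "wide vector" for the access: the type of the load for loads,
// the type of the shuffle for stores.
std::uint64_t wideVectorSizeInBits(const InterleavedAccess &A) {
  return A.Kind == AccessKind::Load ? A.LoadTypeSizeInBits
                                    : A.StoreShuffleSizeInBits;
}

// Assumption: the access is supported only when the wide vector has exactly
// the expected size.
bool hasExpectedWideVectorSize(const InterleavedAccess &A,
                               std::uint64_t ExpectedSizeInBits) {
  return wideVectorSizeInBits(A) == ExpectedSizeInBits;
}
```

For example, with an assumed expected size of 1024 bits, a load whose type is only 512 bits wide would be rejected, and a store fed by a 1024-bit shuffle would be accepted, using the same comparison in both cases.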
https://reviews.llvm.org/D35638