[PATCH] D35638: A fix for bug33826

David Kreitzer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 19 12:30:35 PDT 2017


DavidKreitzer added inline comments.


================
Comment at: lib/Target/X86/X86InterleavedAccess.cpp:108
+    // If load size is less than Factor * ShuffleVecSize, transpose will
+    // not be profitable.
+    if (DL.getTypeSizeInBits(Inst->getType()) < Factor * ShuffleVecSize)
----------------
It is not just a question of profitability. If lowerIntoOptimizedSequence were called for a load instruction that is too small, it would make an incorrect transformation, because the decompose function would generate Factor loads, each of size ShuffleVecSize.

I would also recommend a slightly different fix. Rather than checking the expected shuffle size for both the load and the store, I would check the "expected wide vector size". For loads, that means checking the type of the load. For stores, it means checking the type of the shuffle.
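
To make the suggestion concrete, here is a minimal standalone sketch of the check. It does not use the LLVM API: `AccessInfo`, `wideVectorBits`, and `isWideEnough` are hypothetical names invented for illustration, and plain bit counts stand in for what the real code would obtain from `DL.getTypeSizeInBits`. The comparison mirrors the `<` test in the quoted hunk; whether the final patch should use that or a stricter test is left open.

```cpp
#include <cstdint>

// Hypothetical stand-in for what the real code would read off the IR.
struct AccessInfo {
  bool IsLoad;                   // true: interleaved load, false: interleaved store
  std::uint64_t LoadTypeBits;    // loads: size of the wide load's type
  std::uint64_t ShuffleTypeBits; // stores: size of the interleaving shuffle's type
};

// The "wide vector" is the load's type for loads and the shuffle's type
// for stores, as described in the comment above.
constexpr std::uint64_t wideVectorBits(const AccessInfo &A) {
  return A.IsLoad ? A.LoadTypeBits : A.ShuffleTypeBits;
}

// Decomposition emits Factor pieces of ShuffleVecBits each, so the wide
// vector must cover at least Factor * ShuffleVecBits bits. Anything
// smaller must be rejected for correctness, not just profitability.
constexpr bool isWideEnough(const AccessInfo &A, unsigned Factor,
                            std::uint64_t ShuffleVecBits) {
  return wideVectorBits(A) >= Factor * ShuffleVecBits;
}
```

For example, with Factor = 4 and a 64-bit shuffle vector, a 128-bit load is rejected: decomposing it would generate four 64-bit loads covering 256 bits, which is more than the original load reads.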



https://reviews.llvm.org/D35638




