[PATCH] D26590: [X86][SSE] Improve lowering of vXi64 multiply with known zero 32-bit halves
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Sun Nov 13 08:10:57 PST 2016
RKSimon created this revision.
RKSimon added reviewers: mkuper, craig.topper, spatel, andreadb.
RKSimon added a subscriber: llvm-commits.
RKSimon set the repository for this revision to rL LLVM.
vXi64 multiplication is lowered into three vpmuludq multiplies of the operands' upper/lower 32-bit halves.
If any of these halves is known to be zero, we can remove the corresponding multiplies. Although there was isBuildVectorAllZeros code that attempted this, I don't think it ever worked (except perhaps for constant-folded cases, which no longer seem to be tested).
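As a rough illustration of the decomposition, here is a scalar sketch of a single 64-bit lane. The helper names (pmuludq, mul64_full, mul64_zero_upper) are made up for this example and are not taken from the patch; pmuludq here only models the instruction's per-lane behaviour (low 32 bits of each operand multiplied into a 64-bit result).

```cpp
#include <cstdint>

// Scalar model of one 64-bit lane of (v)pmuludq: multiply the low 32 bits
// of each operand and keep the full 64-bit product. Hypothetical helper.
static uint64_t pmuludq(uint64_t a, uint64_t b) {
  return (a & 0xFFFFFFFFull) * (b & 0xFFFFFFFFull);
}

// General case: three multiplies.
//   a * b (mod 2^64) = aLo*bLo + ((aLo*bHi + aHi*bLo) << 32)
// The aHi*bHi term is shifted out entirely, so it is never computed.
uint64_t mul64_full(uint64_t a, uint64_t b) {
  uint64_t aHi = a >> 32;
  uint64_t bHi = b >> 32;
  uint64_t loLo = pmuludq(a, b);   // aLo * bLo
  uint64_t loHi = pmuludq(a, bHi); // aLo * bHi
  uint64_t hiLo = pmuludq(aHi, b); // aHi * bLo
  return loLo + ((loHi + hiLo) << 32);
}

// If the upper halves of both operands are known to be zero, the two cross
// terms vanish and a single multiply is enough.
uint64_t mul64_zero_upper(uint64_t a, uint64_t b) {
  return pmuludq(a, b);
}
```

When only one operand has a zero upper half, just one of the two cross terms drops out, leaving two multiplies.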
This requires additional X86ISD support in computeKnownBitsForTargetNode. So far I've only added support for X86ISD::VZEXT (VPMOVZX*), which helps the AVX2+ cases; I can add further support (X86 target shuffles and bit shifts) in future commits to help the SSE2-AVX1 cases.
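To show why known-bits information for a zero extension matters here, the following is a toy model only. ToyKnownBits, knownBitsOfZext32To64 and multipliesNeeded are hypothetical names and do not reflect LLVM's actual KnownBits API or the code in the patch. The sketch also only considers zero upper halves, which is a simplifying assumption.

```cpp
#include <cstdint>

// Toy stand-in for known-bits tracking on one 64-bit lane (hypothetical,
// not LLVM's KnownBits class).
struct ToyKnownBits {
  uint64_t Zero; // a set bit means that bit is known to be 0
  uint64_t One;  // a set bit means that bit is known to be 1
};

static const uint64_t UpperMask = 0xFFFFFFFF00000000ull;

// Zero-extending a 32-bit element to 64 bits keeps whatever is known about
// the low 32 bits and makes the upper 32 bits known zero.
ToyKnownBits knownBitsOfZext32To64(ToyKnownBits Src) {
  ToyKnownBits Out;
  Out.Zero = (Src.Zero & ~UpperMask) | UpperMask;
  Out.One = Src.One & ~UpperMask;
  return Out;
}

bool upperHalfKnownZero(ToyKnownBits K) {
  return (K.Zero & UpperMask) == UpperMask;
}

// Number of 32x32->64 multiplies still required for A * B, given what is
// known about the upper halves. Assumption: lower halves are not analysed.
int multipliesNeeded(ToyKnownBits A, ToyKnownBits B) {
  int Count = 1;                       // aLo * bLo
  if (!upperHalfKnownZero(B)) ++Count; // aLo * bHi
  if (!upperHalfKnownZero(A)) ++Count; // aHi * bLo
  return Count;
}
```

With both operands produced by a zero extension, this model reports a single multiply; with nothing known, it reports all three.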
Fix for PR30845
Repository:
rL LLVM
https://reviews.llvm.org/D26590
Files:
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/pmul.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D26590.77747.patch
Type: text/x-patch
Size: 8023 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161113/b2a9ebc1/attachment.bin>