[PATCH] D44269: [X86] Remove sse41 specific code from lowering v16i8 multiply
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 8 11:45:04 PST 2018
craig.topper created this revision.
craig.topper added reviewers: RKSimon, spatel.
With the SRAs removed from the SSE2 code as proposed in https://reviews.llvm.org/D44267, there doesn't appear to be any advantage to the SSE4.1 code. The punpcklbw and pmovsx instructions seem to have the same latency and throughput on most CPUs. The SSE4.1 code also requires moving the upper 64 bits into the lower 64 bits before the sign extend can be done. The punpckhbw in the SSE2 code can do better than that, since it reads the upper half directly.
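For reviewers who want the arithmetic spelled out, here is a portable scalar sketch of the SSE2-style sequence as I understand it: unpack each half to 16-bit lanes, multiply, mask, pack. It is a model, not the code in X86ISelLowering.cpp. The helper names (unpack_half, mul_low16, mask_and_pack, mul_v16i8, mul_reference) and the `junk` parameter are made up for illustration. `junk` stands in for whatever ends up in the upper byte of each 16-bit lane once the SRAs are gone. The low 8 bits of a 16-bit product depend only on the low 8 bits of the operands, so no sign extension is needed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V16i8 = std::array<std::uint8_t, 16>;
using V8i16 = std::array<std::uint16_t, 8>;

// Models punpcklbw / punpckhbw: widen the low or high 8 bytes to 16-bit lanes.
// The upper byte of each lane is filled with `junk` (hypothetical parameter)
// to show that its value does not matter for the final result.
V8i16 unpack_half(const V16i8& v, bool high_half, std::uint8_t junk) {
    V8i16 out{};
    const std::size_t base = high_half ? 8 : 0;
    for (std::size_t i = 0; i < 8; ++i)
        out[i] = static_cast<std::uint16_t>(v[base + i] | (junk << 8));
    return out;
}

// Models pmullw: keep the low 16 bits of each lane-wise product.
V8i16 mul_low16(const V8i16& a, const V8i16& b) {
    V8i16 out{};
    for (std::size_t i = 0; i < 8; ++i)
        out[i] = static_cast<std::uint16_t>(static_cast<std::uint32_t>(a[i]) * b[i]);
    return out;
}

// Models pand with 0x00FF followed by packuswb. After the mask every lane is
// in [0, 255], so the unsigned saturation in packuswb never changes a value.
V16i8 mask_and_pack(const V8i16& lo, const V8i16& hi) {
    V16i8 out{};
    for (std::size_t i = 0; i < 8; ++i) {
        const auto l = static_cast<std::uint16_t>(lo[i] & 0x00FF);
        const auto h = static_cast<std::uint16_t>(hi[i] & 0x00FF);
        assert(l <= 0xFF && h <= 0xFF);
        out[i] = static_cast<std::uint8_t>(l);
        out[i + 8] = static_cast<std::uint8_t>(h);
    }
    return out;
}

// Full v16i8 multiply built from the modeled instructions above.
V16i8 mul_v16i8(const V16i8& a, const V16i8& b, std::uint8_t junk) {
    const V8i16 lo = mul_low16(unpack_half(a, false, junk), unpack_half(b, false, junk));
    const V8i16 hi = mul_low16(unpack_half(a, true, junk), unpack_half(b, true, junk));
    return mask_and_pack(lo, hi);
}

// Reference: plain per-byte multiply, truncated to 8 bits.
V16i8 mul_reference(const V16i8& a, const V16i8& b) {
    V16i8 out{};
    for (std::size_t i = 0; i < 16; ++i)
        out[i] = static_cast<std::uint8_t>(a[i] * b[i]);
    return out;
}
```

Calling mul_v16i8 with different `junk` values should give the same bytes as mul_reference. That is the reason neither the SRA nor a pmovsx-style sign extend is required for correctness.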
https://reviews.llvm.org/D44269
Files:
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/combine-mul.ll
test/CodeGen/X86/pmul.ll
test/CodeGen/X86/vector-idiv-sdiv-128.ll
test/CodeGen/X86/vector-idiv-sdiv-256.ll
test/CodeGen/X86/vector-idiv-udiv-128.ll
test/CodeGen/X86/vector-idiv-udiv-256.ll
test/CodeGen/X86/vector-mul.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D44269.137624.patch
Type: text/x-patch
Size: 44904 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180308/b32a417e/attachment.bin>