[PATCH] D34005: [CGP / PowerPC] avoid multi-block overhead for simple memcmp expansion
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 7 11:08:25 PDT 2017
spatel created this revision.
Herald added a subscriber: mcrosier.
The test diff for PowerPC is minimal, but for x86 there's a substantial difference, because branches are assumed cheap there and SelectionDAG (SDAG) can't optimize across basic blocks. Instead of this:
_cmp_eq8:
movq (%rdi), %rax
cmpq (%rsi), %rax
je LBB23_1
## BB#2: ## %res_block
movl $1, %ecx
jmp LBB23_3
LBB23_1:
xorl %ecx, %ecx
LBB23_3: ## %endblock
xorl %eax, %eax
testl %ecx, %ecx
sete %al
retq
We get this:
_cmp_eq8:
movq (%rdi), %rcx
xorl %eax, %eax
cmpq (%rsi), %rcx
sete %al
retq
And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall(). If this looks right, then I just need to confirm that vector-sized expansion will work from here, and then we can enable CGP memcmp() expansion for x86. I.e., we'll bypass the power-of-2 special cases currently optimized in SDAG, because we can lower the IR produced here optimally.
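For readers without the test file at hand, here is a rough C-level sketch of the pattern being discussed. It is an illustration only, not code from the patch: the body of cmp_eq8 is assumed from the test name and the assembly above, and cmp_eq8_expanded is a hypothetical name for what a single-block expansion computes (one load per operand and one compare, with no control flow).

```c
#include <stdint.h>
#include <string.h>

/* Assumed source-level shape of the test: the memcmp() result is only
 * compared against zero, so its sign and magnitude do not matter. */
int cmp_eq8(const void *x, const void *y) {
    return memcmp(x, y, 8) == 0;
}

/* Hypothetical hand-written equivalent of a single-block expansion:
 * load 8 bytes from each pointer and compare them as integers.
 * memcpy() is used for the loads to avoid alignment and aliasing
 * problems; compilers typically turn each call into a plain load. */
int cmp_eq8_expanded(const void *x, const void *y) {
    uint64_t a, b;
    memcpy(&a, x, sizeof a);
    memcpy(&b, y, sizeof b);
    return a == b;
}
```

Because only equality is tested, no byte swap is needed on little-endian targets: two 8-byte values are equal exactly when all of their bytes are equal, whatever the byte order.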
https://reviews.llvm.org/D34005
Files:
lib/CodeGen/CodeGenPrepare.cpp
test/CodeGen/PowerPC/memCmpUsedInZeroEqualityComparison.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D34005.101780.patch
Type: text/x-patch
Size: 5256 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170607/e01c7c3f/attachment.bin>