[PATCH] D13757: [x86] promote 'add nsw' to a wider type to allow more combines
Sanjay Patel via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 14 16:51:40 PDT 2015
spatel created this revision.
spatel added reviewers: hfinkel, chandlerc, qcolombet, zansari.
spatel added a subscriber: llvm-commits.
Herald added a subscriber: aemerson.
The motivation for this patch starts with PR20134:
https://llvm.org/bugs/show_bug.cgi?id=20134
void foo(int *a, int i) {
a[i] = a[i+1] + a[i+2];
}
It seems better to produce this (14 bytes):
movslq %esi, %rsi
movl 0x4(%rdi,%rsi,4), %eax
addl 0x8(%rdi,%rsi,4), %eax
movl %eax, (%rdi,%rsi,4)
Rather than this (22 bytes):
leal 0x1(%rsi), %eax
cltq
leal 0x2(%rsi), %ecx
movslq %ecx, %rcx
movl (%rdi,%rcx,4), %ecx
addl (%rdi,%rax,4), %ecx
movslq %esi, %rax
movl %ecx, (%rdi,%rax,4)
But it wasn't clear to me where the fix(es) should go, so I tried several things: CodeGenPrepare, DAGCombiner, X86IselLowering, X86ISelDAGToDAG...and finally back to X86ISelLowering because that had the most effect for the least amount of patch. :)
I think the most basic problem (the first test case in the patch combines constants) could also be fixed in InstCombine, but it gets more complicated after that because we need to consider architecture and micro-architecture. For example, I don't think AArch64 sees any benefit from the more general transform because the ISA solves the sexting in hardware. Some x86 chips may not want to replace 2 ADD insts with 1 LEA, and there's an attribute for that: FeatureSlowLEA. But I suspect that doesn't go far enough or maybe it's not getting used when it should; I'm also not sure if FeatureSlowLEA should also mean "slow complex addressing mode".
FWIW, I see no perf differences on test-suite with this change running on AMD Jaguar, and I see only very small code size improvements when building clang and the LLVM tools with the patched compiler. It would be great if someone could try this patch on a recent Intel model to see if it makes any difference. We may want to limit this to optimizing for size and/or modify FeatureSlowLEA if this is a bad change for Intel big cores.
http://reviews.llvm.org/D13757
Files:
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/add-nsw-sext.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D13757.37416.patch
Type: text/x-patch
Size: 6434 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151014/5d64b0c1/attachment.bin>
More information about the llvm-commits
mailing list