[llvm-bugs] [Bug 25261] New: [x86] hoist complex address generation common subexpressions?
via llvm-bugs
llvm-bugs at lists.llvm.org
Tue Oct 20 08:52:13 PDT 2015
https://llvm.org/bugs/show_bug.cgi?id=25261
Bug ID: 25261
Summary: [x86] hoist complex address generation common subexpressions?
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: spatel+llvm at rotateright.com
CC: llvm-bugs at lists.llvm.org
Classification: Unclassified
There's a potential optimization remaining from the test case in bug 20134:
void foo(int *a, int i) {
a[i] = a[i+1] + a[i+2];
}
After:
http://llvm.org/viewvc/llvm-project?view=revision&revision=250560
We generate:
00 movslq %esi, %rax
03 movl 0x4(%rdi,%rax,4), %ecx
07 addl 0x8(%rdi,%rax,4), %ecx
0b movl %ecx, (%rdi,%rax,4)
0e retq
We could hoist the common subexpression (%rdi,%rax,4) out of the load and store
addresses:
00 movslq %esi, %rax
03 leaq (%rdi,%rax,4), %rax
07 movl 0x4(%rax), %ecx
0a addl 0x8(%rax), %ecx
0d movl %ecx, (%rax)
0f retq
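For reference, the hoisted form corresponds roughly to computing the element
address once at the source level. The following is only a sketch:
foo_hoisted and check_equivalent are hypothetical names that do not appear in
the bug, and a compiler is free to fold the pointer back into the individual
addressing modes, so this shows the intended semantics rather than a way to
force the LEA.

```c
#include <assert.h>
#include <string.h>

/* Original test case from the report. */
void foo(int *a, int i) {
  a[i] = a[i + 1] + a[i + 2];
}

/* Hypothetical hand-hoisted variant: the base address a + i is formed once
   (the job the LEA does above) and reused for both loads and the store. */
void foo_hoisted(int *a, int i) {
  int *p = a + i;
  p[0] = p[1] + p[2];
}

/* Returns 1 if both variants leave identical array contents. */
int check_equivalent(void) {
  int x[4] = {0, 10, 20, 30};
  int y[4] = {0, 10, 20, 30};
  foo(x, 1);
  foo_hoisted(y, 1);
  return memcmp(x, y, sizeof x) == 0;
}
```

Both functions should behave identically; only the generated address
computations are expected to differ.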
This probably requires simulation and micro-benchmarking across
micro-architectures to decide if it's worthwhile. The LEA version appears to
increase code size by 1 byte (hand-hacked asm; I didn't verify that it's
correct) for this case, where the common subexpression is used 3 times. In the
general case, using an LEA may also require an additional register. Also,
whether the simpler address calculations actually provide a time savings may be
implementation-dependent.
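The 1-byte figure can be checked from the offsets in the two listings. This is
a small sketch of that arithmetic, assuming the listed offsets are accurate and
that retq encodes in 1 byte; listing_size, size_delta and RETQ_SIZE are
hypothetical names used only for illustration.

```c
#include <assert.h>

/* retq is a single-byte instruction, so a listing's total size is the
   offset of its final retq plus one. */
enum { RETQ_SIZE = 1 };

/* Total bytes in a listing whose last instruction (retq) starts at
   last_offset. */
int listing_size(int last_offset) {
  return last_offset + RETQ_SIZE;
}

/* Size difference between the LEA version (retq at 0x0f) and the
   current output (retq at 0x0e), using the offsets from the report. */
int size_delta(void) {
  return listing_size(0x0f) - listing_size(0x0e);
}
```

With the offsets above, the current output is 15 bytes and the LEA version is
16 bytes.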
The solution to this problem may be related to whatever is implemented for bug
24447.