[LLVMbugs] [Bug 1226] NEW: scalarrepl should be able to scalarrepl aggregates with memcpy uses
bugzilla-daemon at cs.uiuc.edu
Sun Feb 25 12:35:26 PST 2007
http://llvm.org/bugs/show_bug.cgi?id=1226
Summary: scalarrepl should be able to scalarrepl aggregates with
memcpy uses
Product: libraries
Version: 1.0
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Scalar Optimizations
AssignedTo: unassignedbugs at nondot.org
ReportedBy: sabre at nondot.org
Consider:
#include <tr1/functional>
#include <algorithm>
void assign( long* variable, long v ) {
  std::transform( variable, variable + 1, variable,
                  std::tr1::bind( std::plus< long >(), 0L, v ) );
}
This compiles to a single store on x86, but to a whole ton of code on x86-64. The
temporary structs are larger on x86-64, so EmitAggregateCopy in llvm-gcc copies them with a memcpy
instead of scalar transfers. That memcpy later blocks scalarrepl from promoting the structs,
causing much worse codegen:
__Z6assignRll: # x86-32
movl 8(%esp), %eax
movl 4(%esp), %ecx
movl %eax, (%ecx)
ret
__Z6assignRll: # x86-64
subq $88, %rsp
movb $0, 64(%rsp)
movq $0, 72(%rsp)
movq %rsi, 80(%rsp)
movq %rsi, 48(%rsp)
movq 72(%rsp), %rax
movq %rax, 40(%rsp)
movq 64(%rsp), %rax
movq %rax, 32(%rsp)
movq 40(%rsp), %rax
movq %rax, 8(%rsp)
movq 48(%rsp), %rax
movq %rax, 16(%rsp)
movq 32(%rsp), %rax
movq %rax, (%rsp)
movq 16(%rsp), %rax
addq 8(%rsp), %rax
movq %rax, (%rdi)
addq $88, %rsp
ret
-Chris