[LLVMbugs] [Bug 1226] NEW: scalarrepl should be able to scalarrepl aggregates with memcpy uses
bugzilla-daemon at cs.uiuc.edu
Sun Feb 25 12:35:26 PST 2007
http://llvm.org/bugs/show_bug.cgi?id=1226
Summary: scalarrepl should be able to scalarrepl aggregates with
memcpy uses
Product: libraries
Version: 1.0
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Scalar Optimizations
AssignedTo: unassignedbugs at nondot.org
ReportedBy: sabre at nondot.org
Consider:
#include <tr1/functional>
#include <algorithm>
void assign( long* variable, long v ) {
  std::transform( variable, variable + 1, variable,
                  std::tr1::bind( std::plus< long >(), 0L, v ) );
}
This compiles to a single store on x86, but to a whole ton of code on x86-64. The
temporary structs are larger on x86-64, so EmitAggregateCopy in llvm-gcc copies them with a memcpy
instead of scalar transfers. That memcpy later blocks scalarrepl from promoting the structs,
causing much worse codegen:
__Z6assignRll: # x86-32
movl 8(%esp), %eax
movl 4(%esp), %ecx
movl %eax, (%ecx)
ret
__Z6assignRll: # x86-64
subq $88, %rsp
movb $0, 64(%rsp)
movq $0, 72(%rsp)
movq %rsi, 80(%rsp)
movq %rsi, 48(%rsp)
movq 72(%rsp), %rax
movq %rax, 40(%rsp)
movq 64(%rsp), %rax
movq %rax, 32(%rsp)
movq 40(%rsp), %rax
movq %rax, 8(%rsp)
movq 48(%rsp), %rax
movq %rax, 16(%rsp)
movq 32(%rsp), %rax
movq %rax, (%rsp)
movq 16(%rsp), %rax
addq 8(%rsp), %rax
movq %rax, (%rdi)
addq $88, %rsp
ret
-Chris