[LLVMbugs] [Bug 6623] New: clang/llvm expands memcpy thus making the resulting code big

Mon Mar 15 12:01:17 PDT 2010

http://llvm.org/bugs/show_bug.cgi?id=6623

           Summary: clang/llvm expands memcpy thus making the resulting
                    code big
           Product: libraries
           Version: trunk
          Platform: PC
        OS/Version: FreeBSD
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: X86
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: rdivacky at freebsd.org
                CC: llvmbugs at cs.uiuc.edu

pes delta$ clang -c -std=c99 -fno-inline -Os boot2.c && nm -S boot2.o | grep "t fsread"

.... warnings ....

000000000000000e 0000000000000112 t fsread
pes delta$ gcc -c -std=c99 -fno-inline -Os boot2.c && nm -S boot2.o | grep "t fsread"

.... warnings ....

0000000000000000 00000000000000cd t fsread

The clang code is much larger: 0x112 (274) bytes versus 0xcd (205) bytes for
gcc. This is because gcc calls memcpy:

        imulq   $120, %rax, %rsi
        movl    $120, %edx
        movl    $dp2.1791, %edi
        addq    %rcx, %rsi
        call    memcpy

while clang unrolls that memcpy:

        movq    112, %rax
        movq    %rax, fsread.dp2+112(%rip)
        movq    104, %rax
        movq    %rax, fsread.dp2+104(%rip)
        movq    96, %rax
        movq    %rax, fsread.dp2+96(%rip)
        movq    88, %rax
        movq    %rax, fsread.dp2+88(%rip)
        movq    80, %rax
        movq    %rax, fsread.dp2+80(%rip)
        movq    72, %rax
        movq    %rax, fsread.dp2+72(%rip)
        movq    64, %rax
        movq    %rax, fsread.dp2+64(%rip)
        movq    56, %rax
        movq    %rax, fsread.dp2+56(%rip)
        movq    48, %rax
        movq    %rax, fsread.dp2+48(%rip)
        movq    40, %rax
        movq    %rax, fsread.dp2+40(%rip)
        movq    32, %rax
        movq    %rax, fsread.dp2+32(%rip)
        movq    24, %rax
        movq    %rax, fsread.dp2+24(%rip)
        movq    16, %rax
        movq    %rax, fsread.dp2+16(%rip)
        movq    0, %rax
        movq    8, %rcx
        movq    %rcx, fsread.dp2+8(%rip)
        movq    %rax, fsread.dp2(%rip)

I need the generated code to be as small as possible, and this expansion
prevents that. Can I turn this optimization off somehow?

Note that this happens at -O0 too.
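
For reference, here is a minimal sketch of source that should exercise the same
lowering. It assumes the copy in fsread is a plain assignment of a 120-byte
struct; the type `struct rec` and the function `copy_rec` are hypothetical
stand-ins, and only the name dp2 and the 120-byte size come from the listings
above. Whether a given compiler version emits a memcpy call or inline moves
for it is exactly what would need checking.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type: 15 * 8 = 120 bytes, the size seen in the listings. */
struct rec {
    uint64_t w[15];
};

/* Compile-time check that the size assumption above actually holds. */
_Static_assert(sizeof(struct rec) == 120, "struct rec must be 120 bytes");

/* Destination object; the name is borrowed from the assembly listings. */
static struct rec dp2;

/*
 * Plain struct assignment from an array element. The compiler is free to
 * lower this either to a call to memcpy or to a run of inline load/store
 * pairs; the source does not dictate which.
 */
void copy_rec(const struct rec *tbl, size_t idx)
{
    dp2 = tbl[idx];
}
```

Compiling this file with -Os -S under each compiler and comparing the body of
copy_rec should show which lowering was chosen.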
