[llvm-bugs] [Bug 46801] New: Flag -Oz produces larger binary than -Os

via llvm-bugs llvm-bugs at lists.llvm.org
Tue Jul 21 20:07:57 PDT 2020


https://bugs.llvm.org/show_bug.cgi?id=46801

            Bug ID: 46801
           Summary: Flag -Oz produces larger binary than -Os
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: ARM
          Assignee: unassignedbugs at nondot.org
          Reporter: p.waydan at gmail.com
                CC: llvm-bugs at lists.llvm.org, smithp352 at googlemail.com,
                    Ties.Stuij at arm.com

Created attachment 23764
  --> https://bugs.llvm.org/attachment.cgi?id=23764&action=edit
[llvm-dev] [ARM] Should Use Load and Store with Register Offset

While trying different memcpy implementations, I found that compiling the
following code with -Oz produces a larger binary than compiling it with -Os.

typedef unsigned int size_t;

void* memcpy(void* dst, const void* src, size_t len) {
    char* save = (char*)dst;
    while(--len != (size_t)(-1))
        *((char*)(dst + len)) = *((char*)(src + len));
    return save;
}

Both builds pass these common options to clang:
-S --target=armv6m-none-eabi -fomit-frame-pointer

Output with -Os:
memcpy:
        push    {r4, lr}
        cmp     r2, #0
        beq     .LBB1_3
        subs    r3, r0, #1
        subs    r1, r1, #1
.LBB1_2:
        ldrb    r4, [r1, r2]
        strb    r4, [r3, r2]
        subs    r2, r2, #1
        bne     .LBB1_2
.LBB1_3:
        pop     {r4, pc}

Output with -Oz:
memcpy:
        push    {r4, r5, r7, lr}
        subs    r1, r1, #1
        movs    r3, #0
        mvns    r3, r3
.LBB1_1:
        cmp     r2, #0
        beq     .LBB1_3
        subs    r4, r2, #1
        ldrb    r5, [r1, r2]
        adds    r2, r0, r2
        strb    r5, [r2, r3]
        mov     r2, r4
        b       .LBB1_1
.LBB1_3:
        pop     {r4, r5, r7, pc}


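The above memcpy implementation copies bytes starting at the high address.
The -Oz output is 13 instructions against 10 for -Os (26 versus 20 bytes,
since each of them is a 16-bit Thumb encoding). Interestingly, a similar
implementation which copies bytes starting at the low address gets smaller
with -Oz than with -Os.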
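
The low-address-first variant is not included in this report. A minimal
sketch of what such an implementation might look like is given here for
comparison; the name memcpy_forward and the exact loop shape are assumptions,
not the code that was actually compiled, so the generated assembly may differ
from what was observed.

```c
#include <stddef.h>

/* Hypothetical low-address-first copy. The function name and loop form are
   assumptions made for illustration; they are not taken from the report. */
void *memcpy_forward(void *dst, const void *src, size_t len) {
    char *d = (char *)dst;
    const char *s = (const char *)src;

    /* Walk upward from the lowest address, one byte per iteration. */
    for (size_t i = 0; i != len; ++i)
        d[i] = s[i];

    return dst;
}
```

Renaming it to memcpy and compiling it with the same options should allow a
like-for-like comparison of the -Os and -Oz output.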

For reference: this code was compiled with clang and llvm built from source
(commit 16a4350f76d2bead7af32617dd557d2ec096d2c5)
