<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - Flag -Oz produces larger binary than -Os"

   href="https://bugs.llvm.org/show_bug.cgi?id=46801">46801</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>Flag -Oz produces larger binary than -Os

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Backend: ARM

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>p.waydan@gmail.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org, smithp352@googlemail.com, Ties.Stuij@arm.com

          </td>

        </tr></table>

      <p>

        <div>

        <pre>Created <span class=""><a href="attachment.cgi?id=23764" name="attach_23764" title="[llvm-dev] [ARM] Should Use Load and Store with Register Offset">attachment 23764</a></span>

[llvm-dev] [ARM] Should Use Load and Store with Register Offset

While trying different memcpy implementations, I found that compiling the

following code with -Oz produces a larger binary than -Os (13 instructions

instead of 10 in the listings below).

typedef unsigned int size_t;

void* memcpy(void* dst, const void* src, size_t len) {

    char* save = (char*)dst;

    while(--len != (size_t)(-1))

        *((char*)(dst + len)) = *((char*)(src + len));

    return save;

}

Common compile options passed to clang:

-S --target=armv6m-none-eabi -fomit-frame-pointer

Output with -Os

memcpy:

        push    {r4, lr}

        cmp     r2, #0

        beq     .LBB1_3

        subs    r3, r0, #1

        subs    r1, r1, #1

.LBB1_2:

        ldrb    r4, [r1, r2]

        strb    r4, [r3, r2]

        subs    r2, r2, #1

        bne     .LBB1_2

.LBB1_3:

        pop     {r4, pc}

Output with -Oz

memcpy:

        push    {r4, r5, r7, lr}

        subs    r1, r1, #1

        movs    r3, #0

        mvns    r3, r3

.LBB1_1:

        cmp     r2, #0

        beq     .LBB1_3

        subs    r4, r2, #1

        ldrb    r5, [r1, r2]

        adds    r2, r0, r2

        strb    r5, [r2, r3]

        mov     r2, r4

        b       .LBB1_1

.LBB1_3:

        pop     {r4, r5, r7, pc}

The above memcpy implementation copies bytes starting at the high address.

Interestingly, when using a similar implementation which copies bytes starting

at the low address, -Oz reduces code size compared to -Os.
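
The low-address variant is not included in this report. As a rough sketch only,
assuming it is a plain ascending byte loop, it might look like the code below.
The names copy_low_first and copy_len_t are hypothetical and stand in for the
memcpy and size_t definitions used above; the exact code that was compiled may
differ.

```c
/* Hypothetical low-address-first variant; NOT the code from the report. */
typedef unsigned int copy_len_t;   /* stand-in for the report's size_t typedef */

void *copy_low_first(void *dst, const void *src, copy_len_t len) {
    char *d = (char *)dst;
    const char *s = (const char *)src;
    copy_len_t i;

    /* Copy one byte at a time, starting at the lowest address. */
    for (i = 0; i != len; ++i)
        d[i] = s[i];

    return dst;
}
```

Compiling a function of this shape with the same options, once with -Os and
once with -Oz, would be one way to compare against the listings above.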

For reference: this code was compiled with clang and llvm built from source

(commit 16a4350f76d2bead7af32617dd557d2ec096d2c5)</pre>

        </div>

      </p>


    </body>

</html>