<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Flag -Oz produces larger binary than -Os"
   href="https://bugs.llvm.org/show_bug.cgi?id=46801">46801</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Flag -Oz produces larger binary than -Os
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: ARM
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>p.waydan@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org, smithp352@googlemail.com, Ties.Stuij@arm.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=23764" name="attach_23764" title="[llvm-dev] [ARM] Should Use Load and Store with Register Offset">attachment 23764</a> <a href="attachment.cgi?id=23764&action=edit" title="[llvm-dev] [ARM] Should Use Load and Store with Register Offset">[details]</a></span>
[llvm-dev] [ARM] Should Use Load and Store with Register Offset

While trying different memcpy implementations, I found that compiling the
following code with -Oz produces a larger binary than compiling it with -Os.

typedef unsigned int size_t;

void* memcpy(void* dst, const void* src, size_t len) {
    char* save = (char*)dst;
    while(--len != (size_t)(-1))
        *((char*)(dst + len)) = *((char*)(src + len));
    return save;
}

Both builds pass the same remaining options to clang:
-S --target=armv6m-none-eabi -fomit-frame-pointer

Output with -Os (10 instructions):
memcpy:
        push    {r4, lr}
        cmp     r2, #0
        beq     .LBB1_3
        subs    r3, r0, #1
        subs    r1, r1, #1
.LBB1_2:
        ldrb    r4, [r1, r2]
        strb    r4, [r3, r2]
        subs    r2, r2, #1
        bne     .LBB1_2
.LBB1_3:
        pop     {r4, pc}

Output with -Oz (13 instructions):
memcpy:
        push    {r4, r5, r7, lr}
        subs    r1, r1, #1
        movs    r3, #0
        mvns    r3, r3
.LBB1_1:
        cmp     r2, #0
        beq     .LBB1_3
        subs    r4, r2, #1
        ldrb    r5, [r1, r2]
        adds    r2, r0, r2
        strb    r5, [r2, r3]
        mov     r2, r4
        b       .LBB1_1
.LBB1_3:
        pop     {r4, r5, r7, pc}


The memcpy implementation above copies bytes starting at the highest address.
Interestingly, a similar implementation that copies bytes starting at the
lowest address is smaller with -Oz than with -Os.
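
For comparison, a low-address-first variant could look roughly like the sketch
below. This is only an illustration: the name copy_forward and the exact loop
shape are assumptions, not the code that was actually measured, and no
particular code size is claimed for it.

```c
/* Hypothetical low-to-high byte copy. The name copy_forward and the loop
   shape are assumptions for illustration, not the code that was measured. */
void *copy_forward(void *dst, const void *src, unsigned int len) {
    char *d = (char *)dst;
    const char *s = (const char *)src;
    /* Copy byte 0 first, then walk upward to byte len - 1. */
    for (unsigned int i = 0; i != len; ++i)
        d[i] = s[i];
    /* Like memcpy, return the destination pointer. */
    return dst;
}
```

Compiling this with the same options at -Os and at -Oz would be one way to
check whether the size difference depends on the copy direction.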

For reference, this code was compiled with clang and LLVM built from source
(commit 16a4350f76d2bead7af32617dd557d2ec096d2c5).</pre>
        </div>
      </p>


    </body>
</html>