<html>
    <head>
      <base href="http://llvm.org/bugs/" />
    </head>
    <body><span class="vcard"><a class="email" href="mailto:spatel+llvm@rotateright.com" title="Sanjay Patel <spatel+llvm@rotateright.com>"> <span class="fn">Sanjay Patel</span></a>
</span> changed
              <a class="bz_bug_link 
          bz_status_RESOLVED  bz_closed"
   title="RESOLVED FIXED - poor codegen for unaligned fixed-size memcpy/memmove"
   href="http://llvm.org/bugs/show_bug.cgi?id=21541">bug 21541</a>
        <br>
             <table border="1" cellspacing="0" cellpadding="8">
          <tr>
            <th>What</th>
            <th>Removed</th>
            <th>Added</th>
          </tr>

         <tr>
           <td style="text-align:right;">Status</td>
           <td>NEW
           </td>
           <td>RESOLVED
           </td>
         </tr>

         <tr>
           <td style="text-align:right;">Resolution</td>
           <td>---
           </td>
           <td>FIXED
           </td>
         </tr></table>
      <p>
        <div>
            <b><a class="bz_bug_link 
          bz_status_RESOLVED  bz_closed"
   title="RESOLVED FIXED - poor codegen for unaligned fixed-size memcpy/memmove"
   href="http://llvm.org/bugs/show_bug.cgi?id=21541#c16">Comment # 16</a>
              on <a class="bz_bug_link 
          bz_status_RESOLVED  bz_closed"
   title="RESOLVED FIXED - poor codegen for unaligned fixed-size memcpy/memmove"
   href="http://llvm.org/bugs/show_bug.cgi?id=21541">bug 21541</a>
              from <span class="vcard"><a class="email" href="mailto:spatel+llvm@rotateright.com" title="Sanjay Patel <spatel+llvm@rotateright.com>"> <span class="fn">Sanjay Patel</span></a>
</span></b>
        <pre>16-byte codegen for btver2 fixed with:
<a href="http://llvm.org/viewvc/llvm-project?view=revision&amp;revision=222925">http://llvm.org/viewvc/llvm-project?view=revision&amp;revision=222925</a>

For the original code example in this bug report using clang built from
r223054, we now generate:

$ ./clang -O3 -fomit-frame-pointer -march=btver2 -c 21541.c -S -o -
    .section    __TEXT,__text,regular,pure_instructions
    .macosx_version_min 10, 10
    .globl    _copy32byte
    .align    4, 0x90
_copy32byte:                            ## @copy32byte
    .cfi_startproc
## BB#0:                                ## %entry
    vmovups    (%rsi), %ymm0
    vmovups    %ymm0, (%rdi)
    vzeroupper
    retq

------------------------------------------------------------------------

Resolving as fixed since the copy now uses 32-byte memops (one unaligned ymm
load and one unaligned ymm store).

I've seen some codegen variability between "vmovups" and "vmovdqu" that I can't
explain yet. Based on my testing and the docs, I don't expect any perf difference
between those two instructions for a simple copy, but if there is one, we
should open a new bug.</pre>
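        <pre>The original 21541.c is not reproduced in this comment, so the following is
only a hypothetical sketch of the kind of source that would exercise this path.
The name copy32byte and the 32-byte size come from the assembly above, and the
argument order (destination in %rdi, source in %rsi) is inferred from it. The
parameter types and the use of __builtin_memcpy (a clang/GCC builtin, chosen
here so the snippet needs no headers) are assumptions.

```c
/* Hypothetical reconstruction; the real test case lives in the bug report. */
void copy32byte(char *dst, const char *src)
{
    /* Fixed-size copy through char pointers: the compiler knows the length
       (32 bytes) but cannot assume any alignment for dst or src. */
    __builtin_memcpy(dst, src, 32);
}
```

Built with the same flags as above, a function like this would be expected to
lower to a single 32-byte load/store pair, though the exact instruction choice
(vmovups vs. vmovdqu) may vary.</pre>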
        </div>
      </p>
    </body>
</html>