<html>
<head>
<base href="http://llvm.org/bugs/" />
</head>
<body><span class="vcard"><a class="email" href="mailto:spatel+llvm@rotateright.com" title="Sanjay Patel <spatel+llvm@rotateright.com>"> <span class="fn">Sanjay Patel</span></a>
</span> changed
<a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED FIXED - poor codegen for unaligned fixed-size memcpy/memmove"
href="http://llvm.org/bugs/show_bug.cgi?id=21541">bug 21541</a>
<br>
<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>What</th>
<th>Removed</th>
<th>Added</th>
</tr>
<tr>
<td style="text-align:right;">Status</td>
<td>NEW
</td>
<td>RESOLVED
</td>
</tr>
<tr>
<td style="text-align:right;">Resolution</td>
<td>---
</td>
<td>FIXED
</td>
</tr></table>
<p>
<div>
<b><a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED FIXED - poor codegen for unaligned fixed-size memcpy/memmove"
href="http://llvm.org/bugs/show_bug.cgi?id=21541#c16">Comment # 16</a>
on <a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED FIXED - poor codegen for unaligned fixed-size memcpy/memmove"
href="http://llvm.org/bugs/show_bug.cgi?id=21541">bug 21541</a>
from <span class="vcard"><a class="email" href="mailto:spatel+llvm@rotateright.com" title="Sanjay Patel <spatel+llvm@rotateright.com>"> <span class="fn">Sanjay Patel</span></a>
</span></b>
<pre>16-byte codegen for btver2 fixed with:
<a href="http://llvm.org/viewvc/llvm-project?view=revision&revision=222925">http://llvm.org/viewvc/llvm-project?view=revision&revision=222925</a>
For the original code example in this bug report using clang built from
r223054, we now generate:
$ ./clang -O3 -fomit-frame-pointer -march=btver2 -c 21541.c -S -o -
.section __TEXT,__text,regular,pure_instructions
.macosx_version_min 10, 10
.globl _copy32byte
.align 4, 0x90
_copy32byte: ## @copy32byte
.cfi_startproc
## BB#0: ## %entry
vmovups (%rsi), %ymm0
vmovups %ymm0, (%rdi)
vzeroupper
retq
------------------------------------------------------------------------
Resolving as fixed since we're using 32-byte memops now.
I've seen some codegen variability between "vmovups" and "vmovdqu" that I can't
explain yet. Judging by my testing and the docs, I don't expect any perf
difference between those two instructions for a simple copy, but if there is
one, we should open a new bug.</pre>
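<p>For reference, here is a minimal sketch of the kind of function that could produce the listing above. The bug's original 21541.c is not reproduced in this comment, so the signature and body are assumptions; only the name copy32byte comes from the assembly. The sketch uses the compiler builtin __builtin_memcpy (available in clang and GCC) so that it needs no headers; the original test case may well have called plain memcpy instead.</p>
<pre>
```c
/* Hypothetical reconstruction: only the name copy32byte appears in the
   assembly listing; the signature and body are assumptions. */
void copy32byte(void *dst, const void *src)
{
    /* Fixed-size 32-byte copy with no alignment guarantee on either
       pointer, so the backend has to pick unaligned memory ops. */
    __builtin_memcpy(dst, src, 32);
}
```
</pre>
<p>Under these assumptions, building such a file with the command line shown above should give a single unaligned 32-byte load and store pair, with dst arriving in %rdi and src in %rsi.</p>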
</div>
</p>
</body>
</html>