[llvm-commits] [llvm] r169791 - in /llvm/trunk: include/llvm/Target/ lib/CodeGen/SelectionDAG/ lib/Target/ARM/ lib/Target/Mips/ lib/Target/X86/ test/CodeGen/ARM/ test/CodeGen/X86/
Eli Friedman
eli.friedman at gmail.com
Tue Dec 11 13:48:04 PST 2012
On Mon, Dec 10, 2012 at 5:35 PM, Evan Cheng <evan.cheng at apple.com> wrote:
>
> On Dec 10, 2012, at 3:35 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>
>> On Mon, Dec 10, 2012 at 3:21 PM, Evan Cheng <evan.cheng at apple.com> wrote:
>>> Author: evancheng
>>> Date: Mon Dec 10 17:21:26 2012
>>> New Revision: 169791
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=169791&view=rev
>>> Log:
>>> Some enhancements for memcpy / memset inline expansion.
>>> 1. Teach it to use overlapping unaligned load / store to copy / set the trailing
>>> bytes. e.g. on x86, use two pairs of movups / movaps for 17 - 31 byte copies.
>>> 2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
>>> x86 and ARM.
>>
>> This won't work correctly on x86 if we don't have SSE2. (Loading an
>> f64 into an x87 register is lossy: e.g. fld quiets signaling NaNs, so
>> the bytes stored back can differ from the bytes loaded.)
>
> That should not happen with this patch.
No?
$ clang -S -o - -x c -m32 -msse -mno-sse2 - -O2 -march=corei7-avx
#include <stdlib.h>
#include <string.h>
void f(void* a, void* b) {
  memcpy(a,b,24);
}
^D
        .section        __TEXT,__text,regular,pure_instructions
        .globl  _f
        .align  4, 0x90
_f:                                     ## @f
## BB#0:                                ## %cond.end
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %eax
        fldl    16(%eax)
        movl    8(%ebp), %ecx
        fstpl   16(%ecx)
        movups  (%eax), %xmm0
        movups  %xmm0, (%ecx)
        popl    %ebp
        ret

.subsections_via_symbols
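
The fldl/fstpl pair above is exactly the lossy round trip: fldl of a
signaling NaN quiets it (a masked invalid-operation exception), so the
eight bytes stored back differ from the eight bytes loaded. A minimal
standalone sketch of the effect, assuming a 32-bit x86 build without
SSE2 and default (masked) FP exceptions; the file name and flags here
are illustrative:

/* Build as: clang -m32 -mno-sse2 -O0 quiet.c */
#include <stdint.h>
#include <stdio.h>

int main(void) {
    union { uint64_t u; double d; } in, out;
    in.u = 0x7ff0000000000001ULL;   /* a signaling-NaN bit pattern */
    volatile double v = in.d;       /* fldl / fstpl round trip #1  */
    out.d = v;                      /* fldl / fstpl round trip #2  */
    /* fldl quiets the SNaN (sets bit 51), so out.u comes back as
       0x7ff8000000000001 instead of the original bytes. */
    printf("in  = %016llx\nout = %016llx\n",
           (unsigned long long)in.u, (unsigned long long)out.u);
    return 0;
}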
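
For reference, item 1 of the commit log (overlapping unaligned
operations for the trailing bytes) amounts to the following C sketch.
copy17to31 is an illustrative name, not code from the patch, and the
memcpy calls stand in for the movups load / store pairs the backend
emits:

#include <string.h>

/* Copy n bytes, 17 <= n <= 31, with two 16-byte unaligned block
   moves instead of a shrinking ladder of 8/4/2/1-byte tail copies.
   The second block starts at n - 16, so it overlaps the first
   whenever n < 32; re-copying a few middle bytes is harmless. */
static void copy17to31(char *dst, const char *src, size_t n) {
    memcpy(dst, src, 16);                    /* bytes [0, 16)   */
    memcpy(dst + n - 16, src + n - 16, 16);  /* bytes [n-16, n) */
}

In the 24-byte repro above, a second, overlapping movups pair covering
bytes 8 through 23 could have handled the tail without touching the
x87 stack at all.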
-Eli