[llvm-commits] [llvm] r169791 - in /llvm/trunk: include/llvm/Target/ lib/CodeGen/SelectionDAG/ lib/Target/ARM/ lib/Target/Mips/ lib/Target/X86/ test/CodeGen/ARM/ test/CodeGen/X86/
Evan Cheng
evan.cheng at apple.com
Tue Dec 11 16:43:16 PST 2012
r169944
Evan
On Dec 11, 2012, at 1:48 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Mon, Dec 10, 2012 at 5:35 PM, Evan Cheng <evan.cheng at apple.com> wrote:
>>
>> On Dec 10, 2012, at 3:35 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>
>>> On Mon, Dec 10, 2012 at 3:21 PM, Evan Cheng <evan.cheng at apple.com> wrote:
>>>> Author: evancheng
>>>> Date: Mon Dec 10 17:21:26 2012
>>>> New Revision: 169791
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=169791&view=rev
>>>> Log:
>>>> Some enhancements for memcpy / memset inline expansion.
>>>> 1. Teach it to use overlapping unaligned load / store to copy / set the trailing
>>>> bytes. e.g. On x86, use two pairs of movups / movaps for 17 - 31 byte copies.
>>>> 2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
>>>> x86 and ARM.
>>>
>>> This won't work correctly on x86 if we don't have SSE2. (Loading an
>>> f64 into an x87 register is a lossy operation.)
>>
>> That should not happen with this patch.
>
> No?
>
> $ clang -S -o - -x c -m32 -msse -mno-sse2 - -O2 -march=corei7-avx
> #include <stdlib.h>
> #include <string.h>
> void f(void* a, void* b) {
>   memcpy(a,b,24);
> }
> ^D
> 	.section	__TEXT,__text,regular,pure_instructions
> 	.globl	_f
> 	.align	4, 0x90
> _f:                                    ## @f
> ## BB#0:                               ## %cond.end
> 	pushl	%ebp
> 	movl	%esp, %ebp
> 	movl	12(%ebp), %eax
> 	fldl	16(%eax)
> 	movl	8(%ebp), %ecx
> 	fstpl	16(%ecx)
> 	movups	(%eax), %xmm0
> 	movups	%xmm0, (%ecx)
> 	popl	%ebp
> 	ret
>
>
> 	.subsections_via_symbols
>
>
> -Eli