[LLVMbugs] [Bug 21541] poor codegen for unaligned fixed-size memcpy/memmove

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Mon Dec 1 11:13:23 PST 2014


Sanjay Patel <spatel+llvm at rotateright.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #16 from Sanjay Patel <spatel+llvm at rotateright.com> ---
16-byte codegen for btver2 fixed with:

For the original code example in this bug report using clang built from
r223054, we now generate:

$ ./clang -O3 -fomit-frame-pointer -march=btver2 -c 21541.c -S -o -
    .section    __TEXT,__text,regular,pure_instructions
    .macosx_version_min 10, 10
    .globl    _copy32byte
    .align    4, 0x90
_copy32byte:                            ## @copy32byte
## BB#0:                                ## %entry
    vmovups    (%rsi), %ymm0
    vmovups    %ymm0, (%rdi)


Resolving as fixed since we're using 32-byte memops now. 
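
For readers without the attachment at hand, here is a minimal sketch of the
kind of function being compiled above. This is an assumption: 21541.c itself
is not reproduced in this message, so the signature and body below are a
hypothetical reconstruction inferred only from the symbol name _copy32byte and
the bug title (unaligned fixed-size memcpy).

```c
#include <string.h>

/* Hypothetical stand-in for the test case in 21541.c (not the original
 * source). Copies exactly 32 bytes; neither pointer is assumed to be
 * aligned, so the compiler must use unaligned loads and stores. */
void copy32byte(void *dst, const void *src)
{
    memcpy(dst, src, 32);
}
```

Because the size is a compile-time constant, the compiler is free to expand the
memcpy call inline rather than call the library routine, which is what the
single 32-byte load/store pair in the listing above reflects.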

I've seen some codegen variability between "vmovups" and "vmovdqu" that I can't
explain yet. Based on either my testing or the docs, I don't expect any perf
difference between those two instructions for a simple copy, but if there is
one, we should open a new bug.
