[LLVMdev] Lowering "memcpy" intrinsic function on ARM using LDMIA/STMIA
Andrew Trick
atrick at apple.com
Thu Feb 10 22:07:53 PST 2011
On Feb 9, 2011, at 5:02 AM, Vasiliy Korchagin wrote:
> LLVM emits code for "memcpy" on ARM as consecutive ldr/str instructions and later combines them into ldm/stm in a special pass after register allocation. But ldm/stm instructions require the registers to be in ascending order, which is often not the case after regalloc, so some ldr/str instructions are left uncombined. For example, this code:
>
> struct Foo { int a, b, c, d; };
> void CopyStruct(struct Foo *a, struct Foo *b) { *a = *b; }
>
> is compiled to:
>
> ldmia r1, {r2, r3, r12}
> ldr r1, [r1, #12]
> stmia r0, {r2, r3, r12}
> str r1, [r0, #12]
> bx lr
>
> I ran several tests, and regalloc always allocates at least one register out of ascending order.
>
> What are your ideas for overcoming this issue? Maybe LLVM should lower "memcpy" directly to ldm/stm, or exchange registers before combining ldr/str so that they are in ascending order, or somehow fix the register allocator?
Hi Vasiliy,
We should handle this better. I'm not sure how to guarantee that we can generate ldm/stm without regalloc support. Our only idea is to teach the new register allocator to do a much better job satisfying register hints. If you'd like to track this, feel free to file a bug.
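To make the constraint concrete: ldm/stm always pair the lowest-numbered register in the list with the lowest address, so a run of ldr/str can only be folded while the assigned register numbers keep increasing with the offset. Below is a minimal C sketch of that check. The function name and interface are hypothetical, made up for illustration; this is not the code of the actual load/store combining pass.

```c
#include <stddef.h>

/* Hypothetical helper, not LLVM code.
 * regs[i] is the register number assigned to the word at offset 4*i
 * (lowest address first). Returns how many leading transfers could be
 * folded into a single ldm/stm, i.e. the length of the longest strictly
 * ascending prefix of register numbers. */
size_t ldm_foldable_prefix(const unsigned *regs, size_t n) {
    if (n == 0)
        return 0;
    size_t len = 1;
    /* Stop at the first register that is not higher than its predecessor. */
    while (len < n && regs[len] > regs[len - 1])
        len++;
    return len;
}
```

For the assignment in the quoted output (r2, r3, r12, r1 for offsets 0, 4, 8, 12), this check stops after three registers, which matches the three-register ldmia/stmia followed by a separate ldr/str for r1.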
Thanks,
-Andy