[PATCH] D52081: [InstCombine] do not expand 8 byte memcpy if optimising for minsize
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 14 04:51:32 PDT 2018
lebedev.ri added a comment.
In https://reviews.llvm.org/D52081#1234686, @SjoerdMeijer wrote:
> Thanks both, those are fair points.
>
> For a bit more context, this is the problem that I am trying to solve:
>
> void foo (char *A, char *B) {
> memcpy(A, B, 8);
> }
>
>
> compiled with `-Oz -mno-unaligned-access` this results in this disaster:
>
> ldrb r3, [r1]
> ldrb r4, [r1, #1]
> ldrb r5, [r1, #2]
> ldrb r6, [r1, #3]
> ldrb r1, [r1, #5]
> ldrb lr, [r2, #4]!
> ldrb.w r12, [r2, #2]
> ldrb r2, [r2, #3]
> strb r1, [r0, #5]
> strb r6, [r0, #3]
> strb r5, [r0, #2]
> strb r4, [r0, #1]
> strb r3, [r0]
> strb lr, [r0, #4]!
> strb r2, [r0, #3]
> strb.w r12, [r0, #2]
> ldr r11, [sp], #4
>
>
> but forgetting about this no-unaligned case, we see that with alignment support the code bloat is already there:
>
> ldr r2, [r1]
> ldr r1, [r1, #4]
> str r1, [r0, #4]
> str r2, [r0]
> bx lr
>
>
> So, for the decision making here in InstCombine, which is mostly target independent at the moment, I would like to ignore the whole aligned/unaligned business. And what I want to generate is of course just this:
>
> movs r2, #8
> b __aeabi_memcpy
Are you sure you shouldn't be fixing this in the backend?
> Now, surprisingly, this is also what we generate for X86 and AArch64 with `-Oz`, whereas we would perhaps expect a load and a store on these 64-bit architectures? I don't know why that is not happening, if there is a reason for, and I need to look into that.
>
> Either way, this patch generates the same code, and is consistent with that. And I think the hard coding of size > 4 is mostly inline with the some of these checks already there.
https://reviews.llvm.org/D52081
More information about the llvm-commits
mailing list