[PATCH] D86025: [CodeGen] Respect libfunc availability when lowering intrinsic memcpy

Mon Aug 17 15:36:59 PDT 2020

efriedma added a comment.

In D86025#2222371 <https://reviews.llvm.org/D86025#2222371>, @mstorsjo wrote:

> In D86025#2222132 <https://reviews.llvm.org/D86025#2222132>, @efriedma wrote:
>
>> Conventionally, we've assumed that memcpy is available despite -fno-builtin etc., simply because generating it inline is too painful.  For example, consider what happens to your testcase if +strict-align is specified.  (This is following gcc, which has a similar expectation.)
>
> Right... The actual case I'm trying to fix is on x86_64, and there gcc doesn't seem to generate a memcpy lightly, but for aarch64 it does indeed seem to generate such a call despite -fno-builtin, just as you say. (The reason for having the testcase for aarch64 is that the fix needed separate cases for selectiondag/fastisel/globalisel, so testing it on aarch64 felt easiest.)

gcc will generate memcpy on x86 too; the threshold is just ridiculously high for -mtune=generic.  (I don't think this choice is affected by -fno-builtin.)
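
As a rough illustration of the kind of source that leads to a compiler-generated call, consider a plain struct assignment with no explicit memcpy anywhere.  This is only a sketch: the struct name and the 4096-byte size are made up for the example, and whether a call to memcpy is actually emitted depends on the target, the tuning, and the optimization level.

```c
#include <stddef.h>

/* Hypothetical type; the size is chosen only so that an inline
 * expansion of the copy would be unattractive on most targets. */
struct big {
    char bytes[4096];
};

/* No memcpy appears in the source.  The aggregate assignment is
 * something the compiler may lower to a call to memcpy, and in my
 * understanding -fno-builtin does not change that. */
void copy_big(struct big *dst, const struct big *src)
{
    *dst = *src;
}
```

Compiling this with -S for a couple of targets and looking for a call to memcpy in the output is a quick way to see where the threshold sits.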

I really don't want to go down the path of inlining all memcpys for CPU targets: the code for an efficient memcpy is way too big to reasonably inline, particularly on non-x86 targets.

Note that LLVM optimizations will currently avoid forming memcpy under -fno-builtin (mostly to avoid weird issues with implementing memcpy).  This doesn't affect clang.

>> If the requirement of having a symbol named "memcpy" is problematic, I guess we could look into making compiler-rt provide a symbol with equivalent semantics, but use a compiler-reserved name so the user can't disable it with -fno-builtin. For example, __aeabi_memcpy is defined on ARM.
>
> Actually, the issue isn't that memcpy is missing, but that it has a different calling convention. In wine, certain DLLs (which are built as a regular ELF .so or MachO .dylib) would only include windows mode headers, and use a memcpy function with windows calling conventions. As long as memcpy is called explicitly, it ends up fine, but for the cases when the compiler backend invents memcpy calls, they end up using the platform's default calling convention.
>
> This is mostly an issue on x86_64, where the calling conventions differ significantly - on other architectures the calling conventions are similar enough that there's no difference in a memcpy call.

You mean, the header explicitly declares a function named memcpy, and uses an attribute to explicitly override the calling convention so it's not the same as the "C" calling convention?  Messing with the backend's calling convention for memcpy isn't hard, but I'm not sure what conditions would be appropriate.  We already support `clang --target=x86_64-pc-win32-elf`.
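
To make the question concrete, here is a minimal sketch of the sort of declaration I have in mind.  It is an assumption about what such a header might do, not a copy of anything in wine: the function is deliberately named `win_memcpy` rather than memcpy, and the `WINAPI_CC` macro is hypothetical.  The `ms_abi` attribute itself is a real gcc/clang attribute on x86_64.

```c
#include <stddef.h>

/* Hypothetical macro: select the Windows x64 calling convention where
 * the attribute exists, and expand to nothing elsewhere. */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define WINAPI_CC __attribute__((ms_abi))
#else
#define WINAPI_CC
#endif

/* Hypothetical stand-in for a memcpy declared with a non-default
 * calling convention.  Explicit calls go through this declaration and
 * so use the convention given here. */
WINAPI_CC void *win_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}
```

An explicit call to a function declared this way uses the declared convention.  A call that the backend invents on its own never sees the declaration, so it uses the platform default.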

> In any case, a separate builtin function to generate a call to would probably help here as well, but doing that would break cases when building with clang but linking libgcc, no?

Yes.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86025/new/

https://reviews.llvm.org/D86025