[PATCH] D86025: [CodeGen] Respect libfunc availability when lowering intrinsic memcpy

Martin Storsjö via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 18 01:30:54 PDT 2020


mstorsjo added a comment.

In D86025#2222500 <https://reviews.llvm.org/D86025#2222500>, @arsenm wrote:

> The actual expansion code (at least in the non-loop case) is already implemented, it's just in the wrong place. Currently the memory intrinsics can be expanded by the code in CombinerHelper::optimizeMemcpy/optimizeMemmove/optimizeMemset, and treat this as an optimization. I think this should be treated as a legalization decision, and moved into LegalizerHelper. LegalizerHelper::lower would then dispatch to a lowerMemcpy(), which would look similar to optimizeMemcpy today. After that, more code would be needed for the loop cases. I'll probably get to this in a month or two, but if you want to take care of it that would be great.

Thanks for the pointers! I think I might hold off on this for now though - since the target where it's needed is x86_64, I only really need it in SelectionDAG, but I tried to fix it consistently in all cases at once.

In D86025#2222728 <https://reviews.llvm.org/D86025#2222728>, @efriedma wrote:

> gcc will generate memcpy on x86 too; the threshold is just ridiculously high for -mtune=generic.  (I don't think this choice is affected by -fno-builtin.)

That does indeed seem to be the case. For `-mtune=generic`, it seems to generate something that boils down to `rep movsq` - and you're right that `-fno-builtin` doesn't seem to affect it.
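For illustration, a minimal sketch of the kind of source this concerns. The type name, the helper names and the 4096 byte size are made up for this example. There is no memcpy anywhere in the source, but the aggregate assignment may end up either expanded inline or emitted as a call to memcpy, depending on the compiler, the tuning and the size:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type, chosen large enough that the copy is unlikely to
 * be lowered to just a handful of plain moves. */
struct big {
    unsigned char bytes[4096];
};

/* Plain aggregate assignment: memcpy is never mentioned in the source.
 * The compiler decides whether to expand the copy inline (for example
 * as a rep movs sequence on x86) or to emit a call to memcpy. */
void copy_big(struct big *dst, const struct big *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
}

/* Byte-wise comparison, so that checking the result doesn't itself
 * require an explicit call to a libc function. */
int big_equal(const struct big *a, const struct big *b)
{
    for (size_t i = 0; i < sizeof a->bytes; i++)
        if (a->bytes[i] != b->bytes[i])
            return 0;
    return 1;
}
```

Compiling something like this with `-O2 -S` and reading the assembly for `copy_big` shows which way a given compiler and set of flags went.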

> I really don't want to go down this path of inlining all memcpys for CPU targets: the code for an efficient memcpy is way too big to reasonably inline, particularly on non-x86 targets.

Do you mean there's a lot of code, built with `-fno-builtin`, where the code itself didn't mention memcpy, but where a call to it ideally should be generated? (Not questioning the assumption, just trying to understand the concern better.)

> Note that LLVM optimizations will currently avoid forming memcpy under -fno-builtin (mostly to avoid weird issues with implementing memcpy).  This doesn't affect clang.

Hmm, which cases do you refer to here?

>>> If the requirement of having a symbol named "memcpy" is problematic, I guess we could look into making compiler-rt provide a symbol with equivalent semantics, but use a compiler-reserved name so the user can't disable it with -fno-builtin. For example, __aeabi_memcpy is defined on ARM.
>>
>> Actually, the issue isn't that memcpy is missing, but that it has a different calling convention. In wine, certain DLLs (which are built as a regular ELF .so or MachO .dylib) would only include windows mode headers, and use a memcpy function with windows calling conventions. As long as memcpy is called explicitly, it ends up fine, but for the cases when the compiler backend invents memcpy calls, they end up using the platform's default calling convention.
>>
>> This is mostly an issue on x86_64, where the calling conventions differ significantly - on other architectures the calling conventions are similar enough that there's no difference in a memcpy call.
>
> You mean, the header explicitly declares a function named memcpy, and uses an attribute to explicitly override the calling convention so it's not the same as the "C" calling convention?

Exactly. I guess that's maybe technically invalid C, but it's still a scenario that there's demand for handling in some way.
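To make the scenario concrete, here is a rough sketch, not taken from the Wine headers. The names `WINAPI_CC`, `win_memcpy`, `struct blob`, `copy_explicit` and `copy_implicit` are hypothetical; in the real case the function is named memcpy itself, which I avoid here so that the example can coexist with the host libc. The `ms_abi` attribute is the GCC/Clang way of selecting the Windows calling convention on x86_64:

```c
#include <assert.h>
#include <stddef.h>

/* ms_abi only makes a difference on x86_64; on other targets this
 * sketch falls back to the default calling convention. */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define WINAPI_CC __attribute__((ms_abi))
#else
#define WINAPI_CC
#endif

/* Hypothetical stand-in for the memcpy that the Windows-mode headers
 * declare with the Windows calling convention. */
WINAPI_CC void *win_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    assert(d != NULL && s != NULL);
    while (n--)
        *d++ = *s++;
    return dst;
}

struct blob {
    unsigned char bytes[256];
};

/* Explicit call: the compiler sees the declaration, so the call is
 * made with the calling convention the declaration asks for. */
void copy_explicit(struct blob *dst, const struct blob *src)
{
    win_memcpy(dst, src, sizeof *dst);
}

/* Implicit copy: if the backend decides to lower this to a libcall,
 * that call targets the symbol "memcpy" using the platform's default
 * calling convention, regardless of what any header declared. */
void copy_implicit(struct blob *dst, const struct blob *src)
{
    *dst = *src;
}
```

Both functions behave the same here, since the host libc's memcpy uses the default calling convention. The mismatch only shows up when the symbol named memcpy that gets linked in is the one using the Windows convention, as in the Wine case.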

> Messing with the backend's calling convention for memcpy isn't hard, but I'm not sure what conditions would be appropriate.  We already support `clang --target=x86_64-pc-win32-elf`.

I fear that tucking the decision away behind a target triple can be a bit opaque. This is also a transition in progress - 12 months ago, the memcpy called within Wine builtin modules was the host libc's, but they're transitioning towards making the Wine builtin modules actual freestanding DLLs. They can still be built either as ELF/MachO or as a real PE DLL, but they don't interact with the host environment, only with other DLLs, and the memcpy used within them is one with the Windows calling convention regardless of the object file format.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86025/new/

https://reviews.llvm.org/D86025


