[llvm-dev] [RFC][clang/llvm] Allow efficient implementation of libc's memory functions in C/C++

Guillaume Chatelet via llvm-dev llvm-dev at lists.llvm.org
Fri Apr 26 04:47:22 PDT 2019


*TL;DR:*
Defining memory functions in C / C++ results in a chicken-and-egg problem:
Clang can rewrite the code into semantically equivalent calls to libc. None
of `-fno-builtin-memcpy`, `-ffreestanding` or `-nostdlib` provides a
satisfactory answer to the problem.

*Goal*
Create libc's memory functions (`memcpy`, `memset`, `memcmp`, ...) in
C++ to benefit from the compiler's knowledge and profile-guided optimizations.

*Current state*
LLVM is allowed to replace a piece of code that looks like a memcpy with an
IR intrinsic that implements the same semantics, namely `call void
@llvm.memcpy.p0i8.p0i8.i64` (e.g. https://godbolt.org/z/0y1Yqh).

This is a problem when implementing a libc memory function, as the compiler
may choose to replace the implementation with a call to the very function
being defined (e.g. https://godbolt.org/z/eg0p_E).
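
As a minimal sketch of the kind of code involved (the name `naive_memcpy` is
hypothetical and chosen so the example can be compiled next to a real libc):
with optimizations enabled, the compiler may recognize the loop below as a
memory copy and emit `@llvm.memcpy`, which can in turn be lowered to a call to
`memcpy`. If the function were itself named `memcpy`, that call would target
the function being defined.

```cpp
#include <cstddef>

// Hypothetical name; a real libc would export this function as `memcpy`.
// `__restrict` is a GCC/Clang extension telling the compiler that the two
// buffers do not overlap, which makes the copy idiom easier to recognize.
void *naive_memcpy(void *__restrict dst, const void *__restrict src,
                   std::size_t count) {
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  // Plain byte-by-byte copy: exactly the pattern an optimizer may rewrite
  // into a memcpy intrinsic.
  for (std::size_t i = 0; i < count; ++i)
    d[i] = s[i];
  return dst;
}
```

Whether the rewrite actually happens depends on the compiler version, the
target and the optimization level.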

Using `-fno-builtin-memcpy` prevents the compiler from recognizing that
an expression has memory-copy semantics, effectively removing `@llvm.memcpy`
at the IR level: https://godbolt.org/z/lnCIIh. In this specific example,
the vectorizer kicks in and the generated code is quite good. Unfortunately
this is not always the case: https://godbolt.org/z/mHpAYe.

In addition, although `-fno-builtin-memcpy` prevents the compiler from
recognizing that a piece of code has memory-copy semantics, it does not
prevent the compiler from generating calls to libc's `memcpy`, for instance:
  Using `__builtin_memcpy`: https://godbolt.org/z/O0sjIl
  Passing big structs by value: https://godbolt.org/z/4BUDc0

In both cases, the generated `@llvm.memcpy` IR intrinsic is lowered into a
libc `memcpy` call.
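
The two cases can be sketched as follows. The type `Big`, its 4096-byte size
and the function names are illustrative assumptions, not taken from the linked
examples; whether a libc call is actually emitted depends on the target, the
copy size and the optimization level.

```cpp
// Arbitrary size, picked only to be large enough that a compiler is
// unlikely to expand the copy inline.
struct Big {
  char bytes[4096];
};

// Case 1: an explicit builtin. The compiler knows the copy semantics and
// is free to lower it to a call to libc's memcpy.
void copy_with_builtin(Big *dst, const Big *src) {
  __builtin_memcpy(dst, src, sizeof(Big));
}

// Case 2: passing a big struct by value. No memcpy appears in the source,
// but the argument copy may still be materialized as a memcpy call.
char first_byte(Big value) { return value.bytes[0]; }

char pass_by_value(const Big &b) { return first_byte(b); }
```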

We would like to use `__builtin_memcpy` to communicate the semantics to the
compiler but prevent it from generating calls to the libc.

One could argue that this is the purpose of `-ffreestanding`, but the
standard leaves many freestanding requirements implementation-defined
(see https://en.cppreference.com/w/cpp/freestanding).

In practice, making sure that `-ffreestanding` never calls libc memory
functions would probably do more harm than good. People using
`-ffreestanding` now expect the compiler to call these functions, and the
code bloat from inlining can be problematic for the embedded world (see
comments in https://reviews.llvm.org/D60719).

*Proposals*
We envision two approaches: an *attribute to prevent the compiler from
synthesizing calls* or a *set of builtins* to communicate the intent more
precisely to the compiler.

  1. A function/module attribute to disable synthesis of calls

    1.1 A specific attribute to disable the synthesis of a single call
__attribute__((disable_call_synthesis("memcpy")))
Question: Is it possible to specify the attribute several times on a
function to disable many calls?

    1.2 A specific attribute to disable synthesis of all libc calls
__attribute__((disable_libc_call_synthesis))
With this one we lose precision and may inline too much. There is also the
question of what counts as a libc function, since LLVM mainly defines
target library calls.

    1.3 Stretch - a specific attribute to redirect a single synthesizable
function.
This one would help explore the impact of replacing a synthesized function
call with another function but is not strictly required to solve the
problem at hand.
__attribute__((redirect_synthesized_calls("memcpy", "my_memcpy")))

  2. A set of builtins in clang to communicate the intent clearly

__builtin_memcpy_alwaysinline(...)
__builtin_memmove_alwaysinline(...)
__builtin_memset_alwaysinline(...)

To achieve this we may have to provide new IR intrinsics (e.g.
`@llvm.alwaysinline_memcpy`), which can be a lot of work.
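
To make the two proposals concrete, here is a hedged sketch of how a libc
implementer might use them. Neither `disable_call_synthesis` nor
`__builtin_memcpy_alwaysinline` exists at the time of writing; both names come
from the proposals above, and the `(dst, src, count)` argument order of the
builtin is an assumption. The feature checks make the code compile unchanged
on compilers that implement neither, falling back to a plain loop. The names
`my_memcpy` and `NO_MEMCPY_SYNTHESIS` are hypothetical.

```cpp
#include <cstddef>

// Fallbacks for compilers that lack these feature-test macros.
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Proposal 1.1 (hypothetical attribute): expands to nothing when the
// compiler does not implement it.
#if __has_attribute(disable_call_synthesis)
#define NO_MEMCPY_SYNTHESIS __attribute__((disable_call_synthesis("memcpy")))
#else
#define NO_MEMCPY_SYNTHESIS
#endif

NO_MEMCPY_SYNTHESIS
void *my_memcpy(void *__restrict dst, const void *__restrict src,
                std::size_t count) {
#if __has_builtin(__builtin_memcpy_alwaysinline)
  // Proposal 2 (hypothetical builtin, assumed signature): states the copy
  // semantics while asking for an inline expansion instead of a libc call.
  __builtin_memcpy_alwaysinline(dst, src, count);
#else
  // Fallback used today: a byte loop, which the attribute above would
  // protect from being turned back into a call to memcpy.
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < count; ++i)
    d[i] = s[i];
#endif
  return dst;
}
```

With the attribute, the body stays ordinary C++ and the guarantee is attached
to the function; with the builtin, the guarantee is attached to each
individual copy.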