[llvm-dev] [RFC][clang/llvm] Allow efficient implementation of libc's memory functions in C/C++
Guillaume Chatelet via llvm-dev
llvm-dev at lists.llvm.org
Fri Apr 26 04:47:22 PDT 2019
Defining memory functions in C / C++ results in a chicken-and-egg problem:
Clang can mutate the code into semantically equivalent calls to libc. None
of `-fno-builtin-memcpy`, `-ffreestanding`, or `-nostdlib` provides a
satisfactory answer to the problem.
The goal is to write libc's memory functions (`memcpy`, `memset`, `memcmp`,
...) in C++ so that they benefit from the compiler's knowledge and from
profile-guided optimizations.
LLVM is allowed to replace a piece of code that looks like a memcpy with an
IR intrinsic that implements the same semantics, namely `call void
@llvm.memcpy.p0i8.p0i8.i64` (e.g. https://godbolt.org/z/0y1Yqh).
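As a minimal sketch of the kind of code this applies to, consider a byte-wise copy loop. The name `naive_copy` is hypothetical and the `__restrict` qualifier is a compiler extension; with optimizations enabled, the compiler may recognize the loop as a memory copy and emit the `@llvm.memcpy` intrinsic in its place, although the exact outcome depends on the compiler version and flags.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: a copy loop written the obvious way.
// Because dst and src are declared non-aliasing, an optimizing compiler
// is allowed to treat the whole loop as a single memory copy.
void naive_copy(char *__restrict dst, const char *__restrict src,
                std::size_t count) {
  assert(dst != nullptr && src != nullptr);
  for (std::size_t i = 0; i < count; ++i)
    dst[i] = src[i];
}
```

If this function were itself named `memcpy`, the replacement could end up as a call back into the function being defined, which is the situation the next paragraph describes.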
This is a problem when designing a libc's memory function as the compiler
may choose to replace the implementation with a call to itself (e.g.
Using `-fno-builtin-memcpy` prevents the compiler from understanding that
an expression has memory copy semantics, effectively removing `@llvm.memcpy`
at the IR level: https://godbolt.org/z/lnCIIh. In this specific example,
the vectorizer kicks in and the generated code is quite good. Unfortunately
this is not always the case: https://godbolt.org/z/mHpAYe.
In addition, `-fno-builtin-memcpy` prevents the compiler from understanding
that a piece of code has memory copy semantics, but it does not prevent the
compiler from generating calls to libc's `memcpy`, for instance:
Using `__builtin_memcpy`: https://godbolt.org/z/O0sjIl
Passing big structs by value: https://godbolt.org/z/4BUDc0
In both cases, the generated `@llvm.memcpy` IR intrinsic is lowered into a
libc `memcpy` call.
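The two cases above can be sketched as follows. `BigBlock`, `copy_with_builtin`, `last_byte` and `pass_by_value` are hypothetical names chosen for illustration, and the 4096-byte size is an arbitrary assumption; whether a given compiler actually emits a libc call here depends on target, size and optimization level.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type, large enough that copying it is typically done
// with a memory copy rather than with a few register moves.
struct BigBlock {
  unsigned char bytes[4096];
};

// Case 1: the builtin states copy semantics explicitly. For a size that
// is only known at run time, the compiler may lower it to a memcpy call.
void copy_with_builtin(void *dst, const void *src, std::size_t count) {
  assert(dst != nullptr && src != nullptr);
  __builtin_memcpy(dst, src, count);
}

// Case 2: the parameter is taken by value, so the caller has to copy
// the whole struct before the call unless the call is inlined away.
unsigned char last_byte(BigBlock block) {
  return block.bytes[sizeof(block.bytes) - 1];
}

unsigned char pass_by_value(const BigBlock &block) {
  return last_byte(block);  // implicit copy of 4096 bytes
}
```

Neither function mentions `memcpy` by name, which is why a flag keyed on the function name does not cover them.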
We would like to use `__builtin_memcpy` to communicate the semantics to the
compiler but prevent it from generating calls to libc.
One could argue that this is the purpose of `-ffreestanding`, but the
standard leaves a lot of freestanding requirements implementation-defined
(see https://en.cppreference.com/w/cpp/freestanding).
In practice, making sure that `-ffreestanding` never calls libc memory
functions will probably do more harm than good. People using
`-ffreestanding` now expect the compiler to call these functions, and
inlining bloat can be problematic for the embedded world (see comments in
We envision two approaches: an *attribute to prevent the compiler from
synthesizing calls* or a *set of builtins* to communicate the intent more
precisely to the compiler.
1. A function/module attribute to disable synthesis of calls
1.1 A specific attribute to disable the synthesis of a single call
Question: Is it possible to specify the attribute several times on a
function to disable many calls?
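As a rough sketch of what option 1.1 could look like in source code, the example below uses Clang's `no_builtin` function attribute as a stand-in. This is an assumption: the text above does not fix a spelling for the proposed attribute, and `no_builtin` is only similar in spirit. `NO_BUILTIN_MEMCPY` and `my_memcpy` are hypothetical names, and on compilers without the attribute the macro expands to nothing.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro: use the attribute where the compiler reports it,
// otherwise fall back to an empty definition so the code still builds.
#if defined(__has_attribute)
#  if __has_attribute(no_builtin)
#    define NO_BUILTIN_MEMCPY __attribute__((no_builtin("memcpy")))
#  endif
#endif
#ifndef NO_BUILTIN_MEMCPY
#  define NO_BUILTIN_MEMCPY
#endif

// Hypothetical libc-style copy routine. The attribute asks the compiler
// not to treat code in this body as the memcpy builtin.
NO_BUILTIN_MEMCPY
void *my_memcpy(void *__restrict dst, const void *__restrict src,
                std::size_t count) {
  assert(dst != nullptr && src != nullptr);
  auto *out = static_cast<unsigned char *>(dst);
  const auto *in = static_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < count; ++i)
    out[i] = in[i];
  return dst;
}
```

Scoping the restriction to one function keeps the rest of the translation unit free to use the builtin as usual, which is the precision that option 1.2 gives up.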
1.2 A specific attribute to disable synthesis of all libc calls
With this one we lose precision and may inline too much. There is also the
question of what is considered a libc function; LLVM mainly defines target
library calls.
1.3 Stretch - a specific attribute to redirect a single synthesizable
This one would help explore the impact of replacing a synthesized function
call with another function but is not strictly required to solve the
problem at hand.
2. A set of builtins in clang to communicate the intent clearly
To achieve this we may have to provide new IR builtins (e.g.
`@llvm.alwaysinline_memcpy`) which can be a lot of work.
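To give a feel for option 2, the sketch below assumes a compiler that provides `__builtin_memcpy_inline`, a builtin available in some Clang releases that takes a compile-time-constant size. It is used here only as an illustration of the builtin approach, not as the interface this RFC proposes. `HAS_MEMCPY_INLINE` and `copy_fixed` are hypothetical names, and a plain loop is used when the builtin is not reported as available.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical feature macro: 1 when the compiler reports the builtin.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_memcpy_inline)
#    define HAS_MEMCPY_INLINE 1
#  endif
#endif
#ifndef HAS_MEMCPY_INLINE
#  define HAS_MEMCPY_INLINE 0
#endif

// Hypothetical building block: copy exactly Size bytes, with the size
// known at compile time.
template <std::size_t Size>
void copy_fixed(void *dst, const void *src) {
  assert(dst != nullptr && src != nullptr);
#if HAS_MEMCPY_INLINE
  // States copy semantics and asks for inline code instead of a call.
  __builtin_memcpy_inline(dst, src, Size);
#else
  // Fallback: byte loop, with no guarantee about how it is lowered.
  auto *out = static_cast<unsigned char *>(dst);
  const auto *in = static_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < Size; ++i)
    out[i] = in[i];
#endif
}
```

A libc `memcpy` could then dispatch on the run-time size to a handful of such fixed-size copies, keeping the semantics visible to the compiler at each step.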