[PATCH] D74162: [Inliner] Inlining should honor nobuiltin attributes

Guillaume Chatelet via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 20 04:33:26 PDT 2023


gchatelet added a subscriber: sivachandra.
gchatelet added a comment.

In D74162#4514856 <https://reviews.llvm.org/D74162#4514856>, @jhuber6 wrote:

> In D74162#4514824 <https://reviews.llvm.org/D74162#4514824>, @tejohnson wrote:
>
>> In D74162#4508033 <https://reviews.llvm.org/D74162#4508033>, @jhuber6 wrote:
>>
>>> I was discussing this patch with @arsenm after investigating why the calls to `libc` functions weren't being inlined on the GPU. What was the motivation for the original change? I figured it would make no difference if we inlined a function that doesn't allow built-ins into one that does. This is problematic because it prevents us from inlining any function from the LLVM C library under LTO. I'd like a way to at least work around this in the short term, but I'd like some context for why this was introduced.
>>
>> @gchatelet to add more context as this was related to work he was doing for custom implementations of libc memcpy functions.
>>
>> Context is in D61634 <https://reviews.llvm.org/D61634>. See specifically the discussion from https://reviews.llvm.org/D61634#1502201 to https://reviews.llvm.org/D61634#1512020 which discusses the inliner changes for the new attribute. IIRC the issue is that allowing such inlining would lose the no-builtin attributes, which would defeat their purpose.
>
> Losing the ability to inline any function implemented in an LTO library compiled with `-ffreestanding` seems like a very bad tradeoff.  I was talking with @arsenm about this and the attribute seems undocumented and sparsely tested. What was the specific failure case that this was introduced to solve? We lose that attribute, but is that a bad thing? If we inline a function that cannot call builtins into a function that can, what is the issue with calling builtins at that point?

Adding @sivachandra for the LLVM libc part.

The gist of the issue is described in the RFC <https://lists.llvm.org/pipermail/llvm-dev/2019-April/131973.html>.

A few considerations first:

- LLVM libc is not written in assembly because we want compiler support for optimizations, sanitizers and fuzzers.
- Compiler is able to recognize C/C++ constructs and turn them into library calls unless we use the dedicated `-fno-builtin-*` compiler flags.
- `-ffreestanding` implies `-fno-builtins` (i.e., all builtins).

Now let's consider the case of a `foo` function calling `memcpy` where we allow inlining of functions with different attributes.
`foo` is compiled without any of `-ffreestanding`/`-fno-builtin-*`.
During LTO, the compiler decides to inline all of the code of `memcpy` within `foo`, discarding `memcpy`'s `-fno-builtin-memcpy` flag (stored as a function attribute).
Deep down `memcpy`'s implementation the compiler recognizes a copy loop and turns it into the `@llvm.memcpy` IR intrinsic.
The backend decides that this intrinsic is best implemented by calling libc's `memcpy` and inserts a call.
Now `memcpy` is implemented by a call to itself and will eventually stack overflow or infinite loop.
This happened in production.

Now for `memcpy` in particular we have introduced `__builtin_memcpy_inline` (doc <https://clang.llvm.org/docs/LanguageExtensions.html#guaranteed-inlined-copy>) which allows us to convey the semantics of "fixed size operations that always generate code". They are guaranteed to never "call to the libc" (godbolt <https://godbolt.org/z/4f8M9zqq7>). Unfortunately those intrinsics are not supported in GCC so this would be clang only. Also they don't yet cover `memcmp` and `bcmp`.

Today only `memcpy` relies completely on `__builtin_memcpy_inline` (when compiled with clang) so technically this function could be compiled without `-ffreestanding` nor `-fno-builtin` and thus inlined during LTO.

Now it is very desirable that the libc functions can be inlined through LTO, we may be able to reach this state when we have strong guarantees that the compiler will not generate libc calls from within the libc function themselves. We're there for `memcpy` which really is the most fundamental libc function. But we're not there for most of the libc.

Note: for the reasons described above we had to implement the `no-builtin` flags as function attributes so they can stick to code even during LTO.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74162/new/

https://reviews.llvm.org/D74162



More information about the llvm-commits mailing list