[libc-commits] [libc] [libc] Remove unnecessary call in memfunction dispatchers (PR #75800)
Guillaume Chatelet via libc-commits
libc-commits at lists.llvm.org
Mon Dec 18 07:57:05 PST 2023
================
@@ -32,7 +32,8 @@
namespace LIBC_NAMESPACE {
-LIBC_INLINE int inline_bcmp(const void *p1, const void *p2, size_t count) {
+__attribute__((flatten)) LIBC_INLINE int
----------------
gchatelet wrote:
AFAIU `flatten` is not recursive: it only applies to the selected function.
Right now we have the function definition [1] (example with `memcmp`)
https://github.com/llvm/llvm-project/blob/8a233d8cfde4cd026b5dd12b56ea029749a02329/libc/src/string/memcmp.cpp#L16-L19
and the helper that delegates to real code [2]
https://github.com/llvm/llvm-project/blob/8a233d8cfde4cd026b5dd12b56ea029749a02329/libc/src/string/memory_utils/inline_memcmp.h#L36-L39
Out of the box, clang inlines [2] into [1] but does not inline the call inside [2]'s body, leaving `memcmp` with
```
00000000000c69c0 <memcmp>:
c69c0: e9 1b 00 00 00 jmp 0xc69e0 <__llvm_libc::inline_memcmp_x86(__llvm_libc::cpp::byte const*, __llvm_libc::cpp::byte const*, unsigned long)>
...
```
Adding `[[gnu::flatten]]` to [1] makes no difference, whereas adding `[[gnu::flatten]]` to [2] does inline the implementation into `memcmp`:
```
00000000000c6980 <memcmp>:
c6980: 31 c0 xorl %eax, %eax
c6982: 48 83 fa 08 cmpq $0x8, %rdx
c6986: 0f 87 e4 00 00 00 ja 0xc6a70 <memcmp+0xf0>
c698c: 48 8d 0d 9d e1 f6 ff leaq -0x91e63(%rip), %rcx
c6993: 48 63 14 91 movslq (%rcx,%rdx,4), %rdx
...
```
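To make the layering concrete, here is a minimal sketch of the three levels involved. It is not the actual libc source: `my_memcmp`, `inline_memcmp_sketch`, and `memcmp_impl_sketch` are hypothetical names standing in for [1], [2], and the architecture-specific implementation. The attribute sits on the middle helper, since that is the placement reported above to change the generated code.

```cpp
#include <cstddef>

namespace sketch {

// Hypothetical stand-in for the architecture-specific implementation
// (the role played by inline_memcmp_x86 in the disassembly above).
static inline int memcmp_impl_sketch(const unsigned char *p1,
                                     const unsigned char *p2,
                                     std::size_t count) {
  for (std::size_t i = 0; i < count; ++i)
    if (p1[i] != p2[i])
      return static_cast<int>(p1[i]) - static_cast<int>(p2[i]);
  return 0;
}

// Stand-in for [2]: the helper that delegates to the real code.
// flatten asks the compiler to inline the calls made in this body.
[[gnu::flatten]] static inline int
inline_memcmp_sketch(const void *p1, const void *p2, std::size_t count) {
  return memcmp_impl_sketch(static_cast<const unsigned char *>(p1),
                            static_cast<const unsigned char *>(p2), count);
}

// Stand-in for [1]: the public entry point.
int my_memcmp(const void *p1, const void *p2, std::size_t count) {
  return inline_memcmp_sketch(p1, p2, count);
}

} // namespace sketch
```

Whether the entry point ends up as a bare `jmp` or as the inlined body depends on the compiler and optimization level, so the result is best confirmed by disassembling the object file.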
Now, I'm pretty sure that PGO will inline the call, but there's no point in having an implementation of a memory function that only jumps to another function in the first place.
I can go with `[[gnu::always_inline]]` if you prefer, although `[[gnu::flatten]]` has the added benefit that it respects `[[gnu::noinline]]`.
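The interaction with `noinline` can be sketched as follows. All names here (`count_zeros`, `count_zeros_small`, `count_zeros_large`) are hypothetical and unrelated to the libc sources. The assumption is that a `flatten` caller inlines the ordinary callee while the callee marked `noinline` stays a real call.

```cpp
#include <cstddef>

namespace noinline_sketch {

// Hypothetical large-size path that should remain an out-of-line call.
[[gnu::noinline]] static std::size_t
count_zeros_large(const unsigned char *p, std::size_t n) {
  std::size_t zeros = 0;
  for (std::size_t i = 0; i < n; ++i)
    zeros += (p[i] == 0);
  return zeros;
}

// Hypothetical small-size path, a candidate for inlining.
static inline std::size_t count_zeros_small(const unsigned char *p,
                                            std::size_t n) {
  std::size_t zeros = 0;
  for (std::size_t i = 0; i < n; ++i)
    zeros += (p[i] == 0);
  return zeros;
}

// flatten requests inlining of the calls in this body; the callee
// marked noinline is expected to be left as a call.
[[gnu::flatten]] std::size_t count_zeros(const unsigned char *p,
                                         std::size_t n) {
  return n <= 8 ? count_zeros_small(p, n) : count_zeros_large(p, n);
}

} // namespace noinline_sketch
```

Both paths compute the same result, so the difference is only visible in the generated code, for example by disassembling with `objdump -d`.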
https://github.com/llvm/llvm-project/pull/75800