[libc-commits] [libc] [libc] Change default behaviour of baremetal/printf to use stdout (PR #143703)
via libc-commits
libc-commits at lists.llvm.org
Thu Jun 12 02:09:54 PDT 2025
saturn691 wrote:
Hi Petr,
Sorry for the long read. TL;DR, to answer your question: it's marginally faster at `-O0`.
---
In LLVM-libc, as far as I can see, we prefer `static constexpr` whenever possible, so I did a little investigation to see whether it makes a difference. Here is a test case that mimics what the function does.
Test case:
```cpp
// test1.cpp
#include <string.h>
int f()
{
static constexpr int SIZE = 1024;
char buffer[SIZE];
memset(buffer, 1, SIZE);
return buffer[0];
}
int main()
{
return f() != 1;
}
```
I have also created an otherwise identical test case, `test2.cpp`, without `static`. Note that I built clang from `0eeabd4b302cf52c4a585664ed9bc4a81ef91105`. Also note that I am deliberately ignoring higher levels of optimisation, since they optimise `f` out entirely, as can be seen in the disassembly.
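For reference, here is a sketch of what the function in `test2.cpp` would look like, assuming the only change from `test1.cpp` is dropping `static` from `SIZE` (the comments are mine):

```cpp
// test2.cpp (sketch): same f() as test1.cpp, minus `static` on SIZE
#include <string.h>

int f()
{
    // Automatic storage duration: at -O0 the compiler may give SIZE
    // its own stack slot and store 1024 into it on every call.
    constexpr int SIZE = 1024;
    char buffer[SIZE];
    memset(buffer, 1, SIZE);
    return buffer[0];
}
```

`main` is unchanged from `test1.cpp`.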
```bash
❯ clang test1.cpp -O0 -o test1
❯ clang test2.cpp -O0 -o test2
❯ perf stat -r 10000 ./test1
# Details omitted but it takes 0.000362472 +- 0.000000342 seconds (time elapsed)
❯ perf stat -r 10000 ./test2
# Details omitted but it takes 0.000363506 +- 0.000000338 seconds (time elapsed)
```
The timings are almost identical, so let's look at the disassembly:
```
# test1
❯ llvm-objdump -D test1 | grep -A12 "fv>"
0000000000001140 <_Z1fv>:
1140: 55 pushq %rbp
1141: 48 89 e5 movq %rsp, %rbp
1144: 48 81 ec 00 04 00 00 subq $0x400, %rsp # imm = 0x400
114b: 48 8d bd 00 fc ff ff leaq -0x400(%rbp), %rdi
1152: be 01 00 00 00 movl $0x1, %esi
1157: ba 00 04 00 00 movl $0x400, %edx # imm = 0x400
115c: e8 cf fe ff ff callq 0x1030 <memset@plt>
...
# test2
0000000000001140 <_Z1fv>:
1140: 55 pushq %rbp
1141: 48 89 e5 movq %rsp, %rbp
1144: 48 81 ec 10 04 00 00 subq $0x410, %rsp # imm = 0x410
114b: c7 45 fc 00 04 00 00 movl $0x400, -0x4(%rbp) # imm = 0x400
1152: 48 8d bd f0 fb ff ff leaq -0x410(%rbp), %rdi
1159: be 01 00 00 00 movl $0x1, %esi
115e: ba 00 04 00 00 movl $0x400, %edx # imm = 0x400
1163: e8 c8 fe ff ff callq 0x1030 <memset@plt>
...
```
Notice that the non-static variable requires extra stack space (0x410 vs 0x400 bytes) and an extra instruction (the `movl` at 114b, which stores `SIZE` to the stack). This is consistent with the observation here: https://stackoverflow.com/questions/70711580/static-constexpr-vs-constexpr-in-function-body
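To illustrate why the `static` version needs no per-call store, here is a minimal sketch of the storage-duration difference. The function name `static_size_addr` is hypothetical and not part of the PR:

```cpp
// Hypothetical helper, for illustration only.
// A function-local `static constexpr` variable has static storage
// duration: one object for the whole program, initialised at compile
// time, so nothing has to be written to the stack when the function runs.
const int *static_size_addr()
{
    static constexpr int SIZE = 1024;
    return &SIZE; // safe: SIZE outlives the call
}
```

Every call returns the same address, and reading through it yields 1024. Returning the address of a non-static `constexpr` local would instead give a dangling pointer, since that variable is an ordinary automatic object.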
You're right that at higher levels of optimisation we get the same code. But I think the change is worth making, even if only for good practice.
https://github.com/llvm/llvm-project/pull/143703