[libc-commits] [libc] [libc] Change default behaviour of baremetal/printf to use stdout (PR #143703)

via libc-commits libc-commits at lists.llvm.org
Thu Jun 12 02:09:54 PDT 2025


saturn691 wrote:

Hi Petr,

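Sorry for the long read, but to answer your question, the TL;DR is that it's marginally cheaper at `-O0` (one fewer instruction and less stack), although the timing difference is negligible.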

---

In LLVM-libc, as far as I can see, we try to prefer `static constexpr` whenever possible, so I did a little investigation to see whether it makes a difference. Here is a test case that mimics what the function does.

Test case:
```cpp
// test1.cpp
#include <string.h>

int f()
{
  static constexpr int SIZE = 1024;
  char buffer[SIZE];
  memset(buffer, 1, SIZE);

  return buffer[0];
}

int main()
{
  return f() != 1;
}
```

I have created an identical test case, `test2.cpp`, without `static`. Note that I built clang from commit `0eeabd4b302cf52c4a585664ed9bc4a81ef91105`. Also note that I am not looking at higher optimisation levels here, because they optimise `f` away, as can be seen in the disassembly.
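For reference, here is a sketch of what `test2.cpp` would look like, assuming the only change is dropping `static` from the declaration of `SIZE` (the file itself is not included in this message). `main` is unchanged from `test1.cpp` and is omitted here.

```cpp
// test2.cpp (reconstructed sketch; only f() is shown)
#include <string.h>

int f()
{
  // No `static`: SIZE is an automatic variable, so at -O0 the compiler
  // may reserve a stack slot for it and store 1024 into it on every call.
  constexpr int SIZE = 1024;
  char buffer[SIZE];
  memset(buffer, 1, SIZE);

  return buffer[0];
}
```

Both variants return the same value, so any difference shows up only in the generated code, not in behaviour.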

```bash
❯ clang test1.cpp -O0 -o test1
❯ clang test2.cpp -O0 -o test2
❯ perf stat -r 10000 ./test1
# Details omitted but it takes 0.000362472 +- 0.000000342 seconds (time elapsed)
❯ perf stat -r 10000 ./test2
# Details omitted but it takes 0.000363506 +- 0.000000338 seconds (time elapsed)
```

The timings are almost identical, so let's look at the disassembly:

```
# test1
❯ llvm-objdump -D test1 | grep -A12 "fv>"
0000000000001140 <_Z1fv>:
    1140: 55                            pushq   %rbp
    1141: 48 89 e5                      movq    %rsp, %rbp
    1144: 48 81 ec 00 04 00 00          subq    $0x400, %rsp            # imm = 0x400
    114b: 48 8d bd 00 fc ff ff          leaq    -0x400(%rbp), %rdi
    1152: be 01 00 00 00                movl    $0x1, %esi
    1157: ba 00 04 00 00                movl    $0x400, %edx            # imm = 0x400
    115c: e8 cf fe ff ff                callq   0x1030 <memset@plt>
    ...

# test2
0000000000001140 <_Z1fv>:
    1140: 55                            pushq   %rbp
    1141: 48 89 e5                      movq    %rsp, %rbp
    1144: 48 81 ec 10 04 00 00          subq    $0x410, %rsp            # imm = 0x410
    114b: c7 45 fc 00 04 00 00          movl    $0x400, -0x4(%rbp)      # imm = 0x400
    1152: 48 8d bd f0 fb ff ff          leaq    -0x410(%rbp), %rdi
    1159: be 01 00 00 00                movl    $0x1, %esi
    115e: ba 00 04 00 00                movl    $0x400, %edx            # imm = 0x400
    1163: e8 c8 fe ff ff                callq   0x1030 <memset@plt>
    ...
```

Notice that the non-static variable requires extra stack space (`0x410` vs `0x400` bytes) and an extra instruction (the `movl` at `114b`, which stores `SIZE` into its stack slot). This is consistent with the observation here: https://stackoverflow.com/questions/70711580/static-constexpr-vs-constexpr-in-function-body
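The underlying reason is storage duration. As a minimal sketch (the function names below are hypothetical and not part of the PR): a `static constexpr` local is a single object with static storage duration, whereas a plain `constexpr` local is an automatic object that conceptually exists once per call, which is why an unoptimised build gives it a stack slot.

```cpp
// Hypothetical helpers for illustration only.

// `static constexpr`: one object with static storage duration, initialised
// at compile time. Its address is stable and outlives the call.
const int *static_size_address()
{
  static constexpr int SIZE = 1024;
  return &SIZE;
}

// Plain `constexpr`: an automatic object, created on each call. Returning
// its address would dangle, so only the value is returned here.
int automatic_size_value()
{
  constexpr int SIZE = 1024;
  return SIZE;
}
```

Calling `static_size_address()` twice yields the same pointer, while the automatic variant has no address that survives the call.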

You're right that at higher levels of optimisation we get the same code. But I think the change is still worthwhile, even if only as good practice.

https://github.com/llvm/llvm-project/pull/143703

