[llvm] [RISCV][MC] Implement evaluateBranch for auipc+jalr pairs (PR #65480)
Job Noorman via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 2 10:57:25 PDT 2023
mtvec wrote:
> In general this looks good to me, but I do worry a bit about the performance implications of regularly zeroing all the GPRs. Could you show the impact on disassembly of a large binary (e.g. statically linked clang for riscv)?
Here's a quick benchmark (release build without asserts):
```
$ file clang
clang: ELF 64-bit LSB pie executable, UCB RISC-V, RVC, double-float ABI, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-riscv64-lp64d.so.1, BuildID[sha1]=a4645a5d30617084df5efea9662d984d0a9dc918, for GNU/Linux 4.15.0, not stripped
$ size clang
     text    data     bss       dec     hex filename
149308819 4260800  622672 154192291 930c9a3 clang
$ hyperfine --parameter-list which main,pr './llvm-objdump.{which} -d clang > /dev/null' --warmup 3
Benchmark 1: ./llvm-objdump.main -d clang > /dev/null
  Time (mean ± σ):     56.003 s ±  0.206 s    [User: 32.285 s, System: 23.695 s]
  Range (min … max):   55.849 s … 56.511 s    10 runs

Benchmark 2: ./llvm-objdump.pr -d clang > /dev/null
  Time (mean ± σ):     56.651 s ±  0.071 s    [User: 32.911 s, System: 23.713 s]
  Range (min … max):   56.550 s … 56.797 s    10 runs

Summary
  ./llvm-objdump.main -d clang > /dev/null ran
    1.01 ± 0.00 times faster than ./llvm-objdump.pr -d clang > /dev/null
```
So there seems to be about 1% overhead (roughly 0.65 s on a 56 s run).
If this is too much, one solution would be to store a plain array of `uint64_t` plus a separate 32-bit validity bitmap instead of an array of `std::optional<uint64_t>`. I suppose that would remove most of the overhead of clearing state, since a reset would only need to zero the bitmap.
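To make the bitmap idea concrete, here is a rough sketch of what such a register-state tracker could look like. The class name `GPRState` and its methods are hypothetical and chosen only for illustration; this is not the code in the PR, and it leaves out any special handling of `x0`.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical tracker for known GPR values (illustrative names only).
class GPRState {
  // Raw values; an entry is only meaningful when its bit in Valid is set.
  std::array<uint64_t, 32> Values{};
  // Bit N set means Values[N] holds a known value.
  uint32_t Valid = 0;

public:
  // Record a known value for register Reg.
  void set(unsigned Reg, uint64_t V) {
    assert(Reg < 32 && "register index out of range");
    Values[Reg] = V;
    Valid |= uint32_t(1) << Reg;
  }

  // Mark a single register as unknown.
  void invalidate(unsigned Reg) {
    assert(Reg < 32 && "register index out of range");
    Valid &= ~(uint32_t(1) << Reg);
  }

  // Return the known value of Reg, or std::nullopt if it is unknown.
  std::optional<uint64_t> get(unsigned Reg) const {
    assert(Reg < 32 && "register index out of range");
    if (Valid & (uint32_t(1) << Reg))
      return Values[Reg];
    return std::nullopt;
  }

  // Forget everything: one 32-bit store instead of resetting 32 optionals.
  void reset() { Valid = 0; }
};
```

With this layout, `reset()` touches a single 32-bit word and leaves stale entries in `Values` in place; they are harmless because `get()` consults the bitmap first.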
https://github.com/llvm/llvm-project/pull/65480