[clang] Optimize castToDeclContext for 2% improvement in build times (PR #76825)
Pol M via cfe-commits
cfe-commits at lists.llvm.org
Thu Jan 4 00:58:19 PST 2024
Destroyerrrocket wrote:
The reason for putting the classes with a DeclContext closer together is to get better code generation. Here's an example of the assembly with just the removal of the DECL_CONTEXT_BASE macro:
```
clang::Decl::castToDeclContext(clang::Decl const*): # @clang::Decl::castToDeclContext(clang::Decl const*)
.L_ZN5clang4Decl17castToDeclContextEPKS0_$local:
movl 28(%rdi), %edx
leaq .LJTI66_0(%rip), %rsi
movq %rdi, %rax
movl $40, %ecx
andl $127, %edx
decl %edx
movslq (%rsi,%rdx,4), %rdx
addq %rsi, %rdx
jmpq *%rdx
.LBB66_5:
movl $48, %ecx
addq %rcx, %rax
retq
.LBB66_1:
addq %rcx, %rax
retq
.LBB66_2:
movl $72, %ecx
addq %rcx, %rax
retq
.LBB66_6:
movl $64, %ecx
addq %rcx, %rax
retq
.LBB66_7:
movl $56, %ecx
addq %rcx, %rax
retq
.LBB66_8:
.LJTI66_0:
<jump table>
```
While this is better, it still falls short of the final implementation, which needs only a load and an add. I'm of course open to suggestions if there is a better way of getting LLVM to emit the right thing.
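To make the "load and an add" shape concrete, here is a rough sketch of the idea. It is not the code from the PR: `Kind`, `ContextOffset`, `offsetBySwitch`, `offsetByTable` and `castToContext` are hypothetical names, and the offsets are simply borrowed from the constants in the listing above. The assumption is that when all kinds carrying a context are numbered contiguously, the offset can come from a small table indexed by the kind, with no jump table involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical kinds, NOT Clang's real Decl::Kind values. The kinds whose
// classes embed a context object are numbered contiguously from 0.
enum class Kind : std::uint8_t { Block, Function, Namespace, Record, Plain };

constexpr std::size_t NumContextKinds = 4;

// Hypothetical byte offsets of the context subobject, one per context kind.
constexpr std::size_t ContextOffset[NumContextKinds] = {40, 48, 72, 64};

// Switch form: one case per kind. With sparse or interleaved kind values
// the compiler may fall back to a jump table like the one shown above.
inline std::size_t offsetBySwitch(Kind K) {
  switch (K) {
  case Kind::Block:     return 40;
  case Kind::Function:  return 48;
  case Kind::Namespace: return 72;
  case Kind::Record:    return 64;
  default:              return 0; // kind without a context
  }
}

// Table form: because the context kinds are contiguous, the kind itself is
// the index, so the offset is a single load.
inline std::size_t offsetByTable(Kind K) {
  const auto Index = static_cast<std::size_t>(K);
  assert(Index < NumContextKinds && "kind has no context");
  return ContextOffset[Index];
}

// The cast is then the loaded offset added to the object's address.
inline const char *castToContext(const char *DeclBase, Kind K) {
  return DeclBase + offsetByTable(K);
}
```

Whether the optimizer produces the table form from a plain switch depends on how dense the case values are, which is why the ordering of the kinds matters here.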
I agree with @cor3ntin that we need a comment explaining why this ordering matters.
https://github.com/llvm/llvm-project/pull/76825