[clang] [clang] Optimize castToDeclContext for 2% improvement in build times (PR #76825)

Pol M via cfe-commits cfe-commits at lists.llvm.org
Thu Jan 4 00:58:19 PST 2024


Destroyerrrocket wrote:

The reason for putting the classes with DeclContext closer together is to get better code generation. Here's an example of the assembly with just the removal of the DECL_CONTEXT_BASE macro:
```
clang::Decl::castToDeclContext(clang::Decl const*): # @clang::Decl::castToDeclContext(clang::Decl const*)
.L_ZN5clang4Decl17castToDeclContextEPKS0_$local:
  movl 28(%rdi), %edx
  leaq .LJTI66_0(%rip), %rsi
  movq %rdi, %rax
  movl $40, %ecx
  andl $127, %edx
  decl %edx
  movslq (%rsi,%rdx,4), %rdx
  addq %rsi, %rdx
  jmpq *%rdx
.LBB66_5:
  movl $48, %ecx
  addq %rcx, %rax
  retq
.LBB66_1:
  addq %rcx, %rax
  retq
.LBB66_2:
  movl $72, %ecx
  addq %rcx, %rax
  retq
.LBB66_6:
  movl $64, %ecx
  addq %rcx, %rax
  retq
.LBB66_7:
  movl $56, %ecx
  addq %rcx, %rax
  retq
.LBB66_8:
.LJTI66_0:
<jump table>
```
While this is better, it is still not ideal compared to the single load and add of the final implementation. I'm of course open to suggestions if there is a better way of getting LLVM to emit the right thing.
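
To illustrate the difference in shape, here is a rough, self-contained sketch. It is not the code from the PR: `Kind`, `Node`, `ContextOffset`, `toContext`, and `toContextSwitch` are all hypothetical names, and the offsets are simply copied from the constants in the assembly above. The assumption is that when the kinds that carry a context are numbered contiguously, the cast can be expressed as (or lowered to) one table load plus one add, whereas a switch returning a different offset per case may be lowered to a jump table with an indirect branch like the one shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical kinds. The ones that carry a context are numbered first and
// contiguously, so they can index a dense table directly.
enum Kind : std::uint8_t { KindA, KindB, KindC, KindD, NumContextKinds };

// Hypothetical node type; the payload only exists so that the offsets below
// stay inside the object.
struct Node {
  Kind kind;
  char payload[96];
};

// Illustrative byte offsets of the context subobject, one per kind
// (values borrowed from the assembly listing, not from real class layouts).
constexpr std::uint8_t ContextOffset[NumContextKinds] = {40, 48, 72, 64};

// Table form: one load from ContextOffset plus one add, no indirect jump.
inline const char *toContext(const Node *n) {
  assert(n->kind < NumContextKinds && "kind does not carry a context");
  return reinterpret_cast<const char *>(n) + ContextOffset[n->kind];
}

// Switch form for comparison: each case adds a different constant, which a
// compiler may lower to a jump table and an indirect branch.
inline const char *toContextSwitch(const Node *n) {
  const char *base = reinterpret_cast<const char *>(n);
  switch (n->kind) {
  case KindA: return base + 40;
  case KindB: return base + 48;
  case KindC: return base + 72;
  case KindD: return base + 64;
  default:    return nullptr;
  }
}
```

Both functions compute the same address for every context-carrying kind; the only intended difference is how cheaply the compiler can lower them.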

I agree with @cor3ntin that a comment explaining the need for this order prioritization is needed.

https://github.com/llvm/llvm-project/pull/76825

