[cfe-dev] Optimizing vcalls from structors and virtual this-adjusting thunks

Reid Kleckner rnk at google.com
Fri Nov 8 13:12:00 PST 2013


On Thu, Nov 7, 2013 at 7:43 AM, Timur Iskhodzhanov <timurrrr at google.com> wrote:

> Hi John,
>
> I've noticed Clang doesn't devirtualize all vcalls in ctors/dtors.
>
> e.g. for this code:
> --------------------------
> struct A { virtual void a(); };
> struct B { virtual void b(); };
> struct C : virtual A, virtual B {
>   C();
>   virtual void key_function();
>   virtual void a();
>   virtual void b();
> };
>
> C::C() { a(); b(); }
> void C::key_function() {}
> --------------------------
> the assembly for C::C() at -O3 is
> --------------------------
> _ZN1CC1Ev:  # complete ctor
>         pushq   %rbx
>         movq    %rdi, %rbx
>         movq    $_ZTV1C+40, (%rbx)
>         movq    $_ZTV1C+88, 8(%rbx)
>         callq   _ZN1C1aEv  # call to C::a is devirtualized
>         movq    (%rbx), %rax
>         movq    %rbx, %rdi
>         popq    %rbx
>         jmpq    *16(%rax)  # call to C::b is not!
>

This looks like it is just the standard LLVM optimizations at work: the vptr
store is forwarded to the first load, and the load from the constant vtable
global is then evaluated at compile time.  Because C::a is defined externally,
LLVM has to assume it could have modified the vptr, so we fail to
devirtualize b.
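
To illustrate why the first call can be bound statically at all, here is a
minimal sketch (the names Base, Derived, and construction_trace are made up
for this example and are not from Timur's test case). While a constructor
runs, the dynamic type of the object is the class under construction, so a
virtual call made there has exactly one possible target:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical example types; records which override each ctor reaches.
static std::vector<std::string> trace;

struct Base {
    Base() { who(); }  // dynamic type is Base here, so this calls Base::who
    virtual void who() { trace.push_back("Base"); }
    virtual ~Base() = default;
};

struct Derived : Base {
    Derived() { who(); }  // dynamic type is now Derived
    void who() override { trace.push_back("Derived"); }
};

// Constructs one Derived and returns the sequence of overrides that ran.
std::vector<std::string> construction_trace() {
    trace.clear();
    Derived d;
    return trace;
}
```

Calling construction_trace() yields "Base" followed by "Derived". The
optimizer only gets to exploit this while it can still see the vptr value
that the constructor just stored.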


> _ZN1CC2Ev:  # base ctor
>         pushq   %rbx
>         movq    %rdi, %rbx
>         movq    (%rsi), %rax
>         movq    %rax, (%rbx)
>         movq    8(%rsi), %rcx
>         movq    -32(%rax), %rax
>         movq    %rcx, (%rbx,%rax)
>         movq    16(%rsi), %rax
>         movq    (%rbx), %rcx
>         movq    -40(%rcx), %rcx
>         movq    %rax, (%rbx,%rcx)
>         movq    (%rbx), %rax
>         callq   *(%rax)   # looks like even C::a is not devirtualized
>         movq    (%rbx), %rax
>         movq    %rbx, %rdi
>         popq    %rbx
>         jmpq    *16(%rax)  # call C::b is not devirtualized
>

I don't fully understand how VTTs are supposed to work here, but it looks
like we don't have a store of a constant vptr to forward (the vptrs are
loaded from the VTT at run time), so LLVM can't devirtualize.  It would have
to be a clang IRGen optimization.

Nick sent some patches to try to teach LLVM that the vptr is usually
constant across most calls, but they failed to handle certain corner cases
involving placement new that John raised.
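
As a rough illustration of the kind of placement new problem involved (this
is my own sketch, not one of the specific cases John raised, and Shape,
Triangle, Square, rebuild_as_square, and demo are all hypothetical names), a
callee can legally end an object's lifetime and construct an object of a
different dynamic type in the same storage. A vptr value cached across such a
call would be stale:

```cpp
#include <cassert>
#include <new>

struct Shape {
    virtual int sides() const { return 0; }
    virtual ~Shape() = default;
};
struct Triangle : Shape { int sides() const override { return 3; } };
struct Square   : Shape { int sides() const override { return 4; } };

static_assert(sizeof(Triangle) == sizeof(Square), "example assumes equal size");
static_assert(alignof(Triangle) == alignof(Square), "example assumes equal alignment");

// Stand-in for an external function the optimizer cannot see into:
// it destroys the old object and builds a Square in the same storage.
Shape* rebuild_as_square(Shape* s) {
    void* storage = s;
    s->~Shape();                   // virtual dtor ends the old object's lifetime
    return new (storage) Square(); // the storage now holds a different vptr
}

int demo() {
    alignas(Square) unsigned char buf[sizeof(Square)];
    Shape* p = new (buf) Triangle();
    int before = p->sides();       // dispatches to Triangle::sides
    p = rebuild_as_square(p);      // must use the pointer returned by new
    int after = p->sides();        // dispatches to Square::sides
    p->~Shape();
    return before * 10 + after;
}
```

Here demo() returns 34: the same address answers 3 before the call and 4
after it, so "the vptr is constant across calls" does not hold in general.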

> --------------------------
> The same pattern holds if I define C::C() as "b(); a();" - only the
> first vcall in the complete ctor is devirtualized.
>
> Does this look like a bug to you?
> GCC devirtualizes all four calls in this example...
>
> I also have a somewhat related ABI question.
> Is there any reason to keep virtual this-adjusting thunks in the
> vtable when the class is fully constructed?
> I think all the offsets between bases are known statically at the end
> of the complete object constructor, so a special "final vtable" with
> only static this adjusting thunks can be used instead of a regular
> vtable?
> Am I missing something?
>




> --
> Thanks,
> Timur