[PATCH] Add MicrosoftVFTableContext to AST

Thu Jul 25 10:20:47 PDT 2013

Thanks a lot for a thorough review!
Please have a look at the updated version.

2013/7/24 John McCall <rjmccall at apple.com>:
>> It seems like MethodInfoMap is basically trying to track the current
>> class's notion of the final overriders of all the methods in its primary
>> vf-table?
>
>
>> Not in the primary vftable but rather in the vftable it's building
>> (e.g. first vbase's votable).
>
>
>> By "current class”, I meant the class for which you are currently adding
>> methods.
>
>> Other than that - yes.
>> I think my MethodInfoMap is similar to MethodInfoMap from
>> VTableBuilder::AddMethods.
>> Please note that using FinalOverriders alone would not suffice due to
>> a different "this parameter" logic in this ABI.
>
>
>> Right, but all that is is a mapping of methods to methods from a
>> different base subobject.
>
>
>> Can you please clarify if I should add any comments for this into the code?
>
>
> I think putting a comment on the recursive method stating what you expect
> to be in MethodInfoMap after each iteration would make a lot of sense.

OK, I've added more comments to the AddMethods code.

2013/7/25 John McCall <rjmccall at gmail.com>:
> On Jul 22, 2013, at 6:11 AM, Timur Iskhodzhanov <timurrrr at google.com> wrote:
>
>  While working on CodeGen, finally realized why we should only put the
> overriders from the most derived class into MethodVFTableLocations
>
>
> --- lib/AST/MicrosoftMangle.cpp
> +++ lib/AST/MicrosoftMangle.cpp
> @@ -89,6 +89,8 @@
>    void mangleNumber(const llvm::APSInt &Value);
>    void mangleType(QualType T, SourceRange Range,
>                    QualifierMangleMode QMM = QMM_Mangle);
> +  void mangleFunctionType(const FunctionType *T, const FunctionDecl *D,
> +                          bool IsStructor, bool IsInstMethod);
>
> Is there a reason the mangling changes are part of this patch?

Yes - the still-used Itanium VTableBuilder tries to emit thunks for
its vtables from CreateVTableInitialize/GetAddrOfThunk.

> +// The main differences are:
> +//  1. Separate vftable and vbtable.
> +//  2. Each non-primary base class that has new virtual methods gets its
> +//     own vfptr and vftable, all the address points are zero.
>
> This is not a difference.

As far as I understand, in Itanium ABI each subobject that adds new
virtual methods to its bases gets a new *address point* in the shared
vtable, but not a new vtable (at least no new sections are generated),
which IS different from the Microsoft ABI.

I've dropped "vfptr and" to emphasize "own vftable".

> Other than the absence of virtual base
> subobjects,
> the layout of a class is necessarily independent of whether it’s a subobject
> or not — otherwise you wouldn’t be able to convert a derived pointer to a
> base pointer and use the object.

Sure.

> +//  3. Rather than using the most derived class as the type of the hidden
> 'this'
> +//     method parameter, it tries to stick with the original 'this' type as
> +//     long as possible.
> +//     This yields the following consequences:
> +//     - Don't need to add all overriders to the vftables of their parents;
> +//     - Don't need adjusting thunks as long as there are no methods
> overriding
> +//       multiple methods at the same time.
>
> Again, this has things a little backwards.  You can *never* change the
> expected
> address point of an existing entry in a v-table, or else you wouldn’t be
> able
> to use the entry correctly from a base subobject pointer.

I don't see any conflict with my comment here.
Can you please suggest your variant of this clause?

> The differences here are:
>
> 1. Virtual method definitions do not necessarily expect ‘this’ to be at the
> address point of the defining class.  The address point may be of some
> base sub-object that provides the same method.  (You should also
> describe the algorithm for choosing this base sub-object.)
>
> Crucially, this base sub-object can be in a virtual base, in which case the
> definition hard-codes the appropriate offset from the implementing class.
> Of course, if the implementing class isn’t the most-derived class, that
> might
> not be the right offset for the real most-derived class, in which case the
> vf-table entry has to be a thunk which adjusts to the arbitrary offset
> before calling the definition.
>
> 2. Overrides do not introduce new vf-table entries just because their
> ‘this’ pointer is at a different offset.  This is what reduces the number of
> thunks required (except in the virtual-inheritance case described above).
>
> +//  4. At entry, the 'this' parameter always points to the vfptr used
> +//     for the vftable lookup.
>
> This is also not a difference.

It is different in these ABIs.

Probably we're using different terms, so let's consider this example:
-----------
  class A {
    virtual void f() {}
  };

  class B : virtual A {
    virtual void f();
  };

  void foo(void*);

  void B::f() {
    foo(this);
  }
-----------
In the Itanium ABI, B::f expects the top of the stack to hold (B*)this at entry:
-----------
  __ZN1B1fEv:
        jmp     __Z3fooPv               # TAILCALL
-----------
thus needing a thunk in the vtable.

In Microsoft ABI, B::f expects ECX to hold (A*)this and adjusts in the preamble.
-----------
  "?f at B@@EAEXXZ":
        pushl   %eax
        addl    $-4, %ecx    # Adjust (A*) to (B*)
        movl    %ecx, (%esp)
        calll   "?foo@@YAXPAX at Z"
        popl    %eax
        ret
-----------
thus a thunk is not needed.

I've rephrased it to "at virtual function entry" - does this make sense now?

> +//  6. Instead of VTT and constructor vtables, vtordisps are emitted into
> the
> +//     class layout if a class has
> +//      a) any ctor/dtor
>
> A user-defined ctor/dtor.

Good catch, fixed!

> +  OverriddenMethodsSetTy OverriddenMethods;
> +  ComputeAllOverriddenMethods(MD, OverriddenMethods);
> +
> +  // List all the bases that declare but not override this method.
> +  BasesSetVectorTy OverridenBases;
> +  for (OverriddenMethodsSetTy::const_iterator I =
> OverriddenMethods.begin(),
> +       E = OverriddenMethods.end(); I != E; ++I) {
> +    const CXXMethodDecl *MD = (*I);
> +    if (MD->size_overridden_methods() == 0)
> +      OverridenBases.insert(MD->getParent());
> +  }
>
> Please refactor ComputeAllOverriddenMethods to something like this:
>
>   /// Visit all the methods overridden by the given method, recursively, in
> a depth-first pre-order.
>   template <class VisitorTy>
>   void visitAllOverriddenMethods(const CXXMethodDecl *method, VisitorTy
> &visitor) {
>     for (const CXXMethodDecl *overridden : method->getOverrides()) { //
> pseudocode
>       // Short-circuit if the visitor or recursive visitation
>       visitor.visit(overridden);
>       visitAllOverriddenMethods(overridden, visitor);
>     }
>   }
>
> The changes to your use and the existing use should be obvious.

Phew, actually changing the FindNearestOverriddenMethod code turned
out not to be as trivial, at least I can't justify the change to
myself...

Please take a look at the code - I have several ideas how to change
the code there, would be glad if you can suggest which one looks
better to you.

> +  // This implies that an offset of a non-virtual base will dominate an
> offset
> +  // of a virtual base.
>
> This comment should say why this is important.  (You don’t rely on it in
> the algorithm; it's just an ABI design advantage that potentially reduces
> the number of this-adjustment thunks in derived classes that inherit this
> method.)

Good point, added this to the comment.

> +  // See if this class expands a vftable of the base we look at, which is
> either
> +  // the one defined by the vfptr base path or the primary base.
>
> Still not sure why you’re not just starting from the right base and then
> walking up the primary base chain.

That's not enough even in the simple case of "B: A".

The vfptr is in the A layout, so the "right base" is A.
If we don't go to the most derived class (B) from "the right base"
(A), we forget to add the more derived class's own new methods (and
probably return-adjusting thunks).

> +    ThisAdjustment ThisAdjustmentOffset;
> +    // Check if this virtual member function overrides
> +    // a method in one of the visited bases.
> +    if (const CXXMethodDecl *OverriddenMD =
> +            FindNearestOverriddenMethod(MD, VisitedBases)) {
>
> I don’t think you need to compute the nearest overridden method.
> The way MSVC does this, you only get an entry if the method doesn’t
> override a method with a trivial (or no) conversion on the return type.
> So you can just walk the directly-overridden methods

Good point, change the "iterate over ALL overridden methods" to
"iterate over directly overridden methods", though this has spawned
some more duplicate code. Is that OK now?

> and do a pretty cheap check on the return type, and if you don’t
> find a match, you need an entry.

Please have a look at the new version - I think it doesn't need this check.

I don't think it's possible to do a for loop and break/continue if we
need a return adjustment, at least without making the return thunk
handling code more complex.

> +        // In case we need a return adjustment, add a new slot for each
> +        // return-adjusting thunk.
>
> No.  The thunks go in the original slots; you’re not adding them here.
>
> Whenever you add an entry for a method, you go find the final overrider,
> and then you decide what needs to go in the slot in this concrete vf-table.

D'oh! Sure, fixed.

> John.

Looking forward for the new round of review!
Thanks,
Timur