[PATCH] [ms-cxxabi] Implement codegen for dereferencing member pointers

John McCall rjmccall at apple.com
Tue Apr 9 16:50:26 PDT 2013


On Apr 9, 2013, at 3:02 PM, Reid Kleckner <rnk at google.com> wrote:
> On Mon, Apr 8, 2013 at 7:58 PM, John McCall <rjmccall at apple.com> wrote:
>> 
>> On Apr 2, 2013, at 11:00 AM, Reid Kleckner <rnk at google.com> wrote:
>>> Handles all inheritance models for both data and function member
>>> pointers.  This does not require knowing the layout of the vbtable,
>>> because the member pointer itself carries the offset into the vbtable.
>>> 
>>> Clang still cannot emit multi-field member pointers because it would
>>> need to know the layout of the vbtable to generate the correct offsets.
>>> 
>>> Also implements isZeroInitializable() and refactors some of the null
>>> member pointer code.
>> 
>> +  llvm::tie(Ptrs, Ints) = MPT->getMSMemberPointerSlots();
>> 
>> I missed this particular method getting added.  I think this is probably
>> going a bit too far.  It's not really generally meaningful;  you have two
>> clients of it, and one of them has to do weird special-casing anyway
>> (i.e. filling in -1 in the right place).
> 
> OK.  I ended up having to duplicate switches on the inheritance model,
> so I guess I might as well do it here.  I'll do a separate change on
> the AST layer to get rid of this helper.  Alternatively I could try to
> raise the memptr layout up into the AST layer and query it from here.

I think this is the right balance.  Ultimately, IR-gen needs knowledge of the
member-pointer layout that the AST really just doesn't.

>> +  case MSIM_Unspecified: {
>> +    CharUnits offs = getContext().getASTRecordLayout(RD).getVBPtrOffset();
>> +    llvm::Constant *VBTableOffset =
>> +      llvm::ConstantInt::get(CGM.IntTy, offs.getQuantity());
>> +    llvm::Constant *fields[] = {
>> +      FieldOffset,
>> +      VBTableOffset,
>> +      getZeroInt()    // FIXME: VirtualBaseAdjustmentOffset
>> +    };
>> 
>> The VirtualBaseAdjustmentOffset should be some special value to indicate
>> that this isn't a member of a virtual base;  you just need to leave that value
>> in there.
> 
> The first entry of a vbtable points to the base of the current
> subobject, so zero should work for that.

Cute.

>> Also, are you sure you're supposed to leave a meaningful vb-table offset
>> in here even when the member *isn't* a member of a virtual base?
> 
> I believe zero is correct for that, for the same reason as above.
> 
> It looks like taking a pointer to a member of a virtual base is an
> MSVC language extension.

Yep.  Or, to be more specific, MSVC implemented a very general feature
that got pared down by standardization.

> Clang rejects my attempts to form member
> pointers to virtual bases.  I filed http://llvm.org/PR15713 to track
> any decisions made on that front.  Currently, I don't have the decl so
> I can't assert without changing the API, so I'll just always emit zero
> and comment this.

The emission of a member pointer constant will never need to create a
member of a virtual base;  only member pointer conversion needs to do that.

This might not be obvious, but given:
  struct A { int x; };
  struct B : A {};
&B::x actually has type "int A::*", not "int B::*".  The class name there is
just used for lookup.

Anyway, yes, you'll need to teach Sema to allow these conversions.

>> Also, recall that the class very well might not have a vb-table;  there might
>> be a signal value for that, too.
> 
> You're right, I had to fix that.  Can't go dereferencing non-existent vbptrs.

Right.  I guess this is as simple as testing whether the offset is zero (the offset
within the vb-table, not the offset of the vbptr, which can legitimately be zero).

>> +  llvm::Type *Int8Ptr = Builder.getInt8Ty()->getPointerTo(0);
>> 
>> This is CGM.Int8PtrTy.
> 
> OK.
> 
>> +llvm::Value *
>> +MicrosoftCXXABI::AdjustVirtualBase(CGBuilderTy &Builder,
>> +                                   const CXXRecordDecl *RD, llvm::Value *Base,
>> +                                   llvm::Value *VirtualBaseAdjustmentOffset,
>> +                                   llvm::Value *VBTableOffset) {
>> 
>> This code is assuming that it *has* to adjust by a virtual base.
>> That is not the case (unless MSVC does something interesting
>> in its vb-table layouts, like leave a zero offset at a particular offset
>> in the table...), and even then it's suspect in the MSIM_unspecified
>> case where it's completely legitimate that the record type might not
>> have a vb-table.
> 
> Good catch: the first vbtable entry points back to the base of the
> current subobject.  That means it's safe to perform the adjustment so
> long as we know there is a vbptr.  MSVC does this so I followed suit.

Seems reasonable.

As an aside, I wonder what the right trade-off here actually is: an extra
branch vs. an unnecessary memory dependence...  Anyway, it's not
important right now.

+  // If the class has an explicit inheritance model attribute, we conservatively
+  // use -1.  This requires different TUs to see the same attribute or they have
+  // an ODR problem.
+  if (RD->hasAttr<MSInheritanceAttr>())
+    return true;

If this is really how MSVC handles this issue — and that seems reasonable
enough to me — then arguably these cases need to be distinguished in the
MSInheritanceModel enum.  I think we'd have six cases:
  - has no bases at non-zero offsets, known not polymorphic or unknown polymorphism
  - has no bases at non-zero offsets, known polymorphic
  - has bases at static non-zero offsets, known not polymorphic or unknown polymorphism
  - has bases at static non-zero offsets, known polymorphic
  - has virtual bases
  - unspecified

The inheritance model is fundamentally about member pointers.  It makes
sense for it to capture everything necessary in order to interpret a member
pointer.

That would also make it easier to diagnose the following invalid code (unless
we already diagnose attempts to add inheritance keywords after the definition?).
  class A {
    virtual void foo();
  };
  class __single_inheritance A;

+  // Since zero is a valid offset for classes without vtables, they use -1.  For
+  // some reason, classes with vtables use 0 in this ABI.  In this inheritance
+  // model, only polymorphic classes have vtables.
+  return !RD->isPolymorphic();

The reason's pretty straightforward:  it is generally better to use 0 instead of -1
because it's usually easier to produce 0 than -1.  That's true even on the
smallest level — on x86 it's a smaller instruction to load 0 into a register
than to load -1 — but more importantly, it means that the compiler can do things
like put a zero-initialized global in a zero-fill section or zero-initialize an
object with a simple memset.

+  // There is only one kind of class with data member pointers that can be zero
+  // initialized: single or multiple inheritance classes with a vfptr.

I'd put it this way:  MSVC uses 0 as the null offset if it can guarantee that
that won't point at a data member.  It can make that guarantee:
  - if the class is known to be polymorphic (because offset 0 will be the vf-table), or otherwise
  - if the class is known to have virtual bases (because offset 0 will be the vb-table)
class is known to have virtual bases (because offset 0 will be the vb-table).

The way that you've got this logic split between NullFieldOffsetIsAllOnes and
isZeroInitializable is really confusing to me.

+  if (Inheritance >= MSIM_Virtual)

Instead of doing checks like this, consider making little predicate functions:
  static bool hasVirtualBaseAdjustmentField(MSInheritanceModel);
It would remove the need for quite so many little comments everywhere.

+llvm::SmallVector<llvm::Constant *, 4>
+MicrosoftCXXABI::GetNullMemberPointerFields(const MemberPointerType *MPT) {

Please have this take a SmallVectorImpl<llvm::Constant*> by reference instead
of returning one.

+      // http://llvm.org/PR15713 : Taking a pointer to a member of a virtual
+      // base is not standard C++.  Clang will reject this in SemaCast, so we
+      // don't have to handle it here.

As covered above, when emitting a member pointer constant, you'll always
be making something in its declaring class.  This is why you can just ask
for the offset of a field instead of computing offsets all along the base path.

+  // in which case we need to skip the virtual base lookup.  If there were a
+  // vbtable, the first entry is a no-op entry that gives back the original

Grammar:  "If there is a vbtable..."

+  llvm::BasicBlock *OriginalBB;
+  llvm::BasicBlock *SkipAdjustBB;

You'll need to zero-initialize these in order to pacify compiler warnings.

+  // Add it to VBPtr.
+  llvm::Value *AdjustedBase = Builder.CreateInBoundsGEP(VBPtr, VBaseOffs);

On Win64, you need to be sure to sign-extend this.  (vb-offsets can be
negative in very particular circumstances.)

+  assert(MPT->isMemberFunctionPointer());
+  const FunctionProtoType *FPT =
+    MPT->getPointeeType()->getAs<FunctionProtoType>();

This can be castAs<>.

Otherwise, this is looking great, thanks!

John.



More information about the cfe-commits mailing list