[PATCH] D51905: Front-end of the implementation of the interleaving algorithm

Mon Apr 15 05:58:59 PDT 2019

zhaomo added a comment.

Addressed @rsmith's comments.

================
Comment at: clang/include/clang/Driver/CC1Options.td:356-357
+    HelpText<"Enable VTable interleaving">;
+def disable_vtable_interleaving : Flag<["-"], "disable-vtable-interleaving">,
+    HelpText<"Disable VTable interleaving">;
 //===----------------------------------------------------------------------===//
----------------
rsmith wrote:
> We usually only expose the non-default flag in `-cc1`, so that there are no ordering concerns and we can determine whether a feature is enabled with just `hasArg`. Also, `-fvtable-interleaving` would seem like a more natural flag name to me.
Changed it in the latest patch. 

================
Comment at: clang/include/clang/Frontend/CodeGenOptions.h:274
+  /// the interleaving layout is decided.
+  bool EnableVTableInterleaving;
+
----------------
rsmith wrote:
> `Enable` here seems redundant. Is thee a reason to declare this explicitly rather than adding it to CodeGenOptions.def?
Changed it in the latest patch.

================
Comment at: clang/lib/CodeGen/ItaniumCXXABI.cpp:973-976
+    // The name of a offset global variable has the format "__[type
+    // id]$[offset]".
+    std::string Name =
+        "__" + RD->getNameAsString() + "$" + std::to_string(Offset);
----------------
rsmith wrote:
> The `__`s here are atypical. If you want a name that can't collide with a name generated by the frontend, putting a period in it is the usual strategy. Also, a name prefix that indicates what this global is for would make the IR more readable and make this less likely to collide with other things. Maybe `clang.vtable.index.<CLASS>.<OFFSET>` or similar?
> 
> Also, there's no guarantee that `RD->getNameAsString()` produces something unique within the translation unit, so you'll need to make sure that name collisions are handled somehow (I *think* `GlobalVariable` already deals with this for you, but I'm not sure... however, I think its strategy is to add a number to the end of the mangling, which will lead to some quite peculiar results given that you already have a number there). It might be easier to base the name of this symbol on the mangled name of the class; then you could give the global variable the same linkage as the class, which might save your interleaving pass a little bit of work later on as the duplicates will already have been combined -- but that's up to you, and I don't have enough of the big picture here to give you advice.
The encoding of the name of such a global variable is just for debugging purposes. The back-end pass does not care about the name. Instead, it relies on the metadata associated with the global variable, and the metadata does not rely on `RD->getNameAsString() `. Also, the back-end pass does not care how many global variables associated with the same metadata.

I did not know that it is using periods is a convention in LLVM. I changed the name to `clang.vtable.offset.<CLASS>.<OFFSET>`.

================
Comment at: clang/lib/CodeGen/ItaniumCXXABI.cpp:987
+  }
+  return llvm::ConstantExpr::getPtrToInt(OffsetGV, OffsetType);
+}
----------------
rsmith wrote:
> Representing this offset as a `ptrtoint` on a global variable seems really strange. Is there really no better way to model this in IR? (I mean, if not, then carry on as you are, but this seems like a hack.)
The global variable we create for each referenced vtable offset is actually a zero-length array, and we use the address of the array as the placeholder for a referenced vtable offset. I was told that this is a common trick used in LLVM. 

@pcc: Peter, would you provide more information here? 

================
Comment at: clang/lib/CodeGen/ItaniumCXXABI.cpp:1767

+  bool InBounds = shouldInterleaveVTables(VTableClass) ? false : true;
   return llvm::ConstantExpr::getGetElementPtr(VTable->getValueType(), VTable,
----------------
rsmith wrote:
> zhaomo wrote:
> > pcc wrote:
> > > Remind me why this needs to be here? (And the explanation needs to be in a comment.)
> > The calculation of address point is essentially base_vtable_addr + offset, where offset is from the indices of gep.
> > In the interleaving pass, we replace base_vtable_addr with (addr_point_in_interleaved_layout - offset).
> > 
> > The LLVM language reference says that the base address of a inbounds gep must be an in bound address of the object. The new base address addr_point_in_interleaved_layout - offset, however, may not be an in bound address.
> I still find the need for this confusing. Suppose we have:
> 
> ```
> struct A { virtual void f(); };
> struct B { virtual void g(); virtual void h(); };
> struct C : A, B {};
> ```
> 
> such that `C` has a naive vtable containing 7 elements {{offset, typeid, &A::f}, {offset, typeid, &B::f, &B::h}}, with address point indexes (0, 2) and (1, 2). Here we emit a `gep inbounds @vtable, 0, 1, 2` to get the B-in-C vtable. Now, if this were instead emitted as `gep inbounds (cast @vtable to i8*), 24`, and the interleaving process were replacing @vtable with a non-`inbounds` GEP on its interleaved ubervtable, I could understand the need to drop `inbounds` here -- because the replacement for @vtable might be before the start of the ubervtable. But based on my reading of the algorithm description, that's not what happens: first we split `@vtable` into multiple individual vtables, and that splitting process presumably must identify these GEPs to figure out which slice of the vtable they need. That presumably either removes this GEP entirely, or leaves this being a correct and trivial `inbounds` GEP on a split-out piece of the vtable. (I'm imagining we effectively rewrite `gep inbounds @vtable, 0, 1, 2` as `@vtable-slice-2` as part of the splitting.) And then after interleaving, I'm imagining we replace all uses of `@vtable-slice-2` with the appropriate address point in the ubervtable, which is again an `inbounds` GEP (but that would be a new GEP, unrelated to this one).
> 
> That is, if you need this change, it seems like that indicates a bug in the vtable splitter. But maybe I'm misunderstanding how this all fits together.
GlobalSplit, the pass that splits vtables, relies on `inrange` instead of `inbounds`. 

>From the language reference:

> If the inrange keyword is present before any index, loading from or storing to any pointer derived from the getelementptr has undefined behavior if the load or store would access memory outside of the bounds of the element selected by the index marked as inrange. 

`inbounds` means something different. Again from the language reference:

> If the inbounds keyword is present,** the result value of the getelementptr is a poison value if the base pointer is not an in bounds address of an allocated object**, or if any of the addresses that would be formed by successive addition of the offsets implied by the indices to the base address with infinitely precise signed arithmetic are not an in bounds address of that allocated object.

We need to drop `inbounds` because that the **base pointer** may not be in bounds address of an interleaved vtable.

Here is how an address point is calculated in an original vtable. `original_address_pt` = `start_original_vtable` + `original_offset`, where `start_original_vtable` is the address of the original vtable and `original_offset` is the offset of the address point in the original vtable.

Our interleaving pass calculates the address of this address point in the interleaved object. Let's call it `new_address_pt` and `new_address_pt` = `start_interleaved_vtable` + `new_offset`.
Then we replace `start_original_vtable` with `(new_address_pt - original_offset)`. Because `start_original_vtable` is the **base pointer** of the `gep`, and `new_offset` may be smaller than `original_offset`, we have to drop inbounds. 

> I'm imagining we effectively rewrite gep inbounds @vtable, 0, 1, 2 as @vtable-slice-2 
gep inbounds @vtable, 0, 1, 2 is rewritten to gep inbounds @vtable-slice-2, 0, 2 in your example.

================
Comment at: clang/lib/CodeGen/ItaniumCXXABI.cpp:1832
+  auto *RecordDecl = MethodDecl->getParent();
+  llvm::Value *VTable = CGF.GetVTablePtr(This, CGM.Int8PtrTy, RecordDecl);

----------------
rsmith wrote:
> Do you really need to change this to load the vptr at the "wrong" type and then cast it? (Could the cast, if necessary at all, live in `EmitVTableTypeCheckedLoad` instead?)
I have to because both branches of the if statement rely on `OffsetConstant`, which is a byte offset.

================
Comment at: clang/lib/CodeGen/ItaniumCXXABI.cpp:3618-3622
+  if (!CXXABI.shouldInterleaveVTables(RD)) {
+    uint64_t UpperBits = (uint64_t)Offset << 8;
+    uint64_t OffsetFlags = UpperBits | LowerBits;
+    return llvm::ConstantInt::get(OffsetFlagsTy, OffsetFlags);
+  }
----------------
rsmith wrote:
> Shouldn't you also do this if the base is non-virtual? In that case,`Offset` is an offset within the class layout, not an index into the vtable.
You are right. This bug is fixed in the latest patch.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D51905/new/

https://reviews.llvm.org/D51905