[PATCH] D73242: [WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP

Eugene Leviant via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 5 00:32:18 PST 2020


evgeny777 added inline comments.


================
Comment at: llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp:1678
+    // breaks any uses on assumes.
+    if (TypeIdMap.count(TypeId))
+      continue;
----------------
tejohnson wrote:
> evgeny777 wrote:
> > tejohnson wrote:
> > > evgeny777 wrote:
> > > > tejohnson wrote:
> > > > > evgeny777 wrote:
> > > > > > I don't think, I understand this.
> > > > > > It looks like that if (a) we have type.test in the module and (b) we don't have vtable definition with corresponding type metadata in the same module then we'll remove type.test/assume sequence before even getting to ICP. This seems strange in the context of previous discussion because virtual function may be called in different module from one where vtable is defined, like so:
> > > > > > ```
> > > > > > struct A { virtual int foo() { return 42; } };
> > > > > > 
> > > > > > // calls pA->foo
> > > > > > extern int callFoo(A *pA);
> > > > > > int main() { A a; return callFoo(&a); }
> > > > > > ```
> > > > > > 
> > > > > We will only be able to do ICP this way if we have the target vtable in the module (similar to how we currently can only do ICP if the target function is in the module). I anticipate doing something similar for vtable defs as to what we do for virtual functions, where a fake call edge is added to the summary based on the value profile results to ensure they are imported before the LTO backend ICP.
> > > > > We will only be able to do ICP this way if we have the target vtable in the module (similar to how we currently can only do ICP if the target function is in the module).
> > > > 
> > > > I was thinking of slightly different approach: it seems you have required type information in combined index together with type name in the module, so you probably can add external declarations of required vtables (this may require promotion) to the module in the ICP pass and run ICP over them. Do you think this can work?
> > > Possibly, but we'd still need to have type metadata on those vtable declarations, because the type metadata provides the address point offset which is needed in the comparison sequence.
> > > Possibly, but we'd still need to have type metadata on those vtable declarations, because the type metadata provides the address point offset which is needed in the comparison sequence.
> > 
> > I think it's stored in the index in VFuncId structures. Isn't it?
> > I think it's stored in the index in VFuncId structures. Isn't it?
> 
> No, that holds the offset from the address point to the virtual function within the vtable def, not the address point offset itself, which is what we need from the type metadata. But actually, we need both offsets:
> 
> Consider the following example:
> 
> ```
> class A {
>    virtual void foo();
> }
> 
> void bar(A *a) {
>    a->foo();
> }
> ```
> 
> The indirect call sequence will look like (not showing the type test for simplicity):
> 
> ```
>  %0 = bitcast %class.A* %a to i32 (%class.A*)***
>   %vtable = load i32 (%class.A*)**, i32 (%class.A*)*** %0
>   %1 = load i32 (%class.A*)*, i32 (%class.A*)** %vtable
>   %call = tail call i32 %1(%class.A* %a)
> ```
> 
> With type profiling we will profile the value of %vtable, and figure out that it is associated with the symbol for A's vtable (the profiling support uses symbol table info for this), which is:
> 
> ```
> @_ZTV1A = dso_local unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* bitcast ({ i8*, i8* }* @_ZTI1A to i8*), i8* bitcast (i32 (%class.A*)* @_ZN1A3fooEv to i8*)] }, align 8, !type !0
> 
> !0 = !{i64 16, !"_ZTS1A"}
> ```
> 
> When we promote the call using the vtable profile results, we need to add the address point offset (16) from the type metadata to the profiled result which will be the symbol @_ZTV1A, before comparing against %vtable.
> 
> We then need to look at the IR and compute the offset to the function address used there (0 in the above IR), and add it to the address point offset, resulting in 16 here, looking in the vtable def to see what function is at the combined offset (_ZN1A3fooEv here), so that we insert a direct call to that function.
> 
> For the address point offset, we only have it in the summary for index-only WPD (TypeIdCompatibleVtableMap). However, I don't want to restrict the ICP improvements to that build mode.
> 
> For the latter, the VFuncId does summarize the offsets, but we can compute it from the IR as I described above (it's the amount added to %vtable before loading the virtual function pointer) and don't need the summary. And in any case the VFuncId doesn't tell us which function is at the aggregate offset, we need the vtable def to tell us that (or the vTableFuncs array saved on the GlobalVarSummary, again only in index-only WPD mode).
> 
> Also, I believe we can support this type profiling based ICP in non-ThinLTO mode as well - with my recent changes to make WPD rely on vcall visibility info, we should be able to insert both the type tests and the type metadata into the code stream in non-ThinLTO mode. In that case, similar to the way ICP currently behaves for target functions, the def would have to happen to already be in the current module for the vtable compare promotion to be occur. 
> 
> For the above reasons, I'd prefer to rely on the type metadata on vtable defs, with the defs imported as much as possible in ThinLTO mode.
Okay, got it. One question: why do you insist on removing such type.test/assume sequences here, instead of simply ignoring them in ICP and eliminating them altogether in LowerTypeTests?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73242/new/

https://reviews.llvm.org/D73242





More information about the cfe-commits mailing list