[PATCH] D73242: [WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP

Teresa Johnson via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jan 30 06:49:54 PST 2020


tejohnson marked an inline comment as done.
tejohnson added a comment.

In D73242#1847051 <https://reviews.llvm.org/D73242#1847051>, @evgeny777 wrote:

> > This is an enabler for upcoming enhancements to indirect call promotion, for example streamlined promotion guard sequences that compare against vtable address instead of the target function
>
> Can you please describe the whole approach in more detail? At the moment ICP is capable of (a sort of) devirtualization by replacing an indirect vtbl call with a sequence of function address comparisons and direct calls.
>  Are you going to speed this up by comparing vtable pointers instead of function pointers (thus eliminating a single load per vtbl call), or is there also something else in mind?


That's exactly what we want to do here. We found that a relatively significant number of cycles are spent on virtual function pointer loads in these sequences; doing a vtable comparison instead moves that load off the critical path. I prototyped something like this in ICP a while back and saw a speedup in an important app.
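
To make the difference between the two guard sequences concrete, here is a minimal source-level sketch. It is not what ICP emits: the vtable is hand-rolled as a struct of function pointers, and every name in it (VTable, Object, circleGet, callWithFunctionGuard, callWithVTableGuard) is hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Hand-rolled stand-in for a compiler-generated vtable (illustration only).
struct VTable {
  int (*get)(const void *self);
};

struct Object {
  const VTable *vptr;  // plays the role of the hidden vtable pointer
  int payload;
};

int circleGet(const void *self) {
  return static_cast<const Object *>(self)->payload;
}

int squareGet(const void *self) {
  return 2 * static_cast<const Object *>(self)->payload;
}

const VTable CircleVTable = {circleGet};
const VTable SquareVTable = {squareGet};

// Guard on the target function: the function pointer has to be loaded
// from the vtable before the comparison can execute.
int callWithFunctionGuard(const Object *obj) {
  assert(obj && obj->vptr);
  const VTable *vt = obj->vptr;       // load 1: vtable pointer
  int (*fn)(const void *) = vt->get;  // load 2: function pointer
  if (fn == circleGet)
    return circleGet(obj);            // promoted direct call
  return fn(obj);                     // fallback indirect call
}

// Guard on the vtable address: the comparison needs only the vtable
// pointer; the function pointer is loaded only on the fallback path.
int callWithVTableGuard(const Object *obj) {
  assert(obj && obj->vptr);
  const VTable *vt = obj->vptr;       // load 1: vtable pointer
  if (vt == &CircleVTable)
    return circleGet(obj);            // promoted direct call
  return vt->get(obj);                // fallback: load 2 + indirect call
}
```

Both functions return the same result for any object; the only difference is whether the second load sits in front of the comparison or behind it.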

> If that's true, what's the next step? Make ICP pass analyze type test intrinsics?

There are a few ways to generate the alternate ICP compare sequences. One uses statically available information from the vtable definitions in the module that reference the profiled target; this relies on ThinLTO importing all the relevant vtable definitions. The other is to profile vtable addresses with FDO (not just the target function pointer). I have the type profiling implemented, but it needs some cleanup before I send it for review. Both approaches need the type tests to determine the correct address point offset (the offset in the type test) to use in the compare sequence. In both cases you also want to weigh the number of comparisons each kind of guard requires to decide whether a vtable compare or a target function compare is better. For example, if many vtable definitions use a hot target, a single target function comparison is likely better.
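
The trade-off can be sketched roughly as follows. This is only an illustration of the reasoning, not code from the patch: GuardKind, chooseGuard, guardAddress, and the maxVTableCompares threshold are all hypothetical names and assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class GuardKind { VTableCompare, FunctionCompare };

// Hypothetical cost model: a vtable guard needs one comparison per vtable
// that contains the hot target, while a function guard needs only one.
// maxVTableCompares is an assumed tuning threshold, not a real LLVM option.
GuardKind chooseGuard(std::size_t numVTablesUsingTarget,
                      std::size_t maxVTableCompares = 1) {
  if (numVTablesUsingTarget == 0)
    return GuardKind::FunctionCompare;  // no vtable information available
  return numVTablesUsingTarget <= maxVTableCompares
             ? GuardKind::VTableCompare
             : GuardKind::FunctionCompare;
}

// The value to compare the loaded vtable pointer against is the vtable
// symbol's address plus the address point offset taken from the type test.
std::uintptr_t guardAddress(std::uintptr_t vtableSymbolAddr,
                            std::uint64_t addressPointOffset) {
  return vtableSymbolAddr + static_cast<std::uintptr_t>(addressPointOffset);
}
```

With the default threshold, a hot target that appears in exactly one vtable gets a vtable compare, while a target shared by several vtables falls back to a single function compare.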



================
Comment at: llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp:1660
+        cast<MetadataAsValue>(CI->getArgOperand(1))->getMetadata();
     // If we found any, add them to CallSlots.
     if (!Assumes.empty()) {
----------------
evgeny777 wrote:
> This change seems to be unrelated
It is needed to have the TypeId available outside this if statement (see the map check below).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73242/new/

https://reviews.llvm.org/D73242




