[clang] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #66825)

Teresa Johnson via cfe-commits cfe-commits at lists.llvm.org
Mon Oct 2 11:07:07 PDT 2023


teresajohnson wrote:

> > Interesting work! On my side I was thinking about a different approach using existing profiling and the whole program devirtualization (WPD) framework:
> > 
> > 1. Existing profiling identifies what targets an indirect call goes to
> > 2. With WPD information, we can identify whether the targets are member functions
> > 3. If this is a member function unique to a single vtable then we can elide the second load and be functionally equivalent
> > 
> > Directly value profiling the vtable captures more opportunities, namely cases where the member function exists in more than one vtable. However, getting that information in sample profiling is trickier, which makes the previous approach more appealing.
> 
> Thanks for sharing thoughts!
> 
> I read a design doc from @teresajohnson that compares these two options; the doc points out that type instrumentation can capture type distributions more accurately (at the cost of instrumentation runtime overhead).
> 
> For example, `func` could be attributed to more than one class in the following class hierarchy, even if only one class is hot at runtime. IR instrumentation shows that there is only one hot type, so the compiler could insert a single vtable comparison (and fall back to the original "load func -> call" path). If the type information is instead inferred backwards from virtual functions, additional comparisons for non-hot types might be inserted in a hot code region.
> 
> ```
> class Base {
>    public:
>      virtual void func();
> };
> 
> class Derived : public Base {
>    public: 
>    // inherit func() without overriding
> };
> ```

Yes, there are tradeoffs to doing this purely with whole program class hierarchy analysis (WP CHA) vs. with profiled type info, and in fact the two can be complementary. For example, the profile info can indicate what order to do the vtable comparisons in (i.e. descending order of hotness, as we do for vfunc comparisons in current ICP), while WP CHA can be used to determine when no fallback is required. Another advantage of doing this with profiling is that it does not require WP visibility, which may be difficult to guarantee.
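
To make the complementary use concrete, here is a minimal sketch of how the two sources of information could be combined. It is a standalone model, not LLVM code: `VTableCount`, `PromotionPlan`, and `planPromotion` are hypothetical names, and the vtable symbol names in the usage note are only illustrative. The profile decides the comparison order, and the CHA set (when available) decides whether the fallback indirect call can be dropped.

```cpp
#include <algorithm>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical record: a vtable observed at one call site and its profiled count.
struct VTableCount {
  std::string vtable;   // vtable symbol name (illustrative)
  std::uint64_t count;  // number of times the profile saw this vtable
};

// Hypothetical result: which vtables to compare against, and in what order.
struct PromotionPlan {
  std::vector<std::string> compare_order;  // hottest vtable first
  bool needs_fallback;                     // keep the original "load func -> call" path?
};

// whole_program_vtables is the set of every vtable that can reach the call
// site according to WP CHA, or nullptr when WP visibility is not available.
PromotionPlan planPromotion(std::vector<VTableCount> profile,
                            const std::set<std::string>* whole_program_vtables) {
  // The profile determines the order: descending hotness.
  std::stable_sort(profile.begin(), profile.end(),
                   [](const VTableCount& a, const VTableCount& b) {
                     return a.count > b.count;
                   });

  PromotionPlan plan;
  plan.needs_fallback = true;  // conservative default without CHA

  std::set<std::string> covered;
  for (const VTableCount& entry : profile) {
    if (entry.count == 0) continue;  // never observed, not worth a comparison
    plan.compare_order.push_back(entry.vtable);
    covered.insert(entry.vtable);
  }

  // CHA determines whether the fallback can be elided: only when every
  // possible vtable is already handled by one of the comparisons.
  if (whole_program_vtables != nullptr) {
    plan.needs_fallback =
        !std::includes(covered.begin(), covered.end(),
                       whole_program_vtables->begin(),
                       whole_program_vtables->end());
  }
  return plan;
}
```

For the `Base`/`Derived` hierarchy quoted above, calling `planPromotion({{"_ZTV4Base", 10}, {"_ZTV7Derived", 990}}, &cha)` with `cha` containing both vtables puts `_ZTV7Derived` first and needs no fallback. Passing `nullptr` instead of `&cha` keeps the same order but retains the fallback. A real implementation would also apply hotness thresholds before promoting, which this sketch leaves out.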

https://github.com/llvm/llvm-project/pull/66825

