[clang] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #66825)

Mingming Liu via cfe-commits cfe-commits at lists.llvm.org
Mon Oct 9 17:07:19 PDT 2023


minglotus-6 wrote:

> > > > Yes there are tradeoffs to doing this purely with whole program class hierarchy analysis vs with profiled type info, and in fact they can be complementary. For example, the profile info can indicate what order to do the vtable comparisons (i.e. descending order of hotness, as we do for vfunc comparisons in current ICP), while WP CHA can be used to determine when no fallback is required. Also, another advantage of doing this with profiling is also that it does not require WP visibility, which may be difficult to guarantee.
> > > 
> > > 
> > > Gotcha, that makes sense. Are there plans on your side to extend this level of value profiling/WP CHA to AutoFDO? I'm looking into trying out the WP CHA approach on my side since it looks like there are cases it can catch in our internal workloads.
> > 
> > 
> > AutoFDO support is a natural follow-up for profile-gen. I'm currently working on having more vtable comparisons with class-hierarchy-analysis and do more devirtualization with type information.
> 
> Can you elaborate on what cases your current work is targeting? I was planning on starting work to catch the following:
> 
> ```
> class base
> {
>   virtual int foo() = 0;
> }
> 
> class derive1 : base
> { 
>   virtual int foo() {/*unique implementation*/};
> }
> 
> class derive2 : base
> { 
>   virtual int foo() {/*unique implementation*/};
> }
> 
> void callee(base* b)
> {
>   b->foo(); // profile information indicates target is primarily derive2::foo()
> }
> ```
> 
> Where we can directly compare vtable address instead of function address. If you're already working on this case then I don't want to step on your toes and just wait for your changes.

Thanks for clarification. Comparing vtable addresses was the first use case and I got a prototype and got [wins mentioned above](https://github.com/llvm/llvm-project/pull/66825#issuecomment-1741534866). One test case (https://gcc.godbolt.org/z/eqvz4WxGM) pasted into godbolt, and auto-generated `ICALL-FUNC` `ICALL-VTABLE` elaborates the expected transformations. Besides selective vtable comparison, I'm planning to work on the [dynamic type propagation](https://github.com/llvm/llvm-project/pull/66825#issuecomment-1741560195).

I could send out a draft patch about the vtable comparison (and thinlto import of the vtable variables) and a small RFC in the next few days.

> Is there a need to have a separate vtable name section? Merging the vtable names with function name table can make the implementation simpler.

The motivation to have a separate vtable name section at profile-gen time is to make profile-use easier. For example, in llvm-profdata.cpp around line 350 to line 356, reader gets all vtable names and add them to [indexed profile writer](https://github.com/llvm/llvm-project/blob/ac0dda894231e6281e7739aa0ea01a4e9697c747/llvm/tools/llvm-profdata/llvm-profdata.cpp#L214). A similar requirement to get all function names doesn't exist, since function names exist in `NamedInstrProfRecord`. 

The high level idea is 
1) have two new sections (vtable names, and per vtable profile data ) in the instrumented binary and change raw profile header/reader/write (per function offsets, etc) accordingly. This change comes with a version bump and less flexible to change.
2) The internal states (combining `MD5VTableMap` and `MD5FunctionMap`) have more flexibility to change (without version bump or profile header change, etc)

https://github.com/llvm/llvm-project/pull/66825


More information about the cfe-commits mailing list