[llvm] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #66825)

David Li via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 9 18:27:50 PDT 2023


david-xl wrote:

> > > > > > Yes, there are tradeoffs to doing this purely with whole program class hierarchy analysis vs with profiled type info, and in fact they can be complementary. For example, the profile info can indicate what order to do the vtable comparisons (i.e. descending order of hotness, as we do for vfunc comparisons in current ICP), while WP CHA can be used to determine when no fallback is required. Another advantage of doing this with profiling is that it does not require WP visibility, which may be difficult to guarantee.
> > > > > 
> > > > > 
> > > > > Gotcha, that makes sense. Are there plans on your side to extend this level of value profiling/WP CHA to AutoFDO? I'm looking into trying out the WP CHA approach on my side since it looks like there are cases it can catch in our internal workloads.
> > > > 
> > > > 
> > > > AutoFDO support is a natural follow-up for profile-gen. I'm currently working on adding more vtable comparisons with class hierarchy analysis and doing more devirtualization with type information.
> > > 
> > > 
> > > Can you elaborate on what cases your current work is targeting? I was planning on starting work to catch the following:
> > > ```
> > > class base
> > > {
> > > public:
> > >   virtual ~base() = default;
> > >   virtual int foo() = 0;
> > > };
> > > 
> > > class derive1 : public base
> > > {
> > > public:
> > >   int foo() override { return 1; /* unique implementation */ }
> > > };
> > > 
> > > class derive2 : public base
> > > {
> > > public:
> > >   int foo() override { return 2; /* unique implementation */ }
> > > };
> > > 
> > > void callee(base* b)
> > > {
> > >   b->foo(); // profile information indicates target is primarily derive2::foo()
> > > }
> > > ```
> > > 
> > > Here we can directly compare the vtable address instead of the function address. If you're already working on this case, then I don't want to step on your toes and will just wait for your changes.
> > 
> > 
> > Thanks for the clarification. Comparing vtable addresses was the first use case; I have a prototype and got the [wins mentioned above](https://github.com/llvm/llvm-project/pull/66825#issuecomment-1741534866). One test case (https://gcc.godbolt.org/z/eqvz4WxGM) is pasted into godbolt, and the auto-generated `ICALL-FUNC` and `ICALL-VTABLE` lines show the expected transformations. Besides selective vtable comparison, I'm planning to work on the [dynamic type propagation](https://github.com/llvm/llvm-project/pull/66825#issuecomment-1741560195).
> > I could send out a draft patch for the vtable comparison (and ThinLTO import of the vtable variables) and a small RFC in the next few days.
> 
> Okay, I think we're targeting different cases. My scenario is AutoFDO without value profiling: relying on branch samples and optimizing when a member function is unique to a vtable. The result is ultimately the same: we change the code to compare directly against the vtable. @teresajohnson @david-xl any objections to me starting this work?

If I understand correctly, you plan to use the existing branch profiling to do devirtualization using vtable comparison. I think this is a good extension for you to work on.
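
To make the difference between the two comparisons concrete, here is a minimal sketch in plain C++. It is not the IR the pass produces: a hand-written function-pointer table stands in for the compiler-generated vtable, and every name in it (`VTable`, `Base`, `kDerive2VTable`, `call_with_func_compare`, `call_with_vtable_compare`) is hypothetical and chosen only for illustration. It assumes, as in the scenario above, that `derive2_foo` appears in exactly one vtable.

```cpp
#include <cassert>

// Hand-rolled stand-in for a compiler-generated vtable (hypothetical layout).
struct VTable {
  int (*foo)(const void *self);
};

// Stand-in for an object whose first field is the hidden vtable pointer.
struct Base {
  const VTable *vptr;
};

static int derive1_foo(const void *) { return 1; }
static int derive2_foo(const void *) { return 2; }

static const VTable kDerive1VTable = {derive1_foo};
static const VTable kDerive2VTable = {derive2_foo};

// Function-address comparison: load the vptr, load the slot, then compare.
int call_with_func_compare(const Base *b) {
  int (*target)(const void *) = b->vptr->foo; // two dependent loads
  if (target == derive2_foo)
    return derive2_foo(b); // direct call to the hot target
  return target(b);        // indirect fallback
}

// Vtable-address comparison: load the vptr and compare it directly.
int call_with_vtable_compare(const Base *b) {
  // Assumption: derive2_foo is unique to kDerive2VTable, so matching the
  // vtable address implies the call target.
  if (b->vptr == &kDerive2VTable)
    return derive2_foo(b); // direct call to the hot target
  return b->vptr->foo(b);  // indirect fallback
}
```

Both functions return the same result for any object; the vtable form only needs the first load before the branch, which is why uniqueness of the function within a vtable matters for the branch-profile-based approach.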

The AutoFDO support Mingming mentioned is the vtable profiling part, which uses the MEM_INST_RETIRED event to capture data addresses. This data access profiling will/can also be used for global variable layout. However, this is currently Intel-only, so having a branch-profiling-based method can be useful overall.



https://github.com/llvm/llvm-project/pull/66825

