[PATCH] D145516: [Inliner] Avoid excessive inlining through devirtualised calls

Jeremy Morse via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 28 08:51:31 PDT 2023


jmorse added a comment.
Herald added a subscriber: hoy.

> I believe something along the lines of "F has a ref edge to NewCallee and NewCallee has a ref edge to itself (or maybe is in a non-trivial RefSCC?)", then this can occur? we want to be as precise as possible limiting when we add the inline cost penalty

I think that can be defeated by adding even more indirection to the calls, as in the example below, although a solution to some cases is better than none. Consider:

  define internal void @rec_call_to_longname(ptr %longname, ptr %other) {
    call void %longname(ptr %other, ptr %longname)
    ret void
  }
  
  define internal void @longname(ptr %0, ptr %1) {
    call void @extern1()
    call void @extern1()
    call void @extern1()
    call void %0(ptr %1, ptr %0)  ; Exponential growth occurs between these two devirt calls.
    call void %0(ptr %1, ptr %0)
    ret void
  }
  
  define internal void @a() {
    call void @rec_call_to_longname(ptr @longname, ptr @rec_call_to_longname)
    ret void
  }
  
  [... rest of reproducer...]

This too is over-inlined and gets the multiplier added with the patch as it is.
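
To make the growth concrete, here is a toy model of the reproducer above (not the real inliner, which is bounded by cost thresholds rather than a depth cap). It assumes every indirect call has already been devirtualised, so `rec_call_to_longname` contains one call to `longname`, and `longname` contains three `extern1` calls plus two calls back to `rec_call_to_longname`. The function name `inline_fully` and the `max_depth` parameter are made up for illustration.

```python
def inline_fully(max_depth):
    """Toy model: inline devirtualised calls until max_depth is reached.

    Returns (extern1_calls, remaining_call_sites) for function @a.
    """
    # Function bodies after devirtualisation, as lists of callee names.
    bodies = {
        "rec_call_to_longname": ["longname"],
        "longname": ["extern1", "extern1", "extern1",
                     "rec_call_to_longname", "rec_call_to_longname"],
    }
    # @a starts with a single call to rec_call_to_longname.
    pending = [("rec_call_to_longname", 0)]
    extern_calls = 0
    remaining = 0
    while pending:
        callee, depth = pending.pop()
        if callee == "extern1":
            extern_calls += 1   # external declaration: never inlined
        elif depth >= max_depth:
            remaining += 1      # depth budget exhausted: call site stays
        else:
            # "Inline" the callee: its calls become calls in @a.
            pending.extend((c, depth + 1) for c in bodies[callee])
    return extern_calls, remaining


for depth in (2, 4, 6, 20):
    print(depth, inline_fully(depth))
```

In this model every two extra levels of inlining double the number of outstanding call sites, which is the doubling marked by the comment in `@longname`.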

I'm starting to get the feeling that we can't actually identify / distinguish these recursive cases because LLVM currently tracks whether an edge is a Call or a Ref, but not both. Specifically:

1. Every function vulnerable to the recursion must contain a ref to the recursive function, otherwise the recursion happens at some other stage of inlining,
2. The recursive-function must be devirtualised, directly called and inlined,
3. But leave a reference to the recursive-function in the vulnerable function.
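
A minimal sketch of why points 2 and 3 are hard to observe together, assuming a callgraph that stores exactly one edge kind per (caller, target) pair and lets a call take precedence over a ref. The `Node` class and its methods are hypothetical and only model that assumption; they are not LLVM's API.

```python
from enum import Enum


class EdgeKind(Enum):
    CALL = "call"
    REF = "ref"


class Node:
    """Hypothetical callgraph node: one edge kind per target."""

    def __init__(self, name):
        self.name = name
        self.edges = {}  # target name -> EdgeKind

    def add_edge(self, target, kind):
        # Assumption: an existing call edge absorbs any later ref.
        if self.edges.get(target) is EdgeKind.CALL:
            return
        self.edges[target] = kind

    def has_ref_edge(self, target):
        return self.edges.get(target) is EdgeKind.REF


a = Node("a")
a.add_edge("rec_call_to_longname", EdgeKind.CALL)
a.add_edge("longname", EdgeKind.REF)

# After devirtualisation: @a calls @longname directly, but still
# passes @longname as an argument, so a reference remains in the IR.
a.add_edge("longname", EdgeKind.CALL)
a.add_edge("longname", EdgeKind.REF)  # leftover reference is not recorded

print(a.edges["longname"], a.has_ref_edge("longname"))
```

Under that assumption the leftover reference from point 3 is invisible once the direct call from point 2 exists, so a query for "has a ref edge" cannot tell the two situations apart.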

Retaining the reference to the recursive-function is necessary for the function to be vulnerable, so that more calls to the recursive-function can be generated, devirtualised and inlined, whereas general devirtualisation Should (TM) eliminate the reference. Here's the callgraph for the reproducer:

  Edges in function: extern1
                                                   
  Edges in function: rec_call_to_longname                 
    ref  -> rec_call_to_longname                            
                                                                 
  Edges in function: longname                      
    ref  -> longname  
                                                   
  Edges in function: e                             
    call -> d                                              
                                                   
  Edges in function: d                                      
    call -> c                                                    
                                                   
  Edges in function: c                  
    call -> b                                      
                                                   
  Edges in function: b                                     
    call -> a                                      
                                                   
  Edges in function: a                                    
    call -> rec_call_to_longname                            
    ref  -> longname                                             
                                                   
  RefSCC with 1 call SCCs:                         
    SCC with 1 functions:                          
      rec_call_to_longname                         
                                                           
  RefSCC with 1 call SCCs:                      
    SCC with 1 functions:                          
      longname                                            
                                                   
  RefSCC with 1 call SCCs:                                  
    SCC with 1 functions:                                        
      a                                            
                                                   
  [similar RefSCCs repeated for b, c, d, e]
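
As a sanity check on the dump above, the following sketch transcribes the printed edges into a dictionary, groups functions into RefSCCs by mutual reachability over all edges, and evaluates the quoted "F has a ref edge to NewCallee and NewCallee has a ref edge to itself" condition. The helper names (`reachable`, `ref_sccs`, `reviewer_check`) are made up here, and treating an edge as exactly one of "call" or "ref" is an assumption taken from the dump's format.

```python
# Edges transcribed from the dump: caller -> {target: kind}.
EDGES = {
    "extern1": {},
    "rec_call_to_longname": {"rec_call_to_longname": "ref"},
    "longname": {"longname": "ref"},
    "e": {"d": "call"},
    "d": {"c": "call"},
    "c": {"b": "call"},
    "b": {"a": "call"},
    "a": {"rec_call_to_longname": "call", "longname": "ref"},
}


def reachable(start):
    """All functions reachable from start via call or ref edges."""
    seen, stack = set(), [start]
    while stack:
        for nxt in EDGES[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def ref_sccs():
    """Group functions that can reach each other (toy RefSCCs)."""
    reach = {n: reachable(n) | {n} for n in EDGES}
    return {
        frozenset(m for m in EDGES if m in reach[n] and n in reach[m])
        for n in EDGES
    }


def reviewer_check(f, new_callee):
    """Quoted condition: F refs NewCallee and NewCallee refs itself."""
    return (EDGES[f].get(new_callee) == "ref"
            and EDGES[new_callee].get(new_callee) == "ref")


print(sorted(sorted(s) for s in ref_sccs()))
print(reviewer_check("a", "longname"))
print(reviewer_check("a", "rec_call_to_longname"))
```

On this transcription every RefSCC contains a single function, and the condition holds for `longname` but not for `rec_call_to_longname`, because the edge from `a` to the latter is recorded as a call.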

If you compute a new callgraph while inlining is in progress and print it after every inline, there don't appear to be any RefSCCs with more than one SCC. Instead, each function [a-e) switches from having a call to `rec_call_to_longname` and a ref to `longname`, to having calls to both because the ref gets devirtualised, and finally the function gets fully inlined into the caller. I think in an ideal world I'd like to identify the above scenario and then annotate further calls in the devirtualised+inlined function with a cost multiplier.
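
One way the "identify the above scenario" step might look, purely as a sketch: compare a function's edges before and after an inline and report targets whose edge flipped from ref to call. The function `devirtualised_targets` and the snapshot dictionaries are hypothetical; nothing like this exists in the patch.

```python
def devirtualised_targets(before, after):
    """Targets whose edge kind changed from 'ref' to 'call'."""
    return sorted(
        target for target, kind in after.items()
        if kind == "call" and before.get(target) == "ref"
    )


# Edges of @a before and after rec_call_to_longname is inlined into it,
# following the transition described above.
before = {"rec_call_to_longname": "call", "longname": "ref"}
after = {"rec_call_to_longname": "call", "longname": "call"}

print(devirtualised_targets(before, after))
```

Calls to the reported targets would be the candidates for a cost multiplier; how to carry that annotation through later inlining is the part that remains open.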

I'm not quite sure how to implement that right now, but does the above rationale make sense? (I can dig further if it's sound).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145516/new/

https://reviews.llvm.org/D145516


