[PATCH] D93838: [SCCP] Add Function Specialization pass

Thu May 20 08:53:29 PDT 2021

davidxl added a comment.

In D93838#2770951 <https://reviews.llvm.org/D93838#2770951>, @JonChesterfield wrote:

> In D93838#2770233 <https://reviews.llvm.org/D93838#2770233>, @davidxl wrote:
>
>> My question is: other than MCF, do we have other real-world apps that can benefit from this optimization (in a way that cannot be achieved by the inliner)?
>
> An alternative perspective. An inliner does two things. It elides call setup/return code, and it specialises the function on the call site. These can be, and probably should be, separable concerns.
>
> Today we inline very aggressively, which is suboptimal for platforms with code size (or cache) restrictions, but does give the call site specialisation effect. So this patch, today, needs a large enough function to avoid being specialised by the inliner to see a benefit. Examples will be things like qsort or large switch statements on a parameter.

The benefit of inlining comes from many different areas:

1. call overhead reduction (call, pro/epilogue)
2. inline instance body cleanup with callsite contexts (this is what specialization can get)
3. cross procedure boundary optimizations:
   3.1) PRE, jump threading etc. between caller body and inline instances
   3.2) cross function optimization between sibling calls (sibling inline instances)
   3.3) better code layout of inline instance body with enclosing call context ..
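
To make the difference between these items concrete, here is a minimal hand-written sketch. It is not taken from the patch: `Op`, `apply`, `apply_specialized_mul`, and `sum_of_products` are hypothetical names, and the clone is written by hand only to stand in for what a specialization pass might produce for call sites that always pass the same constant.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example; none of these names come from D93838.
enum class Op { Add, Mul, Xor };

// Generic version: the switch on `op` is evaluated on every call.
std::int64_t apply(Op op, std::int64_t a, std::int64_t b) {
  switch (op) {
  case Op::Add: return a + b;
  case Op::Mul: return a * b;
  case Op::Xor: return a ^ b;
  }
  return 0; // not reached for valid Op values
}

// Hand-written stand-in for a clone specialized on op == Op::Mul.
// With the argument known, the switch folds away (benefit 2 above).
std::int64_t apply_specialized_mul(std::int64_t a, std::int64_t b) {
  return a * b;
}

// A caller that always passes Op::Mul can be redirected to the clone.
// The call itself remains, so benefit 1 (call overhead) is not obtained,
// and the loop body is not optimized together with the callee (benefit 3).
std::int64_t sum_of_products(const std::int64_t *xs, const std::int64_t *ys,
                             int n) {
  assert(n >= 0 && "length must be non-negative");
  std::int64_t total = 0;
  for (int i = 0; i < n; ++i)
    total += apply_specialized_mul(xs[i], ys[i]);
  return total;
}
```

The clone is shared by every call site that passes `Op::Mul`, so the size cost is paid once per specialized constant rather than once per call site, which is the trade-off against inlining under discussion here.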

This is why, with PGO, we set a very large inline threshold for hot callsites.

For function specialization with PGO, we can use profile data to do function cloning selectively, but then again, a callsite that hot is very likely better off being inlined.

I agree function specialization has its place when size is the concern (with -Os), or when the instruction working set is large enough to hurt performance. We don't have a mechanism to analyze the latter yet.

> With a specialisation pass in tree we can start backing off the inliner. Calling conventions do have overhead, but I suspect the majority of the performance win of inlining is from the specialisation. If that intuition is sound, then this plus a less aggressive inliner will beat the status quo through better icache utilisation. Performance tests of Os may validate that expectation.

https://reviews.llvm.org/D93838