[llvm-dev] Higher level program analysis

preejackie via llvm-dev llvm-dev at lists.llvm.org
Fri Mar 29 09:02:51 PDT 2019


Hi Bekket, please see the comments inline.

On 29/03/19 3:58 AM, Bekket McClane wrote:
> Hi,
>
> David:
> Good point, it will be interesting to see speculative compilation in 
> this context applied to devirtualization, with high-level (type) 
> information if applicable.
>
>> On Mar 28, 2019, at 2:35 PM, preejackie <praveenvelliengiri at gmail.com 
>> <mailto:praveenvelliengiri at gmail.com>> wrote:
>>
>> Hi David & Bekket,
>>
>> Thanks for your replies :)
>>
>> David, the intent here is: the ORC v2 JIT APIs have initial (basic) 
>> support for concurrent compilation on background compile threads. This 
>> means we can speculatively compile functions before they are even 
>> referenced by the clients. In the future, if a client refers to a 
>> function that has already been compiled speculatively, we can just 
>> return the address of the function; if this is done correctly, we can 
>> reduce JIT compilation latencies. IMO, the candidate functions to 
>> speculate on are those which will be executed next. We can 
>> use *program analysis and dynamic profiling* to find such functions.
>>
>> Three approaches are possible:
>>
>>  1. Based on analysis of call graphs, select functions to compile
>>     speculatively with less aggressive optimization flags on
>>     dedicated background compiler threads.
>>  2. Since it is a JIT, with the help of dynamic profiling we can find
>>     frequently executed functions and *re*compile them with more
>>     aggressive optimization on a compile thread.
>>  3. With profile-guided optimization results from previous executions,
>>     we can find functions that are likely to be called next and
>>     *compile them with aggressive optimization*. [PGO profiles are app dependent]
>>
>> For cases 1 and 2, profile-guided optimization results are not used. I 
>> hope these techniques collectively improve program execution time in 
>> the long run. Of course, program-based prediction does not match the 
>> accuracy of profile-based prediction, but in a JIT it is useful to 
>> compile functions speculatively first, using multiple threads.
>>
> I’m a little bit lost. From what I understand, you’re arguing 
> that with profiling data (collected at runtime), we can make 
> more precise predictions about the callee candidates executed next. But in 
> the baseline (i.e. first-time execution) stage, this profile data 
> doesn’t exist, so you want to improve the prediction at that stage with 
> the help of some static analysis.
> Am I understanding correctly?

Not exactly. My thought is to make speculative compilation available for 
both the non-PGO and PGO cases.

Non-PGO case:

    llvm-IR ----> ORC ----> Codegen + link ----> raw executable bits

    ORC uses program analysis information to guess which functions to 
    *compile* next.

PGO case:

    llvm-IR ----> ORC ----> Codegen + link ----> raw executable bits

    ORC uses the profile data to find functions to compile next, and to 
    optimize hot functions aggressively.

This way, we don't force JIT clients to use PGO to get the benefits of 
speculative compilation.
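To make the compile-ahead idea above concrete: below is a minimal, LLVM-free sketch using plain std::async as a stand-in for ORC's background compile threads. All the names here (SpeculativeJIT, speculateOn, lookup, the fake "address") are invented for illustration; the real ORC v2 APIs are quite different.

```cpp
#include <future>
#include <map>
#include <mutex>
#include <string>

// Toy stand-in for a JIT'd function's address.
using CompiledFn = int;

class SpeculativeJIT {
  std::map<std::string, std::shared_future<CompiledFn>> Compiled;
  std::mutex M;

  // Placeholder for real codegen: "compiles" a function by name.
  static CompiledFn compileImpl(const std::string &Name) {
    return static_cast<CompiledFn>(Name.size()); // fake address
  }

  // Find an in-flight/finished compile, or launch one on a
  // background thread. Caller must hold the lock.
  std::shared_future<CompiledFn> &getOrStart(const std::string &Name) {
    auto It = Compiled.find(Name);
    if (It == Compiled.end())
      It = Compiled.emplace(
        Name,
        std::async(std::launch::async,
                   [Name] { return compileImpl(Name); }).share()).first;
    return It->second;
  }

public:
  // Speculation: start compiling Name before any client references it.
  void speculateOn(const std::string &Name) {
    std::lock_guard<std::mutex> L(M);
    getOrStart(Name);
  }

  // Client lookup: if speculation already finished, this returns
  // immediately; otherwise it waits for (or starts) the compile.
  CompiledFn lookup(const std::string &Name) {
    std::shared_future<CompiledFn> F;
    {
      std::lock_guard<std::mutex> L(M);
      F = getOrStart(Name);
    }
    return F.get(); // don't hold the lock while waiting
  }
};
```

The point of the sketch is only the shape of the win: if speculateOn() guessed right, lookup() finds a ready future and pays no compile latency on the client's thread.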

>> I have considered the CFG as a higher-level program representation; I 
>> may be wrong here.
>>
>> For example:
>>
>>     void f2() {}
>>     void f3() {}
>>
>>     void z() {
>>       if (/* some condition */) {
>>         f2(); f3();
>>       } else {
>>         fn();
>>       }
>>     }
>>
>> Follow the control flow of z and compute the probability that a 
>> path [entry to exit] through z leads to a call to f2; if the call 
>> to f2 occurs on many paths, then the probability that it will execute 
>> next is high. This will require some control-flow analysis.
>>
> Following my comment above, I think you have some misunderstandings about 
> the power of static analysis in LLVM: LLVM of course can eliminate 
> trivial control-flow structures like
> if (true) {
>   // do something
> }
> or
> if (1 + 2 < 6)
> For these cases, you don’t even need to analyze the probability of 
> those branches, because LLVM will just eliminate and optimize the entire 
> control structure.
>
> But other than that, it’s really hard for LLVM to evaluate the 
> probability of certain branches statically without help from (dynamic) 
> data like profiling data, or programmer annotations like 
> __builtin_expect.
> The closest analyses I can come up with are probably LazyValueInfo and 
> ScalarEvolution. You can see if they fit your needs.
>
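(For reference, the annotation mentioned above looks like the following; a minimal sketch, nothing LLVM-specific, and the function is invented for illustration:)

```c
#include <stddef.h>

/* __builtin_expect(expr, expected) is a GCC/Clang hint telling the
 * compiler which outcome is likely, so it can lay out the hot path
 * first -- a static stand-in for a small part of what profile data
 * provides. It does not change the function's behavior. */
int process(const int *p) {
  if (__builtin_expect(p == NULL, 0)) /* "p is almost never NULL" */
    return -1;                        /* cold path */
  return *p * 2;                      /* hot path */
}
```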
Are you suggesting that static analysis is much inferior to profiling 
for finding the "next executing function"? I'm stuck deciding which one 
to prefer and implement during this summer project. Could you please help me?
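For concreteness, the path-based heuristic from my z() example could be sketched like this. It's a toy: the CFG encoding and all names are invented, it assumes an acyclic CFG, and it enumerates paths explicitly (exponential in general); a real implementation would count paths with dynamic programming over the DAG instead.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// A tiny CFG: block name -> successor blocks, and block -> callees in it.
struct CFG {
  std::map<std::string, std::vector<std::string>> Succs;
  std::map<std::string, std::set<std::string>> Calls;
};

// Walk every entry->exit path, counting the total number of paths and
// how many of them contain a call to Callee.
static void walk(const CFG &G, const std::string &BB,
                 const std::string &Callee, bool Seen,
                 int &Total, int &Hits) {
  auto C = G.Calls.find(BB);
  if (C != G.Calls.end() && C->second.count(Callee))
    Seen = true;
  auto S = G.Succs.find(BB);
  if (S == G.Succs.end() || S->second.empty()) { // exit block
    ++Total;
    Hits += Seen;
    return;
  }
  for (const auto &Next : S->second)
    walk(G, Next, Callee, Seen, Total, Hits);
}

// Fraction of entry->exit paths that contain a call to Callee: the
// higher it is, the better a speculation candidate Callee is.
double callProbability(const CFG &G, const std::string &Callee) {
  int Total = 0, Hits = 0;
  walk(G, "entry", Callee, false, Total, Hits);
  return Total ? static_cast<double>(Hits) / Total : 0.0;
}
```

For z() above (two paths: one through f2/f3, one through fn), this gives f2, f3, and fn a probability of 0.5 each, so all three are equally plausible speculation targets without profile data.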

Also, if I choose to do PGO work with ORC, can I reuse most of the PGO 
code available for AOT compilation with the JIT? Are there any downsides to that?


> Best,
> Bekket
>>
>> Challenges:
>>
>>  1. Incorporating speculation in ORC v2.
>>  2. Making speculative decisions fast; hence I decided to use
>>     simple heuristics.
>>
>> If you need more information, or feel I'm missing something, 
>> please leave a reply :)
>>
>> On 29/03/19 12:27 AM, David Greene wrote:
>>> Devirtualization is an example of predicting calls and is much more
>>> easily done on a higher-level representation.  It is simply easier to
>>> reason about certain things given information that is lost during
>>> translation to LLVM IR.  The MLIR project makes similar arguments.
>>>
>>> It would be helpful to know what's being attempted here.  I'm not sure
>>> what the (hardware?) branch predictor has to do with making decisions
>>> based on compile-time information, unless some kind of PGO is being used.
>>> I could imagine something that takes branch probabilities and guesses
>>> the most likely path through a function, thus predicting certain calls
>>> will happen over others.
>>>
>>>                      -David
>>>
>>> Bekket McClane via llvm-dev<llvm-dev at lists.llvm.org>  writes:
>>>
>>>> Hi PreeJackie,
>>>>
>>>> I still have difficulty associating ‘higher-level program analysis’ with finding the candidate functions that will be executed next.
>>>> The call graph will definitely be one of your tools (and it’s actually not usually considered ‘high level’), and function attributes might help. But AFAIC, there are few ‘high level’
>>>> language constructs that can help us determine the possible functions executed next.
>>>> Maybe you can give us some examples?
>>>>
>>>> Best,
>>>> Bekket
>>>>
>>>>   On Mar 27, 2019, at 8:55 PM, preejackie via llvm-dev<llvm-dev at lists.llvm.org>  wrote:
>>>>
>>>>   Hi all,
>>>>
>>>>   I'm looking for program analysis techniques which help me find potential functions to execute next, from the currently executing function. I want to make
>>>>   decisions based on compile-time information. I consider LLVM IR too low-level for such analysis, so I'm using the call graph representation of the module. I
>>>>   figured the probability that a function executes next could be estimated from branch prediction and the call instruction's distance from the entry of the function. I believe
>>>>   many such attributes can be derived from a higher-level program representation. Has any similar work been done? Does LLVM already support analysis for this?
>> -- 
>> Have a great day!
>> PreeJackie
>
-- 
Have a great day!
PreeJackie
