[LLVMdev] Contribute a new precise pointer analysis to LLVM

Fri Oct 18 07:27:10 PDT 2013

Hi Daniel,

I want to clarify that our analysis is not based on CFL-reachability.
We apply CFL-reachability to matching context information where the
exist from a function to a call-site must match
the entry from the corresponding call-site. The problem is a simple
balanced parentheses problem in CFL-reachability, and it can be
computed
efficiently.

The paper you mentioned is a very nice paper that proposed an
efficient algorithm to solve Dyck-CFL-reachability problem in
bidirectional graphs. However, precise pointer analysis cannot be
formulated as a Dyck-CFL-reachability problem. The alias analysis
presented in that paper is an over approximation. Hence the time bound
from our paper still applies.

Regards,
Lian

On Fri, Oct 18, 2013 at 4:39 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
> I notice you guys formulate your CFL reachability problem as a
> balanced parentheses problem.
> What algorithm do you use to solve it?
>
> Are you aware of recent work that comes up with linear time and n log
> n time algorithms to solve this class of problems:
> http://www.cse.cuhk.edu.hk/lyu/_media/paper/pldi2013.pdf
>
> In particular, the time bound from the paper:
>
> "However, if we need the precise pointer information for all
> variables, CFL-based
> pointer analysis is generally more expensive as it has O(L^3 N^3)
> complexity [29], where L is the size of the grammar and N is the
> size of the graph."
>
> should now been reduced to  O(n + m log m) time, assuming you are
> using a standard balanced parentheses formulation.
>
>
>
>
> On Thu, Oct 17, 2013 at 9:51 PM, lian li <lianli at gmail.com> wrote:
>> Hi Hal,
>>
>> Thanks for your interest.
>>
>> We tested with the following existing compiler optimizations in LLVM
>> with SPECINT2006 benchmarks:
>> -dse (dead store elimination),
>> -gvn (global value numbering),
>> -licm (loop invariant code motion),
>> -bb-vectorize (basic block vectorization),
>> -memcpyopt (memcpy optimization),
>> -sink (code sinking),
>> -loop-idom (recognize loop idioms),
>> -argpromotion (argument promotion).
>>
>> As far as I know, this list includes all optmizations in LLVM that
>> requires alias information. With our pointer analysis, for the three
>> optimizations "-dse, -gvn, -licm", we have observed significant
>> improvements in terms of the number of optimizations being conducted.
>> Take licm as an example, with our analysis, the number of instructions
>> being hoisted out of loops more than doubled for some benchmarks. We
>> have not seen big changes in terms of runtime performance though.
>>
>> We measure compile time by first linking all modules of the same
>> benchmark together, then running the analysis against the linked
>> module. It finishes in less than 1 minute over applications such as
>> httpd and sendmail, both with more than 100,000 LOC. For some
>> benchmarks, it is time-consuming: the analysis time for GCC is close
>> to 45 minutes.  But if you run the analysis over individual files (as
>> in your normal compilation process), there is no observable overhead
>> in terms of compile time.
>>
>> Regards,
>> Lian
>>
>>
>>
>> On Fri, Oct 18, 2013 at 12:15 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>> Hi Lian,
>>>
>>> I am certainly interested in seeing this; do you have performance numbers (compile time)? Also, can you share more information about the promising optimization results you mentioned?
>>>
>>> Thanks,
>>> Hal
>>>
>>> ----- Original Message -----
>>>> Hi All,
>>>>
>>>> This is Lian Li  from Oracle Labs in Brisbane Australia.
>>>>
>>>> We have developed a precise and highly efficient pointer analysis
>>>> framework on top of LLVM,  The approach is flow, context, and field
>>>> sensitive, details are described in the two papers below:
>>>>
>>>> "Boosting the performance of flow-sensitive points-to analysis using
>>>> value flow" (in ESEC-FSE 2011), and
>>>> "Precise and scalable context-sensitive pointer analysis via value
>>>> flow graph" (in ISMM 2013).
>>>>
>>>> The analysis was initially developed for the purpose of bug checking,
>>>> and is now extended to support compiler optimizations. We have tested
>>>> it with existing compiler optimizations in LLVM, and have seen
>>>> promising results.
>>>>
>>>> We are now considering to make this analysis available to the LLVM
>>>> community, and contribute resources for future maintenance needs,
>>>> provided that there is enough interest. We think that a new precise
>>>> pointer analysis in LLVM can enable more new optimization and
>>>> analysis
>>>> techniques to be developed in LLVM.
>>>>
>>>> Any people interested in seeing a new precise pointer analysis in
>>>> LLVM?
>>>>
>>>> Regards,
>>>> Lian Li
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>
>>> --
>>> Hal Finkel
>>> Assistant Computational Scientist
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>
>>
>>
>> --
>> // Copyright @ 2011 Authorized by ** LIAN LI **
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-- 
// Copyright @ 2011 Authorized by ** LIAN LI **