[llvm-dev] Existing studies on the benefits of pointer analysis

Mon Mar 21 18:28:34 PDT 2016


On 03/21/2016 08:56 AM, Jia Chen via llvm-dev wrote:
> Hi Christian,
>
> Thank you so much for the reply! Please see my comments inline.
>
> On 03/21/2016 09:32 AM, Christian Convey wrote:
>> Hi Jia,
>>
>>     If one looks at existing research literatures, there are even
>>     more algorithm to consider for doing pointer analysis.
>>
>>
>> For at least some published AA algorithms, there may be some 
>> uncertainty about software patents and/or copyright.
>>
>> At one point I was interested in the status of this AA implementation 
>> <https://dl.acm.org/citation.cfm?id=2466483> by Lian Li et al.  IIRC, 
>> when I contacted Lian to ask if there was any chance of getting it 
>> into LLVM, IIRC she said that her employer wouldn't promise to 
>> relinquish all possible claims it had on that algorithm's IP.  So 
>> unfortunately, at least in the U.S., an algorithm being published in 
>> an academic journal doesn't remove all legal risk associated with 
>> using it.
> This is news to me. Thanks for sharing it.
>>
>> Also, speaking from my own experience, even when an AA algorithm 
>> seems to be described in great detail in some piece of literature 
>> (e.g., a phd thesis), there can still be a lot of details which are 
>> glossed over, or which seem clear when reading the document but which 
>> get a lot more confusing when one tries to actually implement it.
>>
>> That can make implementing such an algorithm take far longer than one 
>> would expect based on just reading the document.  (It's also an 
>> argument in favor of requiring academic papers which describe the 
>> behavior of a software implementation to actually include a working 
>> copy of the source code, IMHO.)
> My personal experience also coincides. And even if the paper does come 
> with an artifact or source codes, they are usually proof-of-concept 
> implementations with lots of important real-world corner cases ignored.
>>
>>     So my question here is: what kind(s) of precision really justify
>>     the cost and what kinds do not? Has anybody done any study in the
>>     past to evaluate what kinds of features in pointer analyses will
>>     benefit what kinds of optimization passes?
>>
>>
>> At one point I discussed this with Daniel Berlin, but I'm having 
>> trouble find a record of the conversation.  IIRC, he says that he 
>> once threw a huge amount of computing power at doing a /full/ 
>> context-sensitive AA on some software, and the speedup he observed in 
>> the resulting program as underwhelming (10-15%?).
> I kind of expect that. As you mentioned later, most optimization 
> passes work in a context-insensitive manner (i.e. they won't clone a 
> function and optimize differently on different clones). Context 
> sensitivity on the pointer analysis side is probably not going to help 
> a lot if the client cannot fully capitalize on it. In the settings of 
> compiler optimization, my guess is that flow sensitivity, field 
> sensitivity, heap model and external library annotations are the four 
> aspects that are likely to matter.
>
> I did some preliminary experiments with licm on c programs over the 
> last weekend. I chose licm because intuitively that's the optimization 
> that could have the biggest performance impact. The result suggested 
> that tbaa, cfl-aa, scev-aa and globals-aa yields very little 
> additional benefits over basic-aa. Again, both the methodology and 
> benchmark selection are very immature and the results need to be 
> double-checked, but my hope is that by looking at how aa algorithms 
> and their clients interact I may be able to get some hints on what 
> kind of aa a compiler really wants.
Just to chime in here, your results match my experience and expectations 
with LICM as well.  Between basic-aa, and TBAA (specifically LLVM's 
implementation thereof), I haven't seen a lot of cases where an 
imprecision in the alias analysis prevents hoisting.

*However*, if you're interested in LICM specifically, I have 
*definitely* seen cases where the precision of AliasSetTracker (our 
grouping of AA results to prevent O(n^2) queries) prevents hoisting in 
spurious cases.  AST could use some serious attention, both from an 
engineering standpoint and from (possibly) a theoretically one.
>>
>> I can't remember if that was with GCC or LLVM.  That result is a data 
>> point, although it may not say much about how much additional speedup 
>> could be realized if the algorithms which use the AA results were 
>> themselves adapted to capitalize on fully context-sensitive, 
>> flow-sensitive, hula-dancer-on-the-dashboard AA results.
>>
>> Cheers,
>> Christian
>>
>
> -- 
> Best Regards,
>
> --
> Jia Chen
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160321/0813de43/attachment.html>