<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Mar 15, 2016 at 1:37 PM, Jia Chen via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  
  <div bgcolor="#FFFFFF" text="#000000">

    Dear llvm devs,<br>

    <br>

    tl;dr: What prevents llvm from switching to a fancier pointer

    analysis?<br></div></blockquote><div><br></div><div>Nothing.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    <br>

    Currently, there exists a variety of general-purpose alias analyses

    in the LLVM codebase: basic-aa, globalsmodref-aa, tbaa, scev-aa, and

    cfl-aa. However, only the first three are actually turned on when

    invoking clang with -O2 or -O3 (please correct me if I'm wrong about

    this).<br></div></blockquote><div><br></div><div>This is correct.</div><div>Eventually, I hope George will have time to get back to CFL-AA and turn it on by default.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    <br>

    If one looks at the existing research literature, there are even more

    algorithms to consider for doing pointer analysis. Some are

    field-sensitive, some are field-based, some are flow-sensitive, some

    are context-sensitive. Even the flow-insensitive ones can be either

    inclusion-style (-andersen-aa) or equality-style

    (-steens-aa and -ds-aa). Those algorithms are often backed by

    a rich theoretical framework as well as preliminary evaluations that

    demonstrate their superior precision and/or performance.<br></div></blockquote><div><br></div><div>CFL-AA is a middle ground between Steensgaard's and Andersen's analyses, and it can easily be made field- and context-sensitive, etc.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    <br>

    Given such an abundance of pointer analyses that seem to be

    much better in research land, why do real-world compiler

    infrastructures like llvm still rely on those three simple (and

    ad-hoc) ones to perform IR optimization? </div></blockquote><div><br>Time and energy.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">Based on my understanding

    (and again please correct me if I am wrong):<br>

    <br>

    (1) The minor reason: those "better" algorithms are very hard to

    implement in a robust way and nobody seems to be interested in

    trying to write and maintain them.<br></div></blockquote><div><br></div><div>This is false. Heck, at the time I implemented it in GCC, field-sensitive Andersen's analysis was unknown in production compilers. That's why I'm thanked in all the papers: I did the engineering work to make it fast and reliable.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    (2) The major reason: it's not clear whether those "better"

    algorithms are actually better for llvm. More precise pointer

    analyses tend to slow down compile time a lot while contributing too

    little to the optimization passes that use them. The benefit one

    gets from a more precise analysis may not justify the compile-time

    or the maintenance cost.<br></div></blockquote><div><br></div><div><br></div><div>CFL-AA is probably the right trade-off here. You can stop at any time and still have correct answers, and you can be as lazy as you like, etc.</div><div><br></div><div>The reality is, I think, that you overlook the realistic answer:<br><br></div><div>3. Nobody has had the time or energy to fix up CFL-AA or SCEV-AA. People spend their time on lower-hanging fruit until that lower-hanging fruit is gone.</div><div><br></div><div>I.e., for the moment, CFL-AA and SCEV-AA and ... are not the things holding llvm back.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    <br>

    So my question here is: what kind(s) of precision really justify the

    cost and what kinds do not?</div></blockquote><div><br></div><div>Depends entirely on your applications.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"> Has anybody done any study in the past

    to evaluate what kinds of features in pointer analyses will benefit

    what kinds of optimization passes?</div></blockquote><div>Yes.</div><div>Chris did one many years ago, and I've done one more recently.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"> Could there potentially be more

    improvement on pointer analysis precision without adding too much

    compile-time/maintenance cost? </div></blockquote><div>Yes.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">Have the precision/performance

    tradeoffs been fully explored before? <br></div></blockquote><div>Yes.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    <br>

    
    Any pointers will be much appreciated. No pun intended :)<br>

    <br>

    PS1: To be more concrete, what I am looking for is not some

    black-box information like "we switched from basic-aa to cfl-aa and

    observed 1% improvement at runtime". I believe white-box studies

    such as "the licm pass failed to hoist x instructions because -tbaa

    is not flow sensitive" are much more interesting for understanding

    the problem here.<br></div></blockquote><div><br></div><div>White-box studies are very application-specific, and often very pass-specific.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    <br>

    PS2: If no such evaluation has been done in the past, I'd be happy to do that

    myself and report back my findings if anyone here is interested.</div></blockquote><div>I don't think anything is currently set up to make that valuable.</div><div><br></div><div>Nothing currently takes advantage of context sensitivity, flow sensitivity, etc.</div><div> </div></div></div></div>