<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Mar 15, 2016 at 1:37 PM, Jia Chen via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
Dear llvm devs,<br>
<br>
tl;dr: What prevents llvm from switching to a fancier pointer
analysis?<br></div></blockquote><div><br></div><div>Nothing.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">
<br>
Currently, there exists a variety of general-purpose alias analyses
in the LLVM codebase: basic-aa, globalsmodref-aa, tbaa, scev-aa, and
cfl-aa. However, only the first three are actually turned on when
invoking clang with -O2 or -O3 (please correct me if I'm wrong about
this).<br></div></blockquote><div><br></div><div>This is correct.</div><div>Eventually, I hope George will have time to get back to CFL-AA and turn it on by default.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">
<br>
If one looks at the existing research literature, there are even more
algorithms to consider for doing pointer analysis. Some are
field-sensitive, some are field-based, some are flow-sensitive, some
are context-sensitive. Even for flow-insensitive ones, they could
also be inclusion-style (-andersen-aa) and equality-style
(-steens-aa and -ds-aa). Those algorithms are often backed up by
rich theoretical frameworks as well as preliminary evaluations which
demonstrate their superior precision and/or performance.<br></div></blockquote><div><br></div><div>CFL-AA is a middle ground between Steensgaard-style and Andersen-style analyses, and it can easily be made field-sensitive and context-sensitive, etc.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">
<br>
Given such an abundance of pointer analyses that seem to be
much better in research land, why do real-world compiler
infrastructures like LLVM still rely on those three simple (and
ad-hoc) ones to perform IR optimization? </div></blockquote><div><br>Time and energy.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">Based on my understanding
(and again please correct me if I am wrong):<br>
<br>
(1) The minor reason: those "better" algorithms are very hard to
implement in a robust way and nobody seems to be interested in
trying to write and maintain them.<br></div></blockquote><div><br></div><div>This is false. Heck, at the time I implemented it in GCC, field-sensitive Andersen's analysis was unknown in production compilers. That's why I'm thanked in all the papers: I did the engineering work to make it fast and reliable.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">
(2) The major reason: it's not clear whether those "better"
algorithms are actually better for llvm. More precise pointer
analyses tend to slow down compile time a lot while contributing too
little to the optimization passes that use them. The benefit one
gets from a more precise analysis may not justify the compile-time
or the maintenance cost.<br></div></blockquote><div><br></div><div><br></div><div>CFL-AA is probably the right trade-off here. You can stop at any time and still have correct answers, and you can be as lazy as you like, etc.</div><div><br></div><div>I think you overlook the realistic answer:<br><br></div><div>3. Nobody has had the time or energy to fix up CFL-AA or SCEV-AA. People spend their time on lower-hanging fruit until that lower-hanging fruit is gone.</div><div><br></div><div>I.e., for the moment, CFL-AA, SCEV-AA, and ... are not what is holding LLVM back.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">
<br>
So my question here is: what kind(s) of precision really justify the
cost and what kinds do not?</div></blockquote><div><br></div><div>Depends entirely on your applications.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"> Has anybody done any study in the past
to evaluate what kinds of features in pointer analyses will benefit
what kinds of optimization passes?</div></blockquote><div>Yes.</div><div>Chris did many years ago, and I've done one more recently.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"> Could there potentially be more
improvement on pointer analysis precision without adding too much
compile-time/maintenance cost? </div></blockquote><div>Yes.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">Have the precision/performance
tradeoffs been fully explored before? <br></div></blockquote><div>Yes.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">
<br>
Any pointers will be much appreciated. No pun intended :)<br>
<br>
PS1: To be more concrete, what I am looking for is not some
black-box information like "we switched from basic-aa to cfl-aa and
observed 1% improvement at runtime". I believe white-box studies
such as "the licm pass failed to hoist x instructions because -tbaa
is not flow sensitive" are much more interesting for understanding
the problem here.<br></div></blockquote><div><br></div><div>White-box studies are very application specific, and often very pass specific.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">
<br>
PS2: If no such evaluation has been done in the past, I'd be happy to do that
myself and report back my findings if anyone here is interested.</div></blockquote><div>I don't think anything is currently set up to make that valuable.</div><div><br></div><div>No pass currently takes advantage of context sensitivity, flow sensitivity, etc.</div><div> </div></div></div></div>
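<div>For readers less familiar with the equality-style versus inclusion-style distinction discussed in this thread, here is a minimal sketch of the two ends of the spectrum that CFL-AA is said to sit between. It is only an illustration under strong assumptions: it handles just address-of (p = &amp;a) and copy (p = q) statements, with no loads, stores, or fields, and the names andersen, steensgaard, and UnionFind are hypothetical. It is not LLVM's or GCC's implementation, and it does not implement CFL-AA itself.</div>

```python
# Toy points-to analyses over two statement kinds (an assumption of this sketch):
#   addr_of: (ptr, obj) meaning  ptr = &obj
#   copies:  (dst, src) meaning  dst = src


def andersen(addr_of, copies):
    """Inclusion-style: dst = src only requires pts(src) to be a subset of pts(dst)."""
    pts = {}
    for ptr, obj in addr_of:
        pts.setdefault(ptr, set()).add(obj)
    changed = True
    while changed:  # iterate the subset constraints to a fixed point
        changed = False
        for dst, src in copies:
            src_set = pts.get(src, set())
            dst_set = pts.setdefault(dst, set())
            if not src_set.issubset(dst_set):
                dst_set |= src_set
                changed = True
    return pts


class UnionFind:
    """Plain union-find over hashable nodes."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def steensgaard(addr_of, copies):
    """Equality-style: dst = src merges what dst and src may point to."""
    uf = UnionFind()
    objects = {obj for _, obj in addr_of}

    def target(ptr):
        # Placeholder node standing for "whatever ptr points to".
        return ("target", ptr)

    for ptr, obj in addr_of:
        uf.union(target(ptr), obj)
    for dst, src in copies:
        uf.union(target(dst), target(src))

    pointers = {p for p, _ in addr_of} | {v for pair in copies for v in pair}
    return {
        p: {o for o in objects if uf.find(o) == uf.find(target(p))}
        for p in pointers
    }


if __name__ == "__main__":
    # p = &a; q = &b; p = q
    addr_of = [("p", "a"), ("q", "b")]
    copies = [("p", "q")]
    print("andersen:   ", andersen(addr_of, copies))
    print("steensgaard:", steensgaard(addr_of, copies))
```

<div>On the three-statement program in the demo, the inclusion-style version keeps q pointing only to b, while the unification version merges the two targets and reports that q may also point to a. That loss of precision is what the equality-style approach pays for its near-linear running time.</div>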