[llvm-dev] Existing studies on the benefits of pointer analysis
Jia Chen via llvm-dev
llvm-dev at lists.llvm.org
Tue Mar 15 13:37:20 PDT 2016
Dear llvm devs,
tl;dr: What prevents LLVM from switching to a fancier pointer analysis?
Currently, a variety of general-purpose alias analyses exist in the
LLVM codebase: basic-aa, globalsmodref-aa, tbaa, scev-aa, and
cfl-aa. However, only the first three are actually turned on when
invoking clang with -O2 or -O3 (please correct me if I'm wrong about this).
If one looks at the existing research literature, there are even more
algorithms to consider for pointer analysis. Some are
field-sensitive, some are field-based, some are flow-sensitive, and some
are context-sensitive. Even the flow-insensitive ones can be either
inclusion-style (-andersen-aa) or equality-style (-steens-aa and
-ds-aa). These algorithms are often backed by rich theoretical
frameworks as well as preliminary evaluations that demonstrate their
superior precision and/or performance.
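For readers less familiar with the inclusion/equality distinction, the
precision gap can be seen on a toy constraint set. The sketch below (plain
Python; the names and constraint encoding are mine, not LLVM's, and the
equality-style solver is simplified: a real Steensgaard analysis also unifies
pointees recursively) solves the same constraints both ways:

```python
# Toy flow-insensitive points-to analysis over two constraint kinds:
#   ("addr", p, a)  means  p = &a
#   ("copy", q, p)  means  q = p

def andersen(constraints):
    """Inclusion-based: q = p only requires pts(q) to include pts(p)."""
    pts = {}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for kind, lhs, rhs in constraints:
            cur = pts.setdefault(lhs, set())
            new = {rhs} if kind == "addr" else pts.get(rhs, set())
            if not new <= cur:
                cur |= new
                changed = True
    return pts

def steensgaard(constraints):
    """Equality-based: q = p unifies the points-to sets of p and q."""
    # Union-find over variables; each root owns one shared points-to set.
    parent, pts = {}, {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x
    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[rx] = ry
            pts[ry] = pts.get(ry, set()) | pts.pop(rx, set())
    for kind, lhs, rhs in constraints:
        if kind == "addr":
            pts.setdefault(find(lhs), set()).add(rhs)
        else:  # copy: merge both sides into one equivalence class
            union(lhs, rhs)
    return {v: pts.get(find(v), set()) for v in parent}

# p = &a; q = &b; q = p
constraints = [("addr", "p", "a"), ("addr", "q", "b"), ("copy", "q", "p")]
print(andersen(constraints))     # p keeps only a; q may point to a or b
print(steensgaard(constraints))  # p and q collapse to one merged set
```

On this input the inclusion-style solver keeps pts(p) = {a}, while the
equality-style solver merges p and q and reports both as possibly pointing to
a and b, i.e. it gives up precision for near-linear solving time.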
Given such an abundance of choices of pointer analyses that seem to be much
better in the research land, why do real-world compiler
infrastructures like LLVM still rely on those three simple (and ad-hoc)
ones to perform IR optimization? Based on my understanding (and again,
please correct me if I am wrong):
(1) The minor reason: those "better" algorithms are very hard to
implement robustly, and nobody seems interested in writing and
maintaining them.
(2) The major reason: it's not clear whether those "better" algorithms
are actually better for llvm. More precise pointer analyses tend to slow
down compile time a lot while contributing too little to the
optimization passes that use them. The benefit one gets from a more
precise analysis may not justify the compile-time or the maintenance cost.
So my question here is: what kinds of precision really justify the
cost, and what kinds do not? Has anybody done a study in the past to
evaluate which features of pointer analyses benefit which
optimization passes? Could pointer analysis precision be improved
further without adding too much compile-time/maintenance cost? Have
the precision/performance tradeoffs been fully explored before?
Any pointers will be much appreciated. No pun intended :)
PS1: To be more concrete, what I am looking for is not some black-box
information like "we switched from basic-aa to cfl-aa and observed 1%
improvement at runtime". I believe white-box studies such as "the licm
pass failed to hoist x instructions because -tbaa is not flow sensitive"
are much more interesting for understanding the problem here.
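To make the flow-sensitivity angle concrete, here is a minimal sketch (plain
Python; the statement encoding is illustrative and hypothetical, not LLVM IR)
of why a flow-insensitive analysis can block a transformation at a particular
program point even though the pointer's value there is unambiguous:

```python
# Toy comparison of flow-insensitive vs flow-sensitive points-to on a
# straight-line program:  p = &a;  use p;  p = &b;

program = [
    ("assign-addr", "p", "a"),   # p = &a
    ("use", "p"),                # what may p point to at this use?
    ("assign-addr", "p", "b"),   # p = &b
]

def flow_insensitive(program):
    """One points-to set per variable, merged over the whole program."""
    pts = {}
    for stmt in program:
        if stmt[0] == "assign-addr":
            pts.setdefault(stmt[1], set()).add(stmt[2])
    return pts

def flow_sensitive(program):
    """Track a points-to map per program point; assignments kill old facts."""
    pts, at_use = {}, {}
    for i, stmt in enumerate(program):
        if stmt[0] == "assign-addr":
            pts[stmt[1]] = {stmt[2]}          # strong update
        elif stmt[0] == "use":
            at_use[i] = set(pts.get(stmt[1], set()))
    return at_use

print(flow_insensitive(program)["p"])  # the later p = &b pollutes the use
print(flow_sensitive(program)[1])      # precise at the use site: only a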
PS2: If no such evaluation exists in the past, I'd happy to do that
myself and report back my findings if anyone here is interested.
Department of Computer Science
University of Texas at Austin
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the llvm-dev