[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

Mon Jul 13 19:59:10 PDT 2015

# Front matter #
First, I want to emphasize that this is not a short-term plan. This is a
long-term plan. When I was looking at refactoring the existing Alias
Analysis infrastructure in LLVM in ways that would break stateful AA, Hal
Finkel asked me to figure out how things would work with the new pass
manager and make sure that was sane.

This plan assumes the new pass manager. There is nothing deeply or
intrinsically tied to it, but the pieces will fit together naturally there,
and I don't see how to make them fit naturally in the existing pass
manager. I've not invested a lot of time in figuring that out as it seems a
sunk cost.

# Core ideas of what stateful AA should look like: #
1) An alias analysis *is* an analysis pass, and if it needs to build up
state, it must do so for a "unit" of IR at a time (using the new PM's
terminology).
2) Per-IR-unit invalidation from the pass manager is the primary mechanism
to clear out stale (and thus invalid) state from any alias analysis. That
means that most passes will not preserve AA by default (unlike today).
3) To be reasonably scalable, most alias analyses which retain state will
need to handle these invalidation events in a way that gracefully degrades
their results rather than wiping out all state.

I think everything in-tree but GlobalsModRef fits these requirements
already. Notably, this will work well for CFL, Andersen, or generally any
"intersection" based AA. A "union" based AA could work here, but would have
to do considerable work to be able to tolerate updates that "invalidate"
its state without simply forced recomputation. I think this is a reasonable
tradeoff as it preserves our ability to experiment, while simplifying the
update model.

The rest of this email contains much more detailed musing on this. Feel
free to stop reading when necessary.

The last section touches on the core functionality that would be needed to
support something like GlobalsModRef in its current form. I'm much less
confident in this part, so it's not part of my core requirements above.

# Details about analysis passes and pass manager invalidation #
My thought is that an alias analysis pass in the new model would be an
analysis pass. It would produce an analysis result that contained any state
necessary to vend AA queries. This result would be cached just like all the
other analysis results are in the pass manager. It would have the same
invalidation rules too.

A really important consequence of this is that analysis results in the new
scheme get notified of invalidation events and can attempt to update their
internal state to *avoid* becoming invalid. AAs can leverage this to do
fast incremental updates to their internal structures if appropriate.

An AA could even use ValueHandles now to automatically stay up-to-date with
a conservatively correct data structure, and just drop all the invalidation
requests. In fact, I suspect many stateful AAs will take exactly this
approach.

# How will users of AA information query it? #
This will work much like the normal analysis queries. I plan to provide an
alias analysis manager at each IR unit that will provide a single query
interface for code to use. This layer can then be configured to query any
AnalysisManager<T> for an IR unit T for the relevant actual AliasAnalysis
implementations.

The AA manager layer will be responsible for the delegation that is
currently done through chaining of AA implementations. It will also be
responsible for enforcing that queries at a particular layer don't violate
IR unit abstraction boundaries. For example, you cannot query a
FuntionAAManager for alias information of two Values from two different
functions.

# Why GlobalsModRef doesn't fit, and what we could do about it #
There is a fundamental construct in GlobalsModRef that exists today, and
seems somewhat useful, but which doesn't fit the above: using a global
analysis of which memory locations are *not* escaped by any function, in
order to conclude that within a given function, arguments, returns, and
loads from globals cannot produce an aliasing memory location. Thus, if you
always directly index or manipulate a global within a function, and it
never escapes that function (including through integer hacks), you know
that some other function can't produce an aliasing location by loading a
pointer from a global or receiving a pointer in an argument.

In *practice* this is actually totally reasonable. Passes don't run around
escaping pointers just because they feel like it. However, we don't have a
good way to really formally reason about such things.

My suggested way to reason about it is to define a new constraint on
transformation passes over an IR unit T (ie, Function, Module, etc.). This
constraint is that a pass may not *observably* escape a pointer value P
from the unit being transformed. Now, "observably" is the key word here.
The goal is to still allow fundamental transformations like Reg2Mem and
instrumentation passes. However, the guarantee must be that the passes do
not introduce new ways to observe an escape of a pointer. Because any
actual escapes cannot be observed, it is safe for them to happen and for
some other function pass (for example) to transform based on the
non-escaping property computed prior to the transformation. I *think* this
is a reasonable model, but I've not really had enough time thinking about
it to be 100% certain.

With such a model, we could regain most of the power of GlobalsModRef
w.r.t. non-escaping pointers and interprocedural aliasing inference. We
would change it to be a two-pronged analysis. On one hand, a local analysis
would have to prove that one location came from a module-level memory
location (global) and the other location came from some dynamic input to
the function (argument, return, or a load of memory that is def'ed outside
the function), and then the other hand would provide the interprocedural
information about what global memory locations were non-escaping.

Ok, those are all the thoughts I have currently about AA, and where I'm
headed. This should hopefully provide context for some of the upcoming
patches.

I'm also going to write a separate email about specific tactical problems
I've found with GlobalsModRef today (nothing to do with the above
theoretical problems) and how I plan to approach them.

Thanks,
-Chandler
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150714/fe52a9da/attachment.html>