[llvm] r278875 - [Docs] Add initial MemorySSA documentation.

Michael Kuperstein via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 16 23:52:18 PDT 2016


Thanks a lot for writing this up, George!

I haven't been following the MemorySSA discussions, and I've been waiting
for somebody to write up a doc for the implementation so I can come up to
speed.
So, some comments from the perspective of a total MemorySSA noob (that is,
the intended audience of this document :-) follow.

On 16 August 2016 at 17:17, George Burgess IV via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

> Author: gbiv
> Date: Tue Aug 16 19:17:29 2016
> New Revision: 278875
>
> URL: http://llvm.org/viewvc/llvm-project?rev=278875&view=rev
> Log:
> [Docs] Add initial MemorySSA documentation.
>
> Patch partially by Danny.
>
> Differential Revision: https://reviews.llvm.org/D23535
>
> Added:
>     llvm/trunk/docs/MemorySSA.rst
> Modified:
>     llvm/trunk/docs/AliasAnalysis.rst
>     llvm/trunk/docs/index.rst
>
> Modified: llvm/trunk/docs/AliasAnalysis.rst
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/AliasAnalysis.rst?rev=278875&r1=278874&r2=278875&view=diff
> ==============================================================================
> --- llvm/trunk/docs/AliasAnalysis.rst (original)
> +++ llvm/trunk/docs/AliasAnalysis.rst Tue Aug 16 19:17:29 2016
> @@ -702,6 +702,12 @@ algorithm will have a lower number of ma
>  Memory Dependence Analysis
>  ==========================
>
> +.. note::
> +
> +  We are currently in the process of migrating things from
> +  ``MemoryDependenceAnalysis`` to :doc:`MemorySSA`. Please try to use
> +  that instead.
> +
>  If you're just looking to be a client of alias analysis information,
> consider
>  using the Memory Dependence Analysis interface instead.  MemDep is a lazy,
>  caching layer on top of alias analysis that is able to answer the
> question of
>
> Added: llvm/trunk/docs/MemorySSA.rst
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/MemorySSA.rst?rev=278875&view=auto
> ==============================================================================
> --- llvm/trunk/docs/MemorySSA.rst (added)
> +++ llvm/trunk/docs/MemorySSA.rst Tue Aug 16 19:17:29 2016
> @@ -0,0 +1,358 @@
> +=========
> +MemorySSA
> +=========
> +
> +.. contents::
> +   :local:
> +
> +Introduction
> +============
> +
> +``MemorySSA`` is an analysis that allows us to cheaply reason about the
> +interactions between various memory operations. Its goal is to replace
> +``MemoryDependenceAnalysis`` for most (if not all) use-cases. This is
> because,
> +unless you're very careful, use of ``MemoryDependenceAnalysis`` can easily
> +result in quadratic-time algorithms in LLVM. Additionally, ``MemorySSA``
> doesn't
> +have as many arbitrary limits as ``MemoryDependenceAnalysis``, so you
> should get
> +better results, too.
>

Replacing AliasSetTracker is also a goal, right?


> +
> +At a high level, one of the goals of ``MemorySSA`` is to provide an SSA
> based
> +form for memory, complete with def-use and use-def chains, which
> +enables users to quickly find may-def and may-uses of memory operations.
> +It can also be thought of as a way to cheaply give versions to the
> complete
> +state of heap memory, and associate memory operations with those versions.
> +
> +This document goes over how ``MemorySSA`` is structured, and some basic
> +intuition on how ``MemorySSA`` works.
> +
> +A paper on MemorySSA (with notes about how it's implemented in GCC) `can be
> +found here <http://www.airs.com/dnovillo/Papers/mem-ssa.pdf>`_. It's
> +relatively out of date, though; the paper references multiple heap partitions,
> +but GCC eventually switched to using just one, like we now have in LLVM.  Like
> +GCC's, LLVM's MemorySSA is intraprocedural.
> +
> +
> +MemorySSA Structure
> +===================
> +
> +MemorySSA is a virtual IR. After it's built, ``MemorySSA`` will contain a
> +structure that maps ``Instruction`` s to ``MemoryAccess`` es, which are
> +``MemorySSA``'s parallel to LLVM ``Instruction`` s.
> +
> +Each ``MemoryAccess`` can be one of three types:
> +
> +- ``MemoryPhi``
> +- ``MemoryUse``
> +- ``MemoryDef``
> +
> +``MemoryPhi`` s are ``PhiNode`` s, but for memory operations. If at any
>

Unfortunately, none of the ``Foo`` s render correctly.


> +point we have two (or more) ``MemoryDef`` s that could flow into a
> +``BasicBlock``, the block's top ``MemoryAccess`` will be a
> +``MemoryPhi``. As in LLVM IR, ``MemoryPhi`` s don't correspond to any
> +concrete operation. As such, you can't look up a ``MemoryPhi`` with an
> +``Instruction`` (though we do allow you to do so with a
> +``BasicBlock``).


It's not entirely clear to me what the last sentence means - I didn't
understand what you can and can't look up, and how.
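
(For other readers trying to follow along: my reading of MemorySSA.h is that
the lookup split is roughly the sketch below. Treat the getMemoryAccess()
accessor and the exact signatures as assumptions on my part.)

  #include "llvm/IR/Instructions.h"
  #include "llvm/Support/raw_ostream.h"
  #include "llvm/Transforms/Utils/MemorySSA.h" // header location as of this revision

  using namespace llvm;

  // Rough sketch of the lookup rules described above.
  void describeAccesses(MemorySSA &MSSA, Instruction &I, BasicBlock &BB) {
    // Memory-touching instructions map to a MemoryUse or a MemoryDef,
    // never to a MemoryPhi; other instructions map to nothing at all.
    if (MemoryAccess *MA = MSSA.getMemoryAccess(&I)) {
      if (isa<MemoryUse>(MA))
        errs() << I << " reads memory but doesn't modify it\n";
      else if (isa<MemoryDef>(MA))
        errs() << I << " may modify (or otherwise clobber) memory\n";
    }

    // MemoryPhis have no corresponding Instruction, so the lookup key for
    // them is the block they sit at the top of.
    if (dyn_cast_or_null<MemoryPhi>(MSSA.getMemoryAccess(&BB)))
      errs() << BB.getName() << " starts with a MemoryPhi\n";
  }
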


> +
> +Note also that in SSA, Phi nodes merge must-reach definitions (that
> +is, definite new versions of variables).  In MemorySSA, PHI nodes merge
>

definite -> define


> +may-reach definitions (that is, until disambiguated, the versions that
> +reach a phi node may or may not clobber a given variable)
> +
> +``MemoryUse`` s are operations which use but don't modify memory. An
> example of
> +a ``MemoryUse`` is a ``load``, or a ``readonly`` function call.
> +
> +``MemoryDef`` s are operations which may either modify memory, or which
> +otherwise clobber memory in unquantifiable ways. Examples of
> ``MemoryDef`` s
>

unquantifiable? :-)


> +include ``store`` s, function calls, ``load`` s with ``acquire`` (or
> higher)
> +ordering, volatile operations, memory fences, etc.
>
>
It would probably be good to explain why non-relaxed atomics, volatiles and
fences count as defs. It's true that they tend to be optimization barriers,
but saying they "clobber" memory sounds fishy.


> +Every function that exists has a special ``MemoryDef`` called
> ``liveOnEntry``.
> +It dominates every ``MemoryAccess`` in the function that ``MemorySSA`` is
> being
> +run on, and implies that we've hit the top of the function. It's the only
> +``MemoryDef`` that maps to no ``Instruction`` in LLVM IR. Use of
> +``liveOnEntry`` implies that the memory being used is either undefined or
> +defined before the function begins.
> +
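
Side note for fellow newcomers: the liveOnEntry sentinel seems to be something
you can test against directly. A tiny sketch, assuming the isLiveOnEntryDef()
and getDefiningAccess() accessors I see in MemorySSA.h:

  #include "llvm/Transforms/Utils/MemorySSA.h" // header location as of this revision

  using namespace llvm;

  // A use whose defining access is the liveOnEntry sentinel can only observe
  // memory that was undefined or written before the function started.
  bool readsOnlyIncomingMemory(MemorySSA &MSSA, MemoryUse &MU) {
    return MSSA.isLiveOnEntryDef(MU.getDefiningAccess());
  }
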
> +An example of all of this overlaid on LLVM IR (obtained by running ``opt
> +-passes='print<memoryssa>' -disable-output`` on an ``.ll`` file) is
> below. When
> +viewing this example, it may be helpful to view it in terms of clobbers.
> The
> +operands of a given ``MemoryAccess`` are all (potential) clobbers of said
> +MemoryAccess, and the value produced by a ``MemoryAccess`` can act as a
> clobber
> +for other ``MemoryAccess`` es. Another useful way of looking at it is in
> +terms of heap versions.  In that view, operands of a given
> +``MemoryAccess`` are the version of the heap before the operation, and
> +if the access produces a value, the value is the new version of the heap
> +after the operation.
> +
> +.. code-block:: llvm
> +
> +  define void @foo() {
> +  entry:
> +    %p1 = alloca i8
> +    %p2 = alloca i8
> +    %p3 = alloca i8
> +    ; 1 = MemoryDef(liveOnEntry)
> +    store i8 0, i8* %p3
> +    br label %while.cond
> +
> +  while.cond:
> +    ; 6 = MemoryPhi({%0,1},{if.end,4})
> +    br i1 undef, label %if.then, label %if.else
> +
> +  if.then:
> +    ; 2 = MemoryDef(6)
> +    store i8 0, i8* %p1
> +    br label %if.end
> +
> +  if.else:
> +    ; 3 = MemoryDef(6)
> +    store i8 1, i8* %p2
> +    br label %if.end
> +
> +  if.end:
> +    ; 5 = MemoryPhi({if.then,2},{if.else,3})
> +    ; MemoryUse(5)
> +    %1 = load i8, i8* %p1
> +    ; 4 = MemoryDef(5)
> +    store i8 2, i8* %p2
> +    ; MemoryUse(1)
> +    %2 = load i8, i8* %p3
> +    br label %while.cond
> +  }
> +
> +The ``MemorySSA`` IR is located in comments that precede the instructions
> +they map to (if such an instruction exists). For example, ``1 =
> MemoryDef(liveOnEntry)``
> +is a ``MemoryAccess`` (specifically, a ``MemoryDef``), and it describes
> the LLVM
> +instruction ``store i8 0, i8* %p3``. Other places in ``MemorySSA`` refer
> to this
> +particular ``MemoryDef`` as ``1`` (much like how one can refer to ``load
> i8, i8*
> +%p1`` in LLVM with ``%1``). Again, ``MemoryPhi`` s don't correspond to
> any LLVM
> +Instruction, so the line directly below a ``MemoryPhi`` isn't special.
> +
> +Going from the top down:
> +
> +- ``6 = MemoryPhi({%0,1},{if.end,4})`` notes that, when entering
> ``while.cond``,
> +  the reaching definition for it is either ``1`` or ``4``. This
> ``MemoryPhi`` is
> +  referred to in the textual IR by the number ``6``.
> +- ``2 = MemoryDef(6)`` notes that ``store i8 0, i8* %p1`` is a definition,
> +  and that its reaching definition is ``6``, the ``MemoryPhi`` after
> +  ``while.cond``.
>

It's not really clear why this is the case, even though 2 is the only store
to %p1.
Naively, looking at it from the "clobbering" perspective, I'd expect them
to be on separate chains, and to have another phi at the entry to
while.cond - something like
; 7 = MemoryPhi({%0, liveOnEntry},{if.end, 2})
...
; 2 = MemoryDef(7)

One option is that queries that depend on alias analysis are left entirely
to the walker - but then it's not clear why the %2 load is MemoryUse(1)
rather than MemoryUse(6).
I looked the description up in MemorySSA.h and it mentions we intentionally
choose not to disambiguate defs. Assuming this is a consequence of that
choice, documenting it here as well would clear things up.

(Ok, I got to the "Use optimization" part, and it explains this, but I
think the order of presentation makes the whole thing somewhat confusing.
It may be better to either prefetch the "use optimization" discussion, or
first show an unoptimized form, and then show an optimized one later.
Although that may also be rather confusing...)


> +- ``3 = MemoryDef(6)`` notes that ``store i8 1, i8* %p2`` is a definition; its
> +  reaching definition is also ``6``.
> +- ``5 = MemoryPhi({if.then,2},{if.else,3})`` notes that the clobber before
> +  this block could either be ``2`` or ``3``.
> +- ``MemoryUse(5)`` notes that ``load i8, i8* %p1`` is a use of memory,
> and that
> +  it's clobbered by ``5``.
> +- ``4 = MemoryDef(5)`` notes that ``store i8 2, i8* %p2`` is a definition; its
> +  reaching definition is ``5``.
> +- ``MemoryUse(1)`` notes that ``load i8, i8* %p3`` is just a user of
> memory,
> +  and the last thing that could clobber this use is above ``while.cond``
> (e.g.
> +  the store to ``%p3``).  In heap versioning parlance, it really
> +  only depends on the heap version 1, and is unaffected by the new
> +  heap versions generated since then.
> +
> +As an aside, ``MemoryAccess`` is a ``Value`` mostly for convenience; it's
> not
> +meant to interact with LLVM IR.
> +
> +Design of MemorySSA
> +===================
> +
> +``MemorySSA`` is an analysis that can be built for any arbitrary
> function. When
> +it's built, it does a pass over the function's IR in order to build up its
> +mapping of ``MemoryAccess`` es. You can then query ``MemorySSA`` for
> things like
> +the dominance relation between ``MemoryAccess`` es, and get the
> ``MemoryAccess``
> +for any given ``Instruction`` .
> +
> +When ``MemorySSA`` is done building, it also hands you a
> ``MemorySSAWalker``
> +that you can use (see below).
> +
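
It might be worth showing what the entry points look like from a pass. Here's
my guess at a minimal legacy-PM client, going by MemorySSAWrapperPass in the
tree - treat the pass name and accessors as assumptions, and the new-PM
analysis presumably looks similar:

  #include "llvm/IR/InstIterator.h"
  #include "llvm/Pass.h"
  #include "llvm/Transforms/Utils/MemorySSA.h" // header location as of this revision

  using namespace llvm;

  namespace {
  struct MemorySSAClient : public FunctionPass {
    static char ID;
    MemorySSAClient() : FunctionPass(ID) {}

    void getAnalysisUsage(AnalysisUsage &AU) const override {
      AU.addRequired<MemorySSAWrapperPass>();
      AU.setPreservesAll();
    }

    bool runOnFunction(Function &F) override {
      MemorySSA &MSSA = getAnalysis<MemorySSAWrapperPass>().getMSSA();

      // Per-instruction lookup, as described earlier.
      for (Instruction &I : instructions(F))
        if (MemoryAccess *MA = MSSA.getMemoryAccess(&I))
          (void)MA; // ... use the access, query dominance, etc.

      // The walker the analysis hands back (next section).
      MemorySSAWalker *Walker = MSSA.getWalker();
      (void)Walker;
      return false;
    }
  };
  } // end anonymous namespace

  char MemorySSAClient::ID = 0;
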
> +
> +The walker
> +----------
> +
> +A structure that helps ``MemorySSA`` do its job is the
> ``MemorySSAWalker``, or
> +the walker, for short. The goal of the walker is to provide answers to
> clobber
> +queries beyond what's represented directly by ``MemoryAccess`` es. For
> example,
> +given:
> +
> +.. code-block:: llvm
> +
> +  define void @foo() {
> +    %a = alloca i8
> +    %b = alloca i8
> +
> +    ; 1 = MemoryDef(liveOnEntry)
> +    store i8 0, i8* %a
> +    ; 2 = MemoryDef(1)
> +    store i8 0, i8* %b
> +  }
> +
> +The store to ``%a`` is clearly not a clobber for the store to ``%b``. It
> would
> +be the walker's goal to figure this out, and return ``liveOnEntry`` when
> queried
> +for the clobber of ``MemoryAccess`` ``2``.
> +
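
To make the contract concrete for myself, I'd expect something like the
following to hold for the example above (sketch only; I'm assuming the
getClobberingMemoryAccess() overload that takes an Instruction):

  #include "llvm/IR/Instruction.h"
  #include "llvm/Transforms/Utils/MemorySSA.h" // header location as of this revision

  using namespace llvm;

  // StoreB is the 'store i8 0, i8* %b' instruction from the example.
  bool storeToBIsUnclobbered(MemorySSA &MSSA, Instruction &StoreB) {
    MemoryAccess *Clobber = MSSA.getWalker()->getClobberingMemoryAccess(&StoreB);
    // A walker that consults alias analysis should see that the store to %a
    // can't clobber %b, keep walking up, and hand back the liveOnEntry def.
    return MSSA.isLiveOnEntryDef(Clobber);
  }
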
> +By default, ``MemorySSA`` provides a walker that can optimize
> ``MemoryDef`` s
> +and ``MemoryUse`` s by consulting alias analysis. Walkers were built to be
> +flexible, though, so it's entirely reasonable (and expected) to create
> more
> +specialized walkers (e.g. one that queries ``GlobalsAA``).
>
>
How is querying GlobalsAA different from "consulting alias analysis"?
Did you mean that the default walker queries some pre-defined AA, but you
can create a walker that queries a different one?


> +
> +Locating clobbers yourself
> +^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +If you choose to make your own walker, you can find the clobber for a
> +``MemoryAccess`` by walking every ``MemoryDef`` that dominates said
> +``MemoryAccess``. The structure of ``MemoryDef`` s makes this relatively
> simple;
> +they ultimately form a linked list of every clobber that dominates the
> +``MemoryAccess`` that you're trying to optimize. In other words, the
> +``definingAccess`` of a ``MemoryDef`` is always the nearest dominating
> +``MemoryDef`` or ``MemoryPhi`` of said ``MemoryDef``.
> +
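
So, if I read this right, a hand-rolled walk for the straight-line case would
look roughly like the sketch below. MayClobber is a hypothetical stand-in for
whatever alias queries your walker would make, and a real walker would also
have to recurse into MemoryPhi operands rather than giving up on them:

  #include "llvm/Transforms/Utils/MemorySSA.h" // header location as of this revision

  using namespace llvm;

  // Hop up the chain of defining accesses until we hit a def that may
  // actually clobber us, a MemoryPhi, or the liveOnEntry sentinel.
  MemoryAccess *walkDefChain(MemorySSA &MSSA, MemoryUseOrDef &Start,
                             bool (*MayClobber)(MemoryDef *)) {
    MemoryAccess *Current = Start.getDefiningAccess();
    while (!MSSA.isLiveOnEntryDef(Current)) {
      auto *Def = dyn_cast<MemoryDef>(Current);
      if (!Def)
        break; // Reached a MemoryPhi.
      if (MayClobber(Def))
        break; // Nearest dominating def that may actually clobber us.
      Current = Def->getDefiningAccess();
    }
    return Current;
  }
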
> +
> +Use optimization
> +----------------
> +
> +``MemorySSA`` will optimize some ``MemoryAccess`` es at build-time.
> +Specifically, we optimize the operand of every ``MemoryUse`` to point to the
> +actual clobber of said ``MemoryUse``. This can be seen in the above
> example; the
> +second ``MemoryUse`` in ``if.end`` has an operand of ``1``, which is a
> +``MemoryDef`` from the entry block.  This is done to make walking,
> +value numbering, etc, faster and easier.
> +It is not possible to optimize ``MemoryDef`` in the same way, as we
> +restrict ``MemorySSA`` to one heap variable and, thus, one Phi node
> +per block.
> +
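
If I follow, the practical upshot for clients is something like this (sketch;
accessor names assumed from MemorySSA.h):

  #include "llvm/Transforms/Utils/MemorySSA.h" // header location as of this revision

  using namespace llvm;

  // Per the text above: a MemoryUse's operand already is its real clobber, so
  // no walking is needed; for a MemoryDef the operand is only the nearest
  // dominating def/phi, so you'd go through the walker to disambiguate.
  MemoryAccess *clobberOf(MemorySSA &MSSA, MemoryUseOrDef &MA) {
    if (isa<MemoryUse>(&MA))
      return MA.getDefiningAccess(); // Optimized at build time.
    return MSSA.getWalker()->getClobberingMemoryAccess(MA.getMemoryInst());
  }
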
> +
> +Invalidation and updating
> +-------------------------
> +
> +Because ``MemorySSA`` keeps track of LLVM IR, it needs to be updated
> whenever
> +the IR is updated. "Update", in this case, includes the addition,
> deletion, and
> +motion of IR instructions. The update API is being made on an as-needed
> basis.
> +
> +
> +Phi placement
> +^^^^^^^^^^^^^
> +
> +``MemorySSA`` only places ``MemoryPhi`` s where they're actually
> +needed. That is, it is a pruned SSA form, like LLVM's SSA form.  For
> +example, consider:
> +
> +.. code-block:: llvm
> +
> +  define void @foo() {
> +  entry:
> +    %p1 = alloca i8
> +    %p2 = alloca i8
> +    %p3 = alloca i8
> +    ; 1 = MemoryDef(liveOnEntry)
> +    store i8 0, i8* %p3
> +    br label %while.cond
> +
> +  while.cond:
> +    ; 3 = MemoryPhi({%0,1},{if.end,2})
> +    br i1 undef, label %if.then, label %if.else
> +
> +  if.then:
> +    br label %if.end
> +
> +  if.else:
> +    br label %if.end
> +
> +  if.end:
> +    ; MemoryUse(1)
> +    %1 = load i8, i8* %p1
> +    ; 2 = MemoryDef(3)
> +    store i8 2, i8* %p2
> +    ; MemoryUse(1)
> +    %2 = load i8, i8* %p3
> +    br label %while.cond
> +  }
> +
> +Because we removed the stores from ``if.then`` and ``if.else``, a ``MemoryPhi``
> +for ``if.end`` would be pointless, so we don't place one. Hence, if you need to
> +place a ``MemoryDef`` in ``if.then`` or ``if.else``, you'll also need to create
> +a ``MemoryPhi`` for ``if.end``.
> +
> +If it turns out that this is a large burden, we can just place
> ``MemoryPhi`` s
> +everywhere. Because we have Walkers that are capable of optimizing above
> said
> +phis, doing so shouldn't prohibit optimizations.
> +
> +
> +Non-Goals
> +---------
> +
> +``MemorySSA`` is meant to reason about the relation between memory
> +operations, and enable quicker querying.
> +It isn't meant to be the single source of truth for all potential
> memory-related
> +optimizations. Specifically, care must be taken when trying to use
> ``MemorySSA``
> +to reason about atomic or volatile operations, as in:
> +
> +.. code-block:: llvm
> +
> +  define i8 @foo(i8* %a) {
> +  entry:
> +    br i1 undef, label %if.then, label %if.end
> +
> +  if.then:
> +    ; 1 = MemoryDef(liveOnEntry)
> +    %0 = load volatile i8, i8* %a
> +    br label %if.end
> +
> +  if.end:
> +    %av = phi i8 [0, %entry], [%0, %if.then]
> +    ret i8 %av
> +  }
> +
> +Going solely by ``MemorySSA``'s analysis, hoisting the ``load`` to
> ``entry`` may
> +seem legal. Because it's a volatile load, though, it's not.
> +
> +
> +Design tradeoffs
> +----------------
> +
> +Precision
> +^^^^^^^^^
> +``MemorySSA`` in LLVM deliberately trades off precision for speed.
> +Let us think about memory variables as if they were disjoint partitions of the
> +heap (that is, if you have one variable, as above, it represents the entire
> +heap, and if you have multiple variables, each one represents some
> +disjoint portion of the heap).
> +
> +First, because alias analysis results conflict with each other, and
> +each result may be what an analysis wants (IE
> +TBAA may say no-alias, and something else may say must-alias), it is
> +not possible to partition the heap the way every optimization wants.
>

I think the start and the end of this sentence are orthogonal.
It's true that different optimizations may want different levels of
precision, but I don't think must-alias/no-alias conflicts are a good
motivation. Ideally, two correct alias analyses should not return
conflicting results. The idea behind the old AA stack is that we have a
lattice where "May < No" and "May < Must", and going through the stack only
moves you upwards. TBAA is a special case, because we sort-of "ignore" TBAA
when not in strict-aliasing mode, but I think, conceptually, the right way
to look at this is that w/o strict-aliasing, the TBAA no-alias is "wrong".


> +Second, some alias analysis results are not transitive (IE A noalias B,
> +and B noalias C, does not mean A noalias C), so it is not possible to
> +come up with a precise partitioning in all cases without variables to
> +represent every pair of possible aliases.  Thus, partitioning
> +precisely may require introducing at least N^2 new virtual variables,
> +phi nodes, etc.
> +
> +Each of these variables may be clobbered at multiple def sites.
> +
> +To give an example, if you were to split up struct fields into
> +individual variables, all aliasing operations that may-def multiple struct
> +fields will may-def more than one of them.  This is pretty common (calls,
> +copies, field stores, etc).
> +
> +Experience with SSA forms for memory in other compilers has shown that
> +it is simply not possible to do this precisely, and in fact, doing it
> +precisely is not worth it, because now all the optimizations have to
> +walk tons and tons of virtual variables and phi nodes.
> +
> +So we partition.  At the point at which you partition, again,
> +experience has shown us there is no point in partitioning to more than
> +one variable.  It simply generates more IR, and optimizations still
> +have to query something to disambiguate further anyway.
> +
> +As a result, LLVM partitions to one variable.
> +
> +Use Optimization
> +^^^^^^^^^^^^^^^^
> +
> +Unlike other partitioned forms, LLVM's ``MemorySSA`` does make one
> +useful guarantee - all loads are optimized to point at the thing that
> +actually clobbers them. This gives some nice properties.  For example,
> +for a given store, you can find all loads actually clobbered by that
> +store by walking the immediate uses of the store.
>
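
Which, if I understand the guarantee, gives you something like the following
essentially for free (sketch; accessor names again assumed):

  #include "llvm/ADT/SmallVector.h"
  #include "llvm/IR/Instruction.h"
  #include "llvm/Transforms/Utils/MemorySSA.h" // header location as of this revision

  using namespace llvm;

  // Every MemoryUse whose operand is this store's MemoryDef is a load the
  // store actually clobbers.
  SmallVector<Instruction *, 8> loadsClobberedBy(MemorySSA &MSSA,
                                                 Instruction &Store) {
    SmallVector<Instruction *, 8> Loads;
    if (MemoryAccess *DefMA = MSSA.getMemoryAccess(&Store))
      for (User *U : DefMA->users())
        if (auto *MU = dyn_cast<MemoryUse>(U))
          Loads.push_back(MU->getMemoryInst());
    return Loads;
  }
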
> Modified: llvm/trunk/docs/index.rst
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/index.rst?rev=278875&r1=278874&r2=278875&view=diff
> ==============================================================================
> --- llvm/trunk/docs/index.rst (original)
> +++ llvm/trunk/docs/index.rst Tue Aug 16 19:17:29 2016
> @@ -235,6 +235,7 @@ For API clients and LLVM developers.
>     :hidden:
>
>     AliasAnalysis
> +   MemorySSA
>     BitCodeFormat
>     BlockFrequencyTerminology
>     BranchWeightMetadata
> @@ -291,6 +292,9 @@ For API clients and LLVM developers.
>     Information on how to write a new alias analysis implementation or how
> to
>     use existing analyses.
>
> +:doc:`MemorySSA`
> +   Information about the MemorySSA utility in LLVM, as well as how to use
> it.
> +
>  :doc:`GarbageCollection`
>     The interfaces source-language compilers should use for compiling GC'd
>     programs.
>
>