[llvm-dev] [RFC] Writing loop transformations on the right representation is more productive

Sat Jan 25 17:30:22 PST 2020

Am Mi., 15. Jan. 2020 um 23:55 Uhr schrieb Renato Golin <rengolin at gmail.com>:
> On Thu, 16 Jan 2020 at 06:27, Chris Lattner via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > Ah yes, I see what you mean.  One way to do that is to represent multiple options as an op with region for each option.  This means you only fork the part of the IR that you’re producing variants of.  I think this is the red/green tree technique you mentioned, but I’m not sure.
>
> IIUC. the green tree is a lighter version of the tree (leaner memory
> footprint) but still entire tree. It's ok to lose that info because
> you don't usually need that for your transformations, and you can
> always go back to the red tree (via pointer in the green tree) to ask
> harder questions.

The green tree would contain the "heaver" object as they do not need
to be copied that often. The red tree is necessary for operations
depending on the ancestry/surrounding code. As an example, looking
whether a loop is actually a subloop of an outer loop, which it could
interchanged with.

> Managing the semantics of the two becomes
> non-trivial when you start adding and replacing nodes, there's a point
> where you can't go back anymore to the red tree in the same way.

The model is that the trees are immutable, hence no cost of managing
node replacement. Node only point to their children, their parent (red
tree) of the same code version and potentially the nodes they
originate from. When creating a new version, existing nodes are not
updated.

(to avoid unnecessary new versions, chains of transformations -- like
unroll-then-jam -- may modify existing nodes; if guaranteed that no
reference to them has been passed)
.

> What I referred to as "journalling" is what you propose here. Add
> metadata to the actual graph and during the heuristic search, only
> clone those. If you make sure you can append those nodes to the graph,
> and guarantee that the extra nodes are composable (ie semantically
> valid in any order they may be applied), then having the original
> graph + any intermediate state is valid. Therefore, keeping only the
> extra nodes, and copying them along to try different routes, becomes
> even cheaper than a green tree.
>
> If those extra nodes are an MLIR dialect, with defined semantics and
> structured composition, then using them in an heuristics search
> produces semantically valid intermediate states and lightens the
> burden of proof for every little step.

Journaling (I assume in the sense of filesystemes and databases: keep
a log of items before some change so the change can be rolled back) is
an interesting idea. I think it comes with drawbacks:

 * Only one 'head' version is active at time. Switch to a different
version requires rolling-back one version and re-applying another. No
two versions are active at the same time.

 * Every change/roll-back invalidates analyses (or analyses have to be
taught about journaling and keep previous results in memory)

 * Cannot re-use subtrees (e.g. unrolling referencing the same subtree
factor-times)

 * Potential abuse of the IR (in the sense
https://lists.llvm.org/pipermail/llvm-dev/2017-October/118258.html)

You mentioned intermediate states remaining valid, so you might have a
different kind of journalling in mind. However, due to the strong
coupling (e.g. use-def chains), I don't think this is possible. Please
elaborate more on your idea.

An analogy would be journaling and cow-btree nodes in Btrfs: Their
purposes are quite different.

Michael