[PATCH] Rename determineInsertionPoints to determinePHINodes, expose it so it can be reused

Mon Apr 20 16:47:11 PDT 2015

>
> That is what I meant sorry.
> A list of all the uses and defs related to the “variable” we want to SSAify.
> And I mean instructions, not block. We can drop the blocks.

So, two concerns:

1. This will not work for the other current user.  MemorySSA's
defs/uses are not descended from Instruction (or even Value), because
it is not directly part of the IR. We have nothing like Go interfaces,
where we could say "you can pass in anything you can call getParent
on".

Let me take a step back and show you what i am trying to accomplish.
If you look at memoryssa, over at http://reviews.llvm.org/D7864, you
can see MemorySSA.h defines simple def/use/phi (this was the broad
consensus on what to do - not try to make this part of the IR)
classes.

If you look at MemorySSA.cpp, it uses a version of
determineInsertionPoint to do it's phi insertion (it does not do
live-in calculation, because it's not needed, interestingly enough,
leading me to suggest a compromise below :P)

Hence, my refactoring has been about "making this algorithm only care
about the CFG", which leads to

2.  This transforms the algorithm from something that depends only on
local CFG properties, to one that depends on instruction level IR and
knowing things about it.  I have to admit i don't see the appreciable
benefit we get from doing this.

>
> the basic blocks they belong to,
>
>
> getParent(), no need for extra info ;).
>
> and
> their relative positions (so it knows if def happens before use in a
> given block).
>
>
> This can be computed: is InstA dominates InstB. Currently I believe this
> check is expensive, but that would be a good use case of improving that :).
>

Same problem here (dt->dominates will not work on memoryssa defs/uses).
Ignoring that, improving it  in the current situation is not possible
without having a dominators API that can do instruction-level
invalidation.

If BasicBlock's stored arrays of actual instructions instead of lists
(IE it could guarantee that pointer to def < pointer to use -> def
comes before use) , it would be possible.
But otherwise, in our current scheme of who owns what, Dominators
would have to to store a local numbering of instructions so it could
be reused. While you can play games here to make this not have to
update as often (if you are willing to limit the number of
instructions per block to 40 million, you can skip-number the local
instructions by 100 so that you can insert 100 instructions in between
them and not need to recalculate everything), you can't avoid updating
at all, and you *need* to know when new instructions have been
inserted to maintain the ordering, because you need to know the second
you need to recalculate, or dominates gives wrong answers.

Unless i'm missing something?

Don't get me wrong, i think having such a thing would be really cool
and useful for other reasons, and i'm in fact, happy to run around
doing the work if we like.  But it seems like a pretty major change
that would require a bunch of thinking and discussion, because i'm
pretty sure if our answer becomes "we should be able to be O(1) for
dominates(def, use)", the answer is "we should incorporate that info
into the basic blocks, so that dominates doesn't have to observe every
instruction change" (or something similar that does not require
modifying everything in LLVM that inserts instructions :P)

Note also that the way computeLiveIn works right now is to stop
processing each block when it's determined whether it's a defbeforeuse
or a using block.  This requires only looking at the first def *or*
first use of a variable in each block.    Your dominates mechanism
would take significantly more time than this, as it would require
touching *every def or use of the variable* at least once, in order to
put it in order.

It would look like this currently:

sort defs and uses into block
sort uses by order in their block
sort defs by order in their block

Use current livein calculation algorithm.

(IE touch everything, then do what we do now)

The current algorithm is

Live-in blocks = all using blocks.

for each using block:
  if it's not a defining block, continue

  for each instruction in block
    if i see use of variable:
       continue to next block, this block is live in
    if i see def of variable:
       remove block from live in, continue to next block.

<dataflow calculation to propagate>

As you can see, this algorithm touches less things.

(For comparison, given defbeforeuse and usingblocks, it's *only* a
datatflow calculation with no IR walking in the live in computation)

You can get around this if you require the defs/uses be given to us in
sorted order.  But that is the same as calculating defbeforeuse and
using blocks :)

>
> I think the list of def / use is easier to get from the user perspective and
> let the algo handles all the gory details.
>
> What about giving more granularity in the API?

So, let me propose we have an end-result API that looks something like this:

(Let's call it IDFComputation instead of determineInsertionPoints or
determinePHIPlacement. It doesn't actually do insertion, just Iterated
Dominance Frontier calculation. The caller inserts phi nodes at the
IDF points):

class IDFComputation {

// This constructor takes the minimum needed to do a correct
computation without livein
IDFComputation(DT, DefiningBlocks); // defaults to uselivein=false

void setLiveIn(DT, <blocks or lists of uses depending on the above>);
// Computes livein
void setNoLiveIn();  // Changes uselivein to false, frees whatever
livein info  exists

void calculate(IDFBlocks as output parameter).  // calculate the IDF.
Internally computes livein, domlevels and bbnumbers if necessary.

private:
bool uselivein;
DenseMap DomLevels;
DenseMap BBNumbers;
DominatorTree DT
}

The reasoning for this API is that
1. We have users who do not care about livein (it's only necessary to
compute pruned SSA).  MemorySSA does not care to compute live in, nor
do most users (if you wanted to use this to compute SSI or whatever).
It's only useful to compute pruned SSA.
2. We can cache the result of most of calculate's internal
computations if the set of livein blocks and the CFG does not change
between computations.

#2 is clearly trivially true for things that don't care about live-in :)

>
> We could expose a DT, Defs<Inst>, Uses<Inst>, like API.
> But also the DT, Defs<Block>, LiveInBlock, API with some helpers to compute
> the different information, like LiveInBlock, DomLevel etc.
>
> We can also envision a version that actually updates the uses to the proper
> inserted PHIs.
>
> I may go to far in the refactoring as maybe I missed the short term goal.
>
> The bottom line is I do not want to hold on the commit if it suits whatever
> the short term usages are, if other people agrees. We can rework that as the
> new users are exposed :).
>
> Cheers,
> -Quentin
>