[PATCH] Rename determineInsertionPoints to determinePHINodes, expose it so it can be reused
Quentin Colombet
qcolombet at apple.com
Mon Apr 20 17:30:29 PDT 2015
Hi Daniel,
I like the direction this is taking. A few comments inlined.
Cheers,
-Quentin
> On Apr 20, 2015, at 4:47 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
>
>>
>> That is what I meant sorry.
>> A list of all the uses and defs related to the “variable” we want to SSAify.
>> And I mean instructions, not block. We can drop the blocks.
>
> So, two concerns:
>
> 1. This will not work for the other current user. MemorySSA's
> defs/uses are not descended from Instruction (or even Value), because
> it is not directly part of the IR. We have nothing like Go interfaces,
> where we could say "you can pass in anything you can call getParent
> on".
>
> Let me take a step back and show you what i am trying to accomplish.
> If you look at memoryssa, over at http://reviews.llvm.org/D7864, you
> can see MemorySSA.h defines simple def/use/phi (this was the broad
> consensus on what to do - not try to make this part of the IR)
> classes.
>
> If you look at MemorySSA.cpp, it uses a version of
> determineInsertionPoint to do it's phi insertion (it does not do
> live-in calculation, because it's not needed, interestingly enough,
> leading me to suggest a compromise below :P)
I believe that could work, but is not worth the burden on the MemorySSA class.
>
> Hence, my refactoring has been about "making this algorithm only care
> about the CFG", which leads to
>
> 2. This transforms the algorithm from something that depends only on
> local CFG properties, to one that depends on instruction level IR and
> knowing things about it. I have to admit i don't see the appreciable
> benefit we get from doing this.
>
>
>>
>> the basic blocks they belong to,
>>
>>
>> getParent(), no need for extra info ;).
>>
>> and
>> their relative positions (so it knows if def happens before use in a
>> given block).
>>
>>
>> This can be computed: is InstA dominates InstB. Currently I believe this
>> check is expensive, but that would be a good use case of improving that :).
>>
>
> Same problem here (dt->dominates will not work on memoryssa defs/uses).
> Ignoring that, improving it in the current situation is not possible
> without having a dominators API that can do instruction-level
> invalidation.
>
> If BasicBlock's stored arrays of actual instructions instead of lists
> (IE it could guarantee that pointer to def < pointer to use -> def
> comes before use) , it would be possible.
Well, we currently do it by just walking the list of instructions to see which one comes first.
That’s not efficient, but that is possible :).
Anyhow, let just give up on that.
> But otherwise, in our current scheme of who owns what, Dominators
> would have to to store a local numbering of instructions so it could
> be reused. While you can play games here to make this not have to
> update as often (if you are willing to limit the number of
> instructions per block to 40 million, you can skip-number the local
> instructions by 100 so that you can insert 100 instructions in between
> them and not need to recalculate everything), you can't avoid updating
> at all, and you *need* to know when new instructions have been
> inserted to maintain the ordering, because you need to know the second
> you need to recalculate, or dominates gives wrong answers.
>
> Unless i'm missing something?
>
> Don't get me wrong, i think having such a thing would be really cool
> and useful for other reasons, and i'm in fact, happy to run around
> doing the work if we like. But it seems like a pretty major change
> that would require a bunch of thinking and discussion, because i'm
> pretty sure if our answer becomes "we should be able to be O(1) for
> dominates(def, use)", the answer is "we should incorporate that info
> into the basic blocks, so that dominates doesn't have to observe every
> instruction change" (or something similar that does not require
> modifying everything in LLVM that inserts instructions :P)
Agreed. It’s been a while since I had envisioned this enhancement, but so far I admit nothing proved worth doing it.
>
>
> Note also that the way computeLiveIn works right now is to stop
> processing each block when it's determined whether it's a defbeforeuse
> or a using block. This requires only looking at the first def *or*
> first use of a variable in each block. Your dominates mechanism
> would take significantly more time than this, as it would require
> touching *every def or use of the variable* at least once, in order to
> put it in order.
>
> It would look like this currently:
>
> sort defs and uses into block
> sort uses by order in their block
> sort defs by order in their block
>
> Use current livein calculation algorithm.
>
> (IE touch everything, then do what we do now)
>
> The current algorithm is
>
> Live-in blocks = all using blocks.
>
> for each using block:
> if it's not a defining block, continue
>
> for each instruction in block
> if i see use of variable:
> continue to next block, this block is live in
> if i see def of variable:
> remove block from live in, continue to next block.
>
> <dataflow calculation to propagate>
>
> As you can see, this algorithm touches less things.
>
> (For comparison, given defbeforeuse and usingblocks, it's *only* a
> datatflow calculation with no IR walking in the live in computation)
>
> You can get around this if you require the defs/uses be given to us in
> sorted order. But that is the same as calculating defbeforeuse and
> using blocks :)
>
>
>>
>> I think the list of def / use is easier to get from the user perspective and
>> let the algo handles all the gory details.
>>
>> What about giving more granularity in the API?
>
> So, let me propose we have an end-result API that looks something like this:
>
> (Let's call it IDFComputation instead of determineInsertionPoints or
> determinePHIPlacement. It doesn't actually do insertion, just Iterated
> Dominance Frontier calculation. The caller inserts phi nodes at the
> IDF points):
>
>
>
> class IDFComputation {
>
> // This constructor takes the minimum needed to do a correct
> computation without livein
> IDFComputation(DT, DefiningBlocks); // defaults to uselivein=false
>
> void setLiveIn(DT, <blocks or lists of uses depending on the above>);
> // Computes livein
> void setNoLiveIn(); // Changes uselivein to false, frees whatever
> livein info exists
I do not see the benefit in having the ability to change the LiveIn instead of being part of the constructor.
Could you elaborate on the use case you have in mind.
>
> void calculate(IDFBlocks as output parameter). // calculate the IDF.
> Internally computes livein, domlevels and bbnumbers if necessary.
Return value instead of output parameter?
For RVO.
>
> private:
> bool uselivein;
> DenseMap DomLevels;
> DenseMap BBNumbers;
> DominatorTree DT
> }
>
>
> The reasoning for this API is that
> 1. We have users who do not care about livein (it's only necessary to
> compute pruned SSA). MemorySSA does not care to compute live in, nor
> do most users (if you wanted to use this to compute SSI or whatever).
> It's only useful to compute pruned SSA.
> 2. We can cache the result of most of calculate's internal
> computations if the set of livein blocks and the CFG does not change
> between computations.
But since we cannot change the defining blocks, I guess there is not much to cache between two different users, isn’t?
>
> #2 is clearly trivially true for things that don't care about live-in :)
>
>
>
>>
>> We could expose a DT, Defs<Inst>, Uses<Inst>, like API.
>> But also the DT, Defs<Block>, LiveInBlock, API with some helpers to compute
>> the different information, like LiveInBlock, DomLevel etc.
>>
>> We can also envision a version that actually updates the uses to the proper
>> inserted PHIs.
>>
>> I may go to far in the refactoring as maybe I missed the short term goal.
>>
>> The bottom line is I do not want to hold on the commit if it suits whatever
>> the short term usages are, if other people agrees. We can rework that as the
>> new users are exposed :).
>>
>> Cheers,
>> -Quentin
>>
More information about the llvm-commits
mailing list