[PATCH] D8688: Update MergedLoadStoreMotion to use MemorySSA

Fri Jul 8 14:31:37 PDT 2016

Let me clarify my last comment, since I hit send too early. This was a bug
i already fixed.  Case 1 should also be checking that there are no stores
below. If there are, it should be doing case 2. In essence, the check for
case two should happen before the check for case 1 (since it's not possible
to have stores below without a phi node existing)

This will cause your phi node to get changed to memphi(3, 1), and do the
right thing.

I have a testcase for this.

On Sat, Jul 9, 2016, 7:11 AM Daniel Berlin <dberlin at dberlin.org> wrote:

> On Sat, Jul 9, 2016 at 7:00 AM, Geoff Berry <gberry at codeaurora.org> wrote:
>
>> gberry added a comment.
>>
>> I'm personally fine with committing this as is since it's is disabled,
>> though I don't feel I have the authority to approve it.
>>
>
> FWIW: I've dropped it on the floor because i simply don't have more time
> to work on trying to improve this past the months i've put in already, and
> am going to spend my efforts elsewhere.
>
>
>
>>
>> I do have a couple more comments/questions below, mainly to help me gain
>> confidence that my understanding of MemorySSA is correct.
>>
>
> Happy to help
>
>
>>
>>
>> ================
>> Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:799
>> @@ +798,3 @@
>> +
>> +  while (!Res && LookupIter != LookupResult.second) {
>> +    LoadInst *Load0 = LookupIter->first;
>> ----------------
>> dberlin wrote:
>> > gberry wrote:
>> > > Isn't this still O(M^2) if e.g. all of the loads have the same
>> clobbering access and the same types? I guess the limit above controls how
>> big M can get though.
>> > If that was the case, all the loads in that block are equal (because
>> they also passed isSameOperation, etc) and GVN should have removed them
>> already :)
>> >
>> > Note that neither the mssa version or the non-mssa version will handle
>> the case of two identical loads in one side of the diamond, and one load in
>> the other. It will only hoist/merge one of identical ones.
>> Is that right (the bit about all the loads being equal)?
>>
>> I'm thinking of a case like the below where all of the loads pass
>> isSameOperation and all have the call as their clobbering access.  In this
>> case you'll end up comparing all pairs of loads (assuming none of the loads
>> must alias).
>>
>
> Yes. The first thing one would do as an improvement here is also hash the
> memory operand on the theory that if there were two truly identical loads
> in the same block, GVN would have removed them.
>
>
>>
>> call foo
>> br %c, %then, %else
>>
>> then:
>> load i32, %p1
>> load i32, %p2
>> load i32, %p3
>> ...
>> br %end
>>
>> else:
>> load i32, %p4
>> load i32, %p5
>> load i32, %p6
>> ...
>> br %end
>>
>> end:
>>
>> ================
>> Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:1011
>> @@ +1010,3 @@
>> +                                             BBNode->getDFSNumOut()};
>> +    // First see if it's outside the dominator tree range
>> +    // The only way it could affect us is if it's below us in the
>> dominator
>> ----------------
>> Do you even need the domtree info here to check for barriers in a
>> diamond?
>
>
> We need to know if there is another store below us in the block.
> The only thing we have is the uses.
> We could walk the rest of the instructions, but it's usually faster to
> check the set of uses, because you don't know how long the block is.
>
>
>
>> There are references to hammocks, but as far as I can tell we never
>> actually attempt to optimize hammocks.  If that is the case, couldn't this
>> just check for barriers in the same block as Start?
>>
>
> The check for barriers is whether there are things in either side of the
> diamond, above or below the load/store (respectively), preventing us from
> hoisting/sinking.
> There are two types of barriers:
>
> 1. The thing we are clobbered by - in the case of loads this is an O(1)
> check, in the case of stores, that is what he loop is doing - seeing where
> the things that clobber the store are.
> 2. unwinding.  LLVM does not explicitly represent the control flow these
> blocks may actually have (GCC, on the other hand, would have an EH edge
> from these blocks), so you have to look.
>
>
>> ================
>> Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:1052
>> @@ +1051,3 @@
>> +  //
>> +  // 1. There are no stores above S0 or S1 in either block.
>> +  // In this case, no phi node is necessary, both MemoryDefs  must have
>> the same
>> ----------------
>> Gerolf wrote:
>> > I continue struggling with this function. For now I just need some help
>> understanding the comment.
>> > When you say "above" S0, S1 you refer to DFS numbering like Sx is above
>> S0 in this example:
>> >
>> > 0 Sx
>> > 1
>> > 2 S0
>> >
>> > Correct? This would be consistent with your DFS explanation in a
>> previous explanation.
>> I think there is a bug in the case 1 scenario if there is a pre-existing
>> phi in the bottom of the diamond that references just one of the sunk
>> stores.
>> For example:
>>
>> ; 1 = MemDef ...
>> call foo
>> br %c %then, %else
>>
>> then:
>> ; 2 = MemDef(1)
>> store @A
>> ; 3 = MemDef(2)
>> store @B
>> br %end
>>
>> else:
>> ; 4 = MemDef(1)
>> store @A
>> br %end
>>
>> end:
>> 5 = MemPhi(3, 4)
>>
>> I believe in this case, after updating the end block will look like:
>>
>> end:
>> 5 = MemPhi(3, 6)
>> ; 6 = MemDef(1)
>> store @A
>>
>>
> I believe there is a testcase that tests this.
>
>> Which seems wrong, since the phi is before it's use, but also it seems
>> like having 6's defining access skip over the phi 5 and def 3 could cause
>> trouble, though I'm less sure about the latter.
>>
>>
>> http://reviews.llvm.org/D8688
>>
>>
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160708/aae8b182/attachment.html>