<div dir="ltr">Let me clarify my <span style="font-size:13px">last comment, since I hit send too early. This was a bug i already fixed. Case 1 should also be checking that there are no stores below. If there are, it should be doing case 2. In essence, the check for case two should happen before the check for case 1 (since it's not possible to have stores below without a phi node existing)</span></div><div dir="ltr"><span style="font-size:13px"><br></span></div><div dir="ltr"><span style="font-size:13px">This will cause your phi node to get changed to memphi(3, 1), and do the right thing.</span></div><div dir="ltr"><span style="font-size:13px"><br></span></div><div dir="ltr"><span style="font-size:13px">I have a testcase for this.</span></div><div dir="ltr"><span style="font-size:13px"><br></span></div><span>
</span><br><div class="gmail_quote"><div dir="ltr">On Sat, Jul 9, 2016, 7:11 AM Daniel Berlin <<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Sat, Jul 9, 2016 at 7:00 AM, Geoff Berry <span dir="ltr"><<a href="mailto:gberry@codeaurora.org" target="_blank">gberry@codeaurora.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">gberry added a comment.<br>
<br>
I'm personally fine with committing this as is since it's is disabled, though I don't feel I have the authority to approve it.<br></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>FWIW: I've dropped it on the floor because i simply don't have more time to work on trying to improve this past the months i've put in already, and am going to spend my efforts elsewhere.</div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
I do have a couple more comments/questions below, mainly to help me gain confidence that my understanding of MemorySSA is correct.<br></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Happy to help</div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<span><br>
<br>
================<br>
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:799<br>
</span>@@ +798,3 @@<br>
<span>+<br>
+ while (!Res && LookupIter != LookupResult.second) {<br>
</span><span>+ LoadInst *Load0 = LookupIter->first;<br>
----------------<br>
</span><span>dberlin wrote:<br>
> gberry wrote:<br>
> > Isn't this still O(M^2) if e.g. all of the loads have the same clobbering access and the same types? I guess the limit above controls how big M can get though.<br>
> If that was the case, all the loads in that block are equal (because they also passed isSameOperation, etc) and GVN should have removed them already :)<br>
><br>
> Note that neither the mssa version or the non-mssa version will handle the case of two identical loads in one side of the diamond, and one load in the other. It will only hoist/merge one of identical ones.<br>
</span>Is that right (the bit about all the loads being equal)?<br>
<br>
I'm thinking of a case like the below where all of the loads pass isSameOperation and all have the call as their clobbering access. In this case you'll end up comparing all pairs of loads (assuming none of the loads must alias).<br></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Yes. The first thing one would do as an improvement here is also hash the memory operand on the theory that if there were two truly identical loads in the same block, GVN would have removed them.</div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
call foo<br>
br %c, %then, %else<br>
<br>
then:<br>
load i32, %p1<br>
load i32, %p2<br>
load i32, %p3<br>
...<br>
br %end<br>
<br>
else:<br>
load i32, %p4<br>
load i32, %p5<br>
load i32, %p6<br>
...<br>
br %end<br>
<br>
end:<br>
<br>
================<br>
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:1011<br>
@@ +1010,3 @@<br>
+ BBNode->getDFSNumOut()};<br>
+ // First see if it's outside the dominator tree range<br>
+ // The only way it could affect us is if it's below us in the dominator<br>
----------------<br>
Do you even need the domtree info here to check for barriers in a diamond? </blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>We need to know if there is another store below us in the block.</div><div>The only thing we have is the uses.</div><div>We could walk the rest of the instructions, but it's usually faster to check the set of uses, because you don't know how long the block is.</div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> There are references to hammocks, but as far as I can tell we never actually attempt to optimize hammocks. If that is the case, couldn't this just check for barriers in the same block as Start?<br></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>The check for barriers is whether there are things in either side of the diamond, above or below the load/store (respectively), preventing us from hoisting/sinking.<br></div><div>There are two types of barriers:</div><div><br></div><div>1. The thing we are clobbered by - in the case of loads this is an O(1) check, in the case of stores, that is what he loop is doing - seeing where the things that clobber the store are.</div><div>2. unwinding. LLVM does not explicitly represent the control flow these blocks may actually have (GCC, on the other hand, would have an EH edge from these blocks), so you have to look.</div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<span><br>
================<br>
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:1052<br>
@@ +1051,3 @@<br>
+ //<br>
+ // 1. There are no stores above S0 or S1 in either block.<br>
+ // In this case, no phi node is necessary, both MemoryDefs must have the same<br>
----------------<br>
</span><span>Gerolf wrote:<br>
> I continue struggling with this function. For now I just need some help understanding the comment.<br>
> When you say "above" S0, S1 you refer to DFS numbering like Sx is above S0 in this example:<br>
><br>
> 0 Sx<br>
> 1<br>
> 2 S0<br>
><br>
> Correct? This would be consistent with your DFS explanation in a previous explanation.<br>
</span>I think there is a bug in the case 1 scenario if there is a pre-existing phi in the bottom of the diamond that references just one of the sunk stores.<br>
For example:<br>
<br>
; 1 = MemDef ...<br>
call foo<br>
br %c %then, %else<br>
<br>
then:<br>
; 2 = MemDef(1)<br>
store @A<br>
; 3 = MemDef(2)<br>
store @B<br>
br %end<br>
<br>
else:<br>
; 4 = MemDef(1)<br>
store @A<br>
br %end<br>
<br>
end:<br>
5 = MemPhi(3, 4)<br>
<br>
I believe in this case, after updating the end block will look like:<br>
<br>
end:<br>
5 = MemPhi(3, 6)<br>
; 6 = MemDef(1)<br>
store @A<br>
<br></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>I believe there is a testcase that tests this. <br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Which seems wrong, since the phi is before it's use, but also it seems like having 6's defining access skip over the phi 5 and def 3 could cause trouble, though I'm less sure about the latter.<br>
<br>
<br>
<a href="http://reviews.llvm.org/D8688" rel="noreferrer" target="_blank">http://reviews.llvm.org/D8688</a><br>
<br>
<br>
<br>
</blockquote></div></div></div></blockquote></div>