<br><br><div class="gmail_quote">On Tue Dec 02 2014 at 10:03:23 AM Andrew Trick <<a href="mailto:atrick@apple.com">atrick@apple.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><blockquote type="cite"><div>On Dec 1, 2014, at 3:22 PM, Philip Reames <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>> wrote:</div><br><div>
<div bgcolor="#FFFFFF" text="#000000">
<br>
<div>On 12/01/2014 02:42 PM, Andrew Trick
wrote:<br>
</div>
<blockquote type="cite">
<br>
<div>
<blockquote type="cite">
<div>On Dec 1, 2014, at 2:21 PM, Philip Reames <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>>
wrote:</div>
<br>
<div>
<div bgcolor="#FFFFFF" text="#000000"> <br>
<div>On 12/01/2014 11:14 AM,
Andrew Trick wrote:<br>
</div>
<blockquote type="cite">
<br>
<div>
<blockquote type="cite">
<div>On Oct 21, 2014, at 4:03 PM, Philip
Reames <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>>
wrote:</div>
<br>
<div><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important">Sanjoy
made a good point. We don't actually need a new
variant of "invariant.start". Simply using an
invariant.start with no uses gives us a notion
of an invariant region with no end. (Since the
result doesn't escape, there can be no end
hidden inside a function call.) This seems
like a simple notion to exploit and is strict
step forward from where we are, without a new
intrinsic.</span></div>
</blockquote>
</div>
<br>
<div>
<div>I'm coming back to this because it is a
sticking point for me in all of the proposals that
were floated informally on the commits list. I would
really like this to work, but can't prove that it's
safe.</div>
</div>
</blockquote>
As I said in the other thread, !invariant.load and
llvm.invariant.* are not the same. This thread is
discussing the later, not the former. The other thread is
discussing the former, but not the later. <br>
<blockquote type="cite">
<div>
<div><br>
</div>
<div>Given:</div>
<div><br>
</div>
<div>%a = newArray()</div>
<div>%pLen = gep(%a, <length offset>)</div>
<div>invariant.start(<size>, %pLen)</div>
<div>%len = load %pLen</div>
<div><br>
</div>
<div>%o = foo() // may deallocate %a</div>
<div>store %something</div>
<div>load %o</div>
<div><br>
</div>
<div>I want to claim that the LLVM optimizer
cannot do the following (but don't have a complete
line of reasoning yet):</div>
<div><br>
</div>
<div>%a = newArray()</div>
<div>%pLen = gep(%a, <length offset>)</div>
<div>invariant.start(<size>, %pLen)</div>
<div>%len = load %pLen</div>
<div><br>
</div>
<div>%o = foo() // may deallocate %a</div>
<div>if ptrtoint(%o) == ptrtoint(%pLen)</div>
<div> load %pLen</div>
<div> store %something // Oops... this might
alias</div>
<div>else</div>
<div> store %something</div>
<div> load %o</div>
</div>
</blockquote>
Just to make sure we're on the same page, it *would* be
legal for the optimizer to construct:<br>
<div>%a = newArray()</div>
<div>%pLen = gep(%a, <length offset>)</div>
<div>invariant.start(<size>, %pLen)</div>
<div>%len = load %pLen</div>
<div><br>
</div>
<div>%o = foo() // may deallocate %a</div>
<div>if ptrtoint(%o) == ptrtoint(%pLen)</div>
<div> store %something // Oops... this might
alias<br>
load %pLen <--- This is now after the store<br>
</div>
<div>else</div>
<div> store %something</div>
<div> load %o<br>
<br>
Are we in agreement here? <br>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
It’s fine in the sense that memory access has not been
reordered… yet.</div>
</blockquote>
I think I've actually convinced myself my example is not okay, but
for a completely different reason. :) See my last comment below.<br>
<blockquote type="cite">
<div><br>
</div>
<div>But is there an implicit invariant.end(%pLen) at the call to
foo(), at which point the array can be freed and another object
allocated in its place? I don’t think so. </div>
</blockquote>
Given the current semantics, if the result of the invariant.start is
not passed to foo, we can assume foo does not contain an
invariant.end. However, for foo to allocate a new object in the
same memory, it must write to that memory. Doing so without ending
the previous invariant region is ill defined. As such, your example
isn't a valid test. <br>
<br>
We have two cases:<br>
a) The invariant region hasn't ended. As a result, the input IR is
ill defined and the problematic reordering can happen. <br>
b) The invariant region ended (and we either encountered a
llvm.invariant.end, or saw the result of invariant.start passed to
foo). In this case, the location is no longer invariant and the
reordering isn't legal and wouldn't happen.<br>
<br>
As I think came up before, this would imply that an invariant.end
can't be dropped without dropping it's corresponding
invariant.start. That's problematic. <br>
<br>
<blockquote type="cite">
<div>The way I interpreted the conclusion of this email thread is
that invariant.start can be used without invariant.end. </div>
</blockquote>
We had a proposal for this, but I don't believe it's actually been
implemented yet. I believe this proposal is still consistent with
everything I said above though. <br>
<blockquote type="cite">
<div>Since the token returned by the intrinsic never escapes, you
can just assume the memory is invariant at all uses. The problem
here is that the optimizer could (in theory) introduce new uses
of the pointer after the object lifetime. This is the same
problem that you yourself have raised. I hoped we could sidestep
the problem because it crops up with any representation that
attaches invariance to a pointer value. If we have an answer for
this, then we can easily debate other representations like
broadening !invariant.load metadata semantics and introducing
new intrinsics.</div>
</blockquote>
I agree there is a general issue here. However, I don't actually
think it's an issue of invariance. Instead, I see this as being a
problem of *derefenceability*. Regardless of the 'invariant-ness'
of the memory in question, it's problematic for the optimizer to
insert uses after a deallocation because the memory transitions from
dereferenceable to non-dereferenceable. (i.e. Consider a malloc
routine allocates one page per object, and a free which protects the
page freed.) Any transformation which inserts such a load is
dangerous. <br>
<br>
What your example really shows is that we can have two names for the
same *memory address*. One of those names can be dereferenceable
while the other is not. <br>
<br>
The dynamic check of the pointer values essentially allows the
optimizer to chose which of the two names for the same physical
address it wants to use. I think that is valid, but it's odd to say
the least. The particular example you've shown isn't legal since
the optimizer is choosing to insert a load to a non-dereferenceable
location and thus could introduce a fault.</div></div></blockquote><div><br></div></div></div><div style="word-wrap:break-word"><div>Sorry, I didn’t follow your response. I was showing that the optimizer can theoretically substitute the use of one pointer value (%o) with another pointer value (%pLen) after checking value equivalence. Doing that creates multiple uses of the same pointer value where not all uses dereference the same memory object. To me, that means we cannot associate a pointer value/address with an object-specific attribute, like invariance, without tying it to a closed lifetime.</div></div></blockquote><div><br></div><div><br></div><div>Agreed.</div><div><br></div><div><br>This is actually one of the nice things memory ssa provides/provided GCC. You got versioning of the global memory state for free, and lifetime info embedded directly into the IR in the normal SSA way.</div><div><br></div><div>For a given instruction, the state of the virtual uses for the instruction was sufficient to identify whether the memory state was the same, so value substitution was easy.</div><div><br></div><div>You could also do the inverse, and prove a better result than memory ssa already had given you - that the virtual uses were really the same due to non-aliasing.</div><div><br></div><div>This made value numbering loads/stores and substitution very easy.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div>I wish we could prevent the optimizer from doing this.</div><div><br></div><div>-Andy</div></div><div style="word-wrap:break-word"><div><br></div></div>
</blockquote></div>