[LLVMdev] Optimization hints for "constant" loads

Tue Dec 2 09:55:53 PST 2014

> On Dec 1, 2014, at 3:22 PM, Philip Reames <listmail at philipreames.com> wrote:
> 
> 
> On 12/01/2014 02:42 PM, Andrew Trick wrote:
>> 
>>> On Dec 1, 2014, at 2:21 PM, Philip Reames <listmail at philipreames.com <mailto:listmail at philipreames.com>> wrote:
>>> 
>>> 
>>> On 12/01/2014 11:14 AM, Andrew Trick wrote:
>>>> 
>>>>> On Oct 21, 2014, at 4:03 PM, Philip Reames <listmail at philipreames.com <mailto:listmail at philipreames.com>> wrote:
>>>>> 
>>>>> Sanjoy made a good point.  We don't actually need a new variant of "invariant.start".  Simply using an invariant.start with no uses gives us a notion of an invariant region with no end.  (Since the result doesn't escape, there can be no end hidden inside a function call.)   This seems like a simple notion to exploit and is strict step forward from where we are, without a new intrinsic.
>>>> 
>>>> I'm coming back to this because it is a sticking point for me in all of the proposals that were floated informally on the commits list. I would really like this to work, but can't prove that it's safe.
>>> As I said in the other thread, !invariant.load and llvm.invariant.* are not the same.  This thread is discussing the later, not the former.  The other thread is discussing the former, but not the later.  
>>>> 
>>>> Given:
>>>> 
>>>> %a = newArray()
>>>> %pLen = gep(%a, <length offset>)
>>>> invariant.start(<size>, %pLen)
>>>> %len = load %pLen
>>>> 
>>>> %o = foo() // may deallocate %a
>>>> store %something
>>>> load %o
>>>> 
>>>> I want to claim that the LLVM optimizer cannot do the following (but don't have a complete line of reasoning yet):
>>>> 
>>>> %a = newArray()
>>>> %pLen = gep(%a, <length offset>)
>>>> invariant.start(<size>, %pLen)
>>>> %len = load %pLen
>>>> 
>>>> %o = foo() // may deallocate %a
>>>> if ptrtoint(%o) == ptrtoint(%pLen)
>>>>   load %pLen
>>>>   store %something // Oops... this might alias
>>>> else
>>>>   store %something
>>>>   load %o
>>> Just to make sure we're on the same page, it *would* be legal for the optimizer to construct:
>>> %a = newArray()
>>> %pLen = gep(%a, <length offset>)
>>> invariant.start(<size>, %pLen)
>>> %len = load %pLen
>>> 
>>> %o = foo() // may deallocate %a
>>> if ptrtoint(%o) == ptrtoint(%pLen)
>>>   store %something // Oops... this might alias
>>>   load %pLen <--- This is now after the store
>>> else
>>>   store %something
>>>   load %o
>>> 
>>> Are we in agreement here?  
>> 
>> It’s fine in the sense that memory access has not been reordered… yet.
> I think I've actually convinced myself my example is not okay, but for a completely different reason.  :)  See my last comment below.
>> 
>> But is there an implicit invariant.end(%pLen) at the call to foo(), at which point the array can be freed and another object allocated in its place? I don’t think so.
> Given the current semantics, if the result of the invariant.start is not passed to foo, we can assume foo does not contain an invariant.end.  However, for foo to allocate a new object in the same memory, it must write to that memory.  Doing so without ending the previous invariant region is ill defined.  As such, your example isn't a valid test.  
> 
> We have two cases:
> a) The invariant region hasn't ended.  As a result, the input IR is ill defined and the problematic reordering can happen.  
> b) The invariant region ended (and we either encountered a llvm.invariant.end, or saw the result of invariant.start passed to foo). In this case, the location is no longer invariant and the reordering isn't legal and wouldn't happen.
> 
> As I think came up before, this would imply that an invariant.end can't be dropped without dropping it's corresponding invariant.start.  That's problematic.  
> 
>> The way I interpreted the conclusion of this email thread is that invariant.start can be used without invariant.end.
> We had a proposal for this, but I don't believe it's actually been implemented yet.  I believe this proposal is still consistent with everything I said above though.  
>> Since the token returned by the intrinsic never escapes, you can just assume the memory is invariant at all uses. The problem here is that the optimizer could (in theory) introduce new uses of the pointer after the object lifetime. This is the same problem that you yourself have raised. I hoped we could sidestep the problem because it crops up with any representation that attaches invariance to a pointer value. If we have an answer for this, then we can easily debate other representations like broadening !invariant.load metadata semantics and introducing new intrinsics.
> I agree there is a general issue here.  However, I don't actually think it's an issue of invariance.  Instead, I see this as being a problem of *derefenceability*.  Regardless of the 'invariant-ness' of the memory in question, it's problematic for the optimizer to insert uses after a deallocation because the memory transitions from dereferenceable to non-dereferenceable.  (i.e. Consider a malloc routine allocates one page per object, and a free which protects the page freed.)  Any transformation which inserts such a load is dangerous.  
> 
> What your example really shows is that we can have two names for the same *memory address*.  One of those names can be dereferenceable     while the other is not.  
> 
> The dynamic check of the pointer values essentially allows the optimizer to chose which of the two names for the same physical address it wants to use.  I think that is valid, but it's odd to say the least.  The particular example you've shown isn't legal since the optimizer is choosing to insert a load to a non-dereferenceable location and thus could introduce a fault.

Sorry, I didn’t follow your response. I was showing that the optimizer can theoretically substitute the use of one pointer value (%o) with another pointer value (%pLen) after checking value equivalence. Doing that creates multiple uses of the same pointer value where not all uses dereference the same memory object. To me, that means we cannot associate a pointer value/address with an object-specific attribute, like invariance, without tying it to a closed lifetime. I wish we could prevent the optimizer from doing this.

-Andy

>   
>> 
>> -Andy
>> 
>>> 
>>> What is the argument that you see leading from the second to your problematic example?  I'm missing the reasoning by which you got there.  I agree that the example is problematic, I'm just not sure how it would arise.  
>>> 
>>> 
>>>> 
>>>> Either we need to make a convincing argument, or bail on most of the tentative proposals for expressing invariance, and resort to generating IR like this:
>>>> 
>>>> %a = newArray()
>>>> %pLen = gep(%a, <length offset>)
>>>> %t = invariant.start(<size>, %pLen)
>>>> %len = load %pLen
>>>> 
>>>> %o = foo() // may deallocate %a
>>>> invariant.end(%t, <size>, %pLen)
>>>> store %something
>>>> load %o
>>>> 
>>>> ...which is pretty painful to implement and may be very difficult to optimize.
>>>> 
>>>> -Andy
>>> 
>> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141202/fe9fc5a2/attachment.html>