[LLVMdev] Optimization hints for "constant" loads

Mon Dec 1 15:22:35 PST 2014

On 12/01/2014 02:42 PM, Andrew Trick wrote:
>
>> On Dec 1, 2014, at 2:21 PM, Philip Reames <listmail at philipreames.com 
>> <mailto:listmail at philipreames.com>> wrote:
>>
>>
>> On 12/01/2014 11:14 AM, Andrew Trick wrote:
>>>
>>>> On Oct 21, 2014, at 4:03 PM, Philip Reames 
>>>> <listmail at philipreames.com <mailto:listmail at philipreames.com>> wrote:
>>>>
>>>> Sanjoy made a good point.  We don't actually need a new variant of 
>>>> "invariant.start".  Simply using an invariant.start with no uses 
>>>> gives us a notion of an invariant region with no end.  (Since the 
>>>> result doesn't escape, there can be no end hidden inside a function 
>>>> call.)   This seems like a simple notion to exploit and is strict 
>>>> step forward from where we are, without a new intrinsic.
>>>
>>> I'm coming back to this because it is a sticking point for me in all 
>>> of the proposals that were floated informally on the commits list. I 
>>> would really like this to work, but can't prove that it's safe.
>> As I said in the other thread, !invariant.load and llvm.invariant.* 
>> are not the same.  This thread is discussing the later, not the 
>> former.  The other thread is discussing the former, but not the later.
>>>
>>> Given:
>>>
>>> %a = newArray()
>>> %pLen = gep(%a, <length offset>)
>>> invariant.start(<size>, %pLen)
>>> %len = load %pLen
>>>
>>> %o = foo() // may deallocate %a
>>> store %something
>>> load %o
>>>
>>> I want to claim that the LLVM optimizer cannot do the following (but 
>>> don't have a complete line of reasoning yet):
>>>
>>> %a = newArray()
>>> %pLen = gep(%a, <length offset>)
>>> invariant.start(<size>, %pLen)
>>> %len = load %pLen
>>>
>>> %o = foo() // may deallocate %a
>>> if ptrtoint(%o) == ptrtoint(%pLen)
>>>   load %pLen
>>>   store %something // Oops... this might alias
>>> else
>>>   store %something
>>>   load %o
>> Just to make sure we're on the same page, it *would* be legal for the 
>> optimizer to construct:
>> %a = newArray()
>> %pLen = gep(%a, <length offset>)
>> invariant.start(<size>, %pLen)
>> %len = load %pLen
>>
>> %o = foo() // may deallocate %a
>> if ptrtoint(%o) == ptrtoint(%pLen)
>>   store %something // Oops... this might alias
>>   load %pLen <--- This is now after the store
>> else
>>   store %something
>>   load %o
>>
>> Are we in agreement here?
>
> It’s fine in the sense that memory access has not been reordered… yet.
I think I've actually convinced myself my example is not okay, but for a 
completely different reason.  :)  See my last comment below.
>
> But is there an implicit invariant.end(%pLen) at the call to foo(), at 
> which point the array can be freed and another object allocated in its 
> place? I don’t think so.
Given the current semantics, if the result of the invariant.start is not 
passed to foo, we can assume foo does not contain an invariant.end.  
However, for foo to allocate a new object in the same memory, it must 
write to that memory.  Doing so without ending the previous invariant 
region is ill defined.  As such, your example isn't a valid test.

We have two cases:
a) The invariant region hasn't ended.  As a result, the input IR is ill 
defined and the problematic reordering can happen.
b) The invariant region ended (and we either encountered a 
llvm.invariant.end, or saw the result of invariant.start passed to foo). 
In this case, the location is no longer invariant and the reordering 
isn't legal and wouldn't happen.

As I think came up before, this would imply that an invariant.end can't 
be dropped without dropping it's corresponding invariant.start.  That's 
problematic.

> The way I interpreted the conclusion of this email thread is that 
> invariant.start can be used without invariant.end.
We had a proposal for this, but I don't believe it's actually been 
implemented yet.  I believe this proposal is still consistent with 
everything I said above though.
> Since the token returned by the intrinsic never escapes, you can just 
> assume the memory is invariant at all uses. The problem here is that 
> the optimizer could (in theory) introduce new uses of the pointer 
> after the object lifetime. This is the same problem that you yourself 
> have raised. I hoped we could sidestep the problem because it crops up 
> with any representation that attaches invariance to a pointer value. 
> If we have an answer for this, then we can easily debate other 
> representations like broadening !invariant.load metadata semantics and 
> introducing new intrinsics.
I agree there is a general issue here.  However, I don't actually think 
it's an issue of invariance.  Instead, I see this as being a problem of 
*derefenceability*.  Regardless of the 'invariant-ness' of the memory in 
question, it's problematic for the optimizer to insert uses after a 
deallocation because the memory transitions from dereferenceable to 
non-dereferenceable.  (i.e. Consider a malloc routine allocates one page 
per object, and a free which protects the page freed.)  Any 
transformation which inserts such a load is dangerous.

What your example really shows is that we can have two names for the 
same *memory address*.  One of those names can be dereferenceable while 
the other is not.

The dynamic check of the pointer values essentially allows the optimizer 
to chose which of the two names for the same physical address it wants 
to use.  I think that is valid, but it's odd to say the least.  The 
particular example you've shown isn't legal since the optimizer is 
choosing to insert a load to a non-dereferenceable location and thus 
could introduce a fault.
>
> -Andy
>
>>
>> What is the argument that you see leading from the second to your 
>> problematic example?  I'm missing the reasoning by which you got 
>> there.  I agree that the example is problematic, I'm just not sure 
>> how it would arise.
>>
>>
>>>
>>> Either we need to make a convincing argument, or bail on most of the 
>>> tentative proposals for expressing invariance, and resort to 
>>> generating IR like this:
>>>
>>> %a = newArray()
>>> %pLen = gep(%a, <length offset>)
>>> %t = invariant.start(<size>, %pLen)
>>> %len = load %pLen
>>>
>>> %o = foo() // may deallocate %a
>>> invariant.end(%t, <size>, %pLen)
>>> store %something
>>> load %o
>>>
>>> ...which is pretty painful to implement and may be very difficult to 
>>> optimize.
>>>
>>> -Andy
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141201/03ec5c74/attachment.html>