[LLVMdev] Optimization hints for "constant" loads

Tue Oct 21 16:03:33 PDT 2014

On 10/21/2014 01:44 PM, Andrew Trick wrote:
>> On Oct 21, 2014, at 11:46 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>>
>> Thank you for the explanation, I think I misunderstood your solution
>> initially.  Is it accurate to say: by making the definition of the
>> source pointer of an !invariant load control (or data) dependent on
>> some initialization event, you can effectively separate the
>> initialization event (where the location is not invariant) and the
>> rest of the execution (where the location is invariant).
>>
>> This approach looks similar to Ruby's "freeze" method:
>> http://ruby-doc.org/core-2.1.3/Object.html#method-i-freeze
>>
>> What is the aliasing relationship between a non-invariant load from
>> the result of a safe_cast and a load from (or store to) the argument
>> we pass to it?  Is it sensible to make them MustAlias?  I'm trying to
>> think if a frontend needs to be careful about using the return value
>> of check_cast only for !invariant loads (or some other restriction),
>> or is it legal (or desirable) to mix and match both kinds of loads
>> from both kinds of pointers.
> %p1 = ...
> store %v0, %p1
> %p2 = llvm.safe_cast %p1
> store %v1, %p1
> store %v2, %p2
> %v3 = load %p2, !invariant
>
> The load can be moved above both stores (%v1 and %v2). AFAIK, the invariant metadata tag essentially tells alias analysis not to even try, just assume no aliasing. So the semantics effectively guarantee that %v0==%v1==%v3==%v3. The IR is still “legal” in that it is verifiable, but it is nonsense.
>
> Note that it is also totally valid for an alias analysis to return that they MustAlias. That would allow forwarding from store %v0 to the load. This would have to be implemented as a special case, since the invariant tag normally bypasses pointer analysis. I admit this is weird, but I’m don’t know of any problems it will cause. This can already happen with TBAA, and will be a problem with any metadata that can be abused by the frontend or language design. A difficultly with this approach is that the frontend has the burden for emitting guards anywhere the “invariant” data could be modified.
>
> How do you imagine mixing and matching invariant and non-invariant? I showed some examples of doing that safely, but I’m not sure how it’s relevant to you.
>
> -Andy
Ok, I'm catching up on this thread.  Let me summarize a couple of things 
I noticed while reading.

Sanjoy made a good point.  We don't actually need a new variant of 
"invariant.start".  Simply using an invariant.start with no uses gives 
us a notion of an invariant region with no end.  (Since the result 
doesn't escape, there can be no end hidden inside a function call.)   
This seems like a simple notion to exploit and is strict step forward 
from where we are, without a new intrinsic.

The notions of "invariant over some range" and "safe to speculate over 
some range" are related, but distinct.  However, to get the most power 
out of the former, you often need the later.

Andy seems to be arguing in favour of data flow mostly because it's 
easier for the optimizer.  Unless I'm misreading, there's nothing 
preventing us from using control flow for the same patterns.  For example:
p1 = ...
*p1 = 5
p2 = llvm.safe_cast p1
p3 = select is_constant p2, p1
load p3

Is isomorphic with:
p1 = ...
*p1 = 5
if is_constant:
   invariant.start(p1)
load p1

Andy, is this correct?  Or am I missing something?

(p.s. You guys seem to be throwing around the terms "!invariant" and 
"invariant load" without clearly defining them.  I'm unsure if you mean 
the same things.)

I suspect this is going to be a topic we should discuss in person at the 
developers conference or over a beer at a social.  It's a bit 
complicated to keep track of in email.

My belief is that a path sensitive invariant indicator for a memory 
location is probably sufficient for most use cases.  In particular, the 
initialization and use logic is usually not in the same compilation for 
the cases I care about.  I care that such cases are *possible*, but I'm 
less concerned about them being *easy*.  I'm also having real trouble 
find examples of cases where Andy's "conditionally invariant data" is 
useful.  Given that we *have an existing* mechanism which only needs a 
slightly tweak for "endless" invariant regions, I'm tempted to stick 
with what we have.

The only change (clarification?) I'd make to the current scheme is to 
specify that a location is both invariant *and dereferenceable* at the 
point it is listed in an invariant.start.   This enables speculative 
loading up to that construct.  I think this is actually what the current 
intrinsic is trying for from the documentation.

(Hm.. it does make it hard to rearrange calls to invariant.start 
though...  Thoughts?)

I suspect that for the Array.length case, I'd probably do something like 
this:
arr = function_that_returns_array();
invariant.start(arr.length)
... arbitrary control flow ...
length = arr.length
... arbitrary control flow ...
length2 = arr.length

My reasoning here is that if something returns an array (in this 
compilation) I want the invariant fact to be visible.  If that factory 
function gets inlined, I'd end up with:
arr = new Array;
arr.length = X;
invariant.start(arr.length)
... arbitrary control flow ...
length = arr.length
... arbitrary control flow ...
length2 = arr.length

Both cases seem nearly ideal here.  I know I can lift the loads as far 
as the function which returns the base pointer.  I know I can value 
forward loads (either with or without lifting).  I know I can value 
forward the store to the load (as I normally would).

In terms of aliasing, it seems like having invariant.start read from 
it's arguments memory is sufficient.  That prevents reorderings of 
stores to that location before the address to after the 
invariant.start.  Due to conservative aliasing, that might be somewhat 
harmful, but I don't really see a way to avoid this...  (Or is this the 
point Andy's trying to solve with the data flow?)  I don't have a sense 
for how painful this is.  Try it and see?

I could see a useful extension for a function attribute (type?, TBAA?) 
for describing the return value's invariant fields (*at the point of the 
return*), but that is more minor goodness for IR size then clearly 
necessary.

Philip