[PATCH] Clarify wording in the LangRef around !invariant.load

Tue Dec 2 16:02:54 PST 2014

On 12/01/2014 09:47 PM, Sanjoy Das wrote:
>> I think this is completely legal for the current optimizer.  Here's my
>> argument:
>> - !invariant.load describes the constant-ness of the *pointer value*.  Once
>> the pointer *value* is known dereferenceable, it is always invariant.
>> - The store implies dereferenceability.
>> - The store must (by assumption) be not changing the value of the location.
>> (Otherwise, the invariant.load marker lied originally.)
>> - Given the store can be completely omitted, combining the two loads becomes
>> 'obviously' okay.
>
> If I understand this correctly, the above argument allows cases like
> this:
>
>    int **p = calloc(sizeof(int*))
>    store calloc(sizeof(int)) to p
>    int *l = load(p)
>    print *l // prints zero
>
>   ==>
>
>    int **p = calloc(sizeof(int*))
>    store calloc(sizeof(int)) to p
>    if (false) {
>      int *s = load (p) !invariant
>    }
>    int *l = load(p)
>    print *l // prints zero
This step is not valid.  By introducing an !invariant.load where one did 
not exist, you are introducing a fact about the memory that the frontend 
did not provide.

In offline conversation, Sanjoy gave a clearer example.  Consider a 
starting point with:

   int **p = calloc(sizeof(int*))
   store calloc(sizeof(int)) to p
   if (false) call @undef(p)
   int *l = load(p)
   print *l // prints zero

The problem here is that, precisely because we can't introduce the 
invariant load without changing semantics, we are no longer free to pick 
the target of the call.  If there's a function with an 
!invariant.load from its first argument, it is not legal for us to 
choose that function and then inline it.  In other words, the semantics 
of invariant.load as currently specified are incompatible with the 
semantics of undef as currently specified.
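
To make the conflict concrete, here is a rough IR sketch of that starting 
point.  The names @reads_invariant and @start are made up for 
illustration, the calloc sizes assume 8-byte pointers and a 4-byte int, 
and the syntax is the newer opaque-pointer form rather than what the 
LangRef accepted when this was written.  Treat it as an illustration of 
the argument, not as a reduced test case.

```llvm
; Hypothetical callee: its only interesting property is that it does an
; !invariant.load through its first argument.
define void @reads_invariant(ptr %p) {
  %s = load ptr, ptr %p, !invariant.load !0
  ret void
}

declare noalias ptr @calloc(i64, i64)

define i32 @start() {
entry:
  %p = call ptr @calloc(i64 1, i64 8)   ; int **p = calloc(sizeof(int*))
  %q = call ptr @calloc(i64 1, i64 4)   ; calloc(sizeof(int))
  store ptr %q, ptr %p                  ; store ... to p
  br i1 false, label %dead, label %cont

dead:
  ; Never executed.  If the optimizer resolved the undef callee to
  ; @reads_invariant and then inlined it, an !invariant.load of %p would
  ; appear in @start even though the frontend never emitted one.
  call void undef(ptr %p)
  br label %cont

cont:
  %l = load ptr, ptr %p                 ; int *l = load(p)
  %v = load i32, ptr %l                 ; *l, expected to be zero
  ret i32 %v
}

!0 = !{}
```

Once that inlined load exists, the chain of steps in the quoted example 
applies unchanged, which is why picking the callee freely is the step 
that has to be ruled out.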

Yuck.

To deal with this, I think we *need* to move towards an !invariant 
marker on the pointer value itself, not on the load.  Alternatively, we 
could do something similar to the base object + offset scheme used by 
the dereferenceable attribute.

Sorry for being dense on this.

I think it's fair to say that this is a semantic problem in theory, but 
not - to anyone's knowledge - in practice yet, right?

>
>   ==>
>
>    int **p = calloc(sizeof(int*))
>    int *s = load (p) !invariant
>    store calloc(sizeof(int)) to p
>    if (false) {
>    }
>    int *l = load(p)
>    print *l // prints zero
>
>   ==>
>
>    int **p = calloc(sizeof(int*))
>    int *s = load (p) !invariant
>    store calloc(sizeof(int)) to p
>    if (false) {
>    }
>    int *l = load(s)
>    print *l
>
>   ==>
>
>    int **p = calloc(sizeof(int*))
>    int *s = load (p) !invariant
>    store calloc(sizeof(int)) to p
>    if (false) {
>    }
>    int *l = load(s) // UB, since it now dereferences null
>    print *l
>
>
> We started from a well-defined program, and ended in UB.
>
> -- Sanjoy