[PATCH] Clarify wording in the LangRef around !invariant.load

Tue Dec 2 16:23:15 PST 2014

> On Dec 2, 2014, at 4:02 PM, Philip Reames <listmail at philipreames.com> wrote:
> 
> 
> On 12/01/2014 09:47 PM, Sanjoy Das wrote:
>>> I think this is completely legal for the current optimizer.  Here's my
>>> argument:
>>> - !invariant.load describes the constant-ness of the *pointer value*.  Once
>>> the pointer *value* is known dereferenceable, it is always invariant.
>>> - The store implies dereferenceability.
>>> - The store must (by assumption) be not changing the value of the location.
>>> (Otherwise, the invariant.load marker lied originally.)
>>> - Given the store can be completely omitted, combining the two loads becomes
>>> 'obviously' okay.
>> 
>> If I understand this correctly, the above argument allows cases like
>> this:
>> 
>>   int **p = calloc(sizeof(int*))
>>   store calloc(sizeof(int)) to p
>>   int *l = load(p)
>>   print *l // prints zero
>> 
>>  ==>
>> 
>>   int **p = calloc(sizeof(int*))
>>   store calloc(sizeof(int)) to p
>>   if (false) {
>>     int *s = load (p) !invariant
>>   }
>>   int *l = load(p)
>>   print *l // prints zero
> This step is not valid.  By introducing an !invariant.load where one did not exist, you are introducing a fact about the memory that the frontend did not provide.
> 
> In offline conversation, Sanjoy gave a clearer example.  Consider a starting point with:
> 
>  int **p = calloc(sizeof(int*))
>  store calloc(sizeof(int)) to p
>  if (false) call @undef(p)
>  int *l = load(p)
>  print *l // prints zero
> 
> The problem here is that, precisely because we can't introduce the invariant load without changing semantics, we are no longer free to pick the target of the call.  If there's a function with an !invariant.load from its first argument, it is not legal for us to choose that function and then inline it.  In other words, the semantics of invariant.load as currently specified are incompatible with the semantics of undef as currently specified.
> 
> Yuck.
> 
> To deal with this, I think we *need* to move towards an !invariant marker on the pointer value itself, not on the load.  Alternatively, we could do something similar to the base object + offset scheme used by the dereferenceable attribute.
> 
> Sorry for being dense on this.
> 
> I think it's fair to say that this is a semantic problem in theory but not, to anyone's knowledge, in practice yet, right?
> 
>> 
>>  ==>
>> 
>>   int **p = calloc(sizeof(int*))
>>   int *s = load (p) !invariant
>>   store calloc(sizeof(int)) to p
>>   if (false) {
>>   }
>>   int *l = load(p)
>>   print *l // prints zero
>> 
>>  ==>
>> 
>>   int **p = calloc(sizeof(int*))
>>   int *s = load (p) !invariant
>>   store calloc(sizeof(int)) to p
>>   if (false) {
>>   }
>>   int *l = load(s)
>>   print *l

Everything is fine until this step. Is there any reason to believe the optimizer would ever forward the invariant load to the non-invariant load? I don’t think there’s much danger of that happening accidentally through oversight. We may wish to explicitly define the semantics of !invariant such that it is allowed (as I was hoping we could do before thinking it through), but then we’d be the ones creating the problem. I think the semantics for metadata naturally imply path-sensitivity. That would be perfectly consistent with load !range metadata. So if there are no uses of the load, the metadata is irrelevant. I realize it’s confusing and possibly not useful for anything but global constants, but it is really handy for that use case.
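
To make the global-constant case and the !range comparison concrete, here is a rough sketch. It is written in the current textual IR syntax (opaque `ptr`, `!{}` nodes), which differs from the syntax in use when this thread was written, and the names @g, @opaque, @f and @h are made up for illustration only.

```llvm
; Hypothetical names throughout; a sketch, not taken from any real test case.
@g = external constant i32

declare void @opaque()

define i32 @f() {
entry:
  ; Both loads carry !invariant.load, so each load's result can be treated
  ; as unaffected by the intervening call.
  %a = load i32, ptr @g, !invariant.load !0
  call void @opaque()
  %b = load i32, ptr @g, !invariant.load !0
  %r = add i32 %a, %b
  ret i32 %r
}

define i8 @h(ptr %q) {
entry:
  ; !range is a claim about the value produced by this particular load
  ; (here: 0 or 1), not about every load from %q.
  %x = load i8, ptr %q, !range !1
  ret i8 %x
}

!0 = !{}
!1 = !{i8 0, i8 2}
```

Under the reading above, the metadata in both functions only says something about the annotated loads themselves. A plain load from the same address elsewhere would not pick up either fact.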

-Andy

>> 
>>  ==>
>> 
>>   int **p = calloc(sizeof(int*))
>>   int *s = load (p) !invariant
>>   store calloc(sizeof(int)) to p
>>   if (false) {
>>   }
>>   int *l = load(s) // UB, since it now dereferences null
>>   print *l
>> 
>> 
>> We started from a well-defined program, and ended in UB.
>> 
>> -- Sanjoy
>