[PATCH] Clarify wording in the LangRef around !invariant.load
Philip Reames
listmail at philipreames.com
Tue Dec 2 16:02:54 PST 2014
On 12/01/2014 09:47 PM, Sanjoy Das wrote:
>> I think this is completely legal for the current optimizer. Here's my
>> argument:
>> - !invariant.load describes the constant-ness of the *pointer value*. Once
>> the pointer *value* is known dereferenceable, it is always invariant.
>> - The store implies dereferenceability.
>> - The store must (by assumption) not change the value of the location.
>> (Otherwise, the invariant.load marker lied originally.)
>> - Given the store can be completely omitted, combining the two loads becomes
>> 'obviously' okay.
>
> If I understand this correctly, the above argument allows cases like
> this:
>
> int **p = calloc(sizeof(int*))
> store calloc(sizeof(int)) to p
> int *l = load(p)
> print *l // prints zero
>
> ==>
>
> int **p = calloc(sizeof(int*))
> store calloc(sizeof(int)) to p
> if (false) {
> int *s = load (p) !invariant
> }
> int *l = load(p)
> print *l // prints zero
This step is not valid. By introducing an !invariant.load where one did
not exist, you are introducing a fact about the memory that the frontend
did not provide.
In offline conversation, Sanjoy gave a clearer example. Consider a
starting point with:
int **p = calloc(sizeof(int*))
store calloc(sizeof(int)) to p
if (false) call @undef(p)
int *l = load(p)
print *l // prints zero
The problem here is that, precisely because we can't introduce the
invariant load without changing semantics, we are no longer free to pick
an arbitrary target for the call. If there's a function with an
!invariant.load from its first argument, it is not legal for us to
choose that function and then inline it. In other words, the semantics
of invariant.load as currently specified are incompatible with the
semantics of undef as currently specified.
Yuck.
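For concreteness, here is a rough sketch of that situation in LLVM IR. It
uses today's opaque-pointer syntax rather than the syntax in use when this
thread was written, and the function names @reads_invariant and @caller are
made up for illustration. Treat it as a sketch under those assumptions, not
as code from the patch.

```llvm
; Hypothetical callee: its load from the first argument is tagged
; !invariant.load, i.e. it asserts the loaded location never changes.
define void @reads_invariant(ptr %p) {
entry:
  %s = load ptr, ptr %p, !invariant.load !0
  ret void
}

declare ptr @calloc(i64, i64)

; Hypothetical caller mirroring the pseudocode above.
define i32 @caller() {
entry:
  %p = call ptr @calloc(i64 1, i64 8)   ; assumes 8-byte pointers
  %q = call ptr @calloc(i64 1, i64 4)
  store ptr %q, ptr %p                  ; changes the location %p points to
  br i1 false, label %dead, label %cont

dead:                                   ; never executed
  call void undef(ptr %p)               ; call through an undef callee
  br label %cont

cont:
  %l = load ptr, ptr %p
  %v = load i32, ptr %l                 ; reads the zeroed int
  ret i32 %v
}

!0 = !{}
```

If the optimizer resolved the undef callee to @reads_invariant and then
inlined it, the dead block would contain an !invariant.load of %p even
though the store just before it changes that location. That is the same
invalid step as in the quoted example, reached by a different route.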
To deal with this, I think we *need* to move towards an !invariant
marker on the pointer value itself, not on the load. Alternatively, we
could do something similar to the base object + offset scheme used by
the dereferenceable attribute.
Sorry for being dense on this.
I think it's fair to say that this is a semantic problem in theory, but
not, to anyone's knowledge, a problem in practice yet, right?
>
> ==>
>
> int **p = calloc(sizeof(int*))
> int *s = load (p) !invariant
> store calloc(sizeof(int)) to p
> if (false) {
> }
> int *l = load(p)
> print *l // prints zero
>
> ==>
>
> int **p = calloc(sizeof(int*))
> int *s = load (p) !invariant
> store calloc(sizeof(int)) to p
> if (false) {
> }
> int *l = load(s)
> print *l
>
> ==>
>
> int **p = calloc(sizeof(int*))
> int *s = load (p) !invariant
> store calloc(sizeof(int)) to p
> if (false) {
> }
> int *l = load(s) // UB, since it now dereferences null
> print *l
>
>
> We started from a well-defined program, and ended in UB.
>
> -- Sanjoy