[PATCH] D20116: Add speculatable function attribute

Mon Mar 27 21:52:04 PDT 2017

sanjoy added a comment.

In https://reviews.llvm.org/D20116#711778, @dberlin wrote:

> In https://reviews.llvm.org/D20116#711696, @sanjoy wrote:
>
> > In https://reviews.llvm.org/D20116#710937, @dberlin wrote:
> >
> > > > I think that all of this is right, you can't apply some of these optimizations to call sites with the speculatable attribute.
> > >
> > > A lot of these (but not all of these) amount to "you cannot clone speculatable", ie if you clone the call, you must remove the attribute.
> > >
> > > I believe the set of conditions under which you could clone the attribute are:
> > >
> > > 1. The new call is CDEQ the original call (IE the set of conditions under which it executes is identical).  IF you are cloning from one function to another, it must be CDEQ using the interprocedural control dependence.
> > > 2. The arguments are identical.
> > > 3. The function called is identical or marked speculatable
> > >
> > >   Note this is not entirely shocking,.
> >
> >
> > (Caveat: I think we're arguing some subtle points here, so apologies if I misunderstood your intent.)
> >
> > I think (1) //is// somewhat shocking.  I'd say normally we follow a weaker constraint:  "The new call is executed only if the old call was executed", not "The new call is executed if and only [edit: previously this incorrectly said "only if and only if"] if the old call was executed".
> >
> > For instance, if we unroll loops (say by a factor of 2):
> >
> >   for (i = 0; i < N; i++)
> >     f(i) readonly;      // X
> >
> >
> > it is reasonable that the result be:
> >
> >   for (i = 0; i < N / 2; i += 2) {
> >     f(i) readonly;       // A
> >     f(i + 1) readonly;   // B
> >   }
>
>
> I'm presuming you meant this loop  to go half as much but still cover the same values in the same order in calls to f?
>  (it doesn't)

Do'h.  I did a split-brain there. :)

> If so, agree we allow it in practice, butt hat's in practice
>  Here is an, IMHO,  valid callsite attribute readonly implementation of f, not legally doable in C, but we're talking about llvm IR anyway.
> 
>   void f(int a)
>   {
>     // go scrobbling through instruction stream to see what the increment is
>     if (increment == 2)
>       write some memory
>     otherwise
>       do nothing
>   }
> 
> 
> Now, obviously, i'm cheating, but the point is, f can read whatever state it wants here, and detect what you've done, and not be readonly in that case.

I don't think we can reasonably allow functions to base their behavior on the instruction stream of their callers (or anywhere else, for that matter), in IR or in C.

For instance, if we allowed things like that, even basic optimizations like this:

  void f() {
    int a = 2, b = 3;
    f(a + b);
  }

to

  void f() {
    f(5);
  }

would not be valid, since an implementation of `f` could be:

  void f() {
    cond = does the caller's instruction stream have an add instruction?
    if (cond)
      print("hi");
  }

and we'd change observable behavior by optimizing out the add instruction.

> Again, if i'm understanding things right (the langref really is somewhat confusing to me here, but i assume it's legal to have not be readonly except for that call. That's what it seems like we are talking about here)

I think code like this is fine:

  void f(bool do_store, int* ptr) {
    if (do_store)
      *ptr = 42;
  }

  ...
  f(false, ptr) readnone
  ...

The example you gave above seems odd to me because the callee is has different behavior based on the instruction stream of the caller, but the conditionally-readonly aspect of it is fine.

> Given they can also unwind, and read state of  other functions, i'm pretty positive you can come up with implementations easier than what i just did to detect and write memory in the two cases differently, without resorting to this trickery.

If there was a legal way to detect a difference between the two cases above, then loop unrolling would be illegal.  So even if there is some way to detect the difference, I'd consider that a bug in the LLVM IR semantics since it disallows an important optimization.

> If you meant to make the loop iteration space change, i disagree with you that it's allowed even harder :)

Yes, we can't transform the iteration space, since if instead of executing `f(0)` first, we execute `f(N - 1)` first; we'd break an `f` that was implemented as:

  void f(int i) {
    // I don't think this is possible to do without writing memory in C++,
    // but it may be possible in other languages.
    throw i;
  }

in which case the pre-transformed program would throw `0`, but the post-transform program would throw `N - 1`.

>> Assuming I correctly understood what you meant by "the set of conditions under which it executes is identical", we won't be able to keep any of the `readonly` attributes.
>> 
>> - `X` is executed for all `i` < `N`.
>> - `A` is executed for all even `i` if `N` > `1`
>> - `B` is executed for all odd `i` if `N` > `1`
>> - `C` is executed if `N` is `1`.
>> 
>>   Of course, in the program //trace// the instances in which `f` was executed stay the same in the pre-unroll and post-unroll program; and that's related to what I was trying to say earlier:  with the `speculatable` attribute, the behavior of a program is no longer a property of its trace -- there could be a "hidden" speculatable call somewhere that influences the behavior of the program without actually being executed (in the initial program), by getting hoisted into the executable bits of the program.  In order to mark these programs as "buggy", we will have to rule such "hidden" speculatable calls (that nevertheless have side effects) as tainting the program with UB, despite not being present in the program trace.
> 
> Again, this is literally what it means to play with control dependence, so i *don't* find it shocking.

I'm not entirely sure what you mean here -- did you mean that it is okay to have stuff outside the program trace affect program definedness?  Or did you mean the converse?

>>> the "no cloning" is true of other attributes (you can't clone and apply readonly like is done in the devirt example above)
>> 
>> Why can't you apply `readonly` to the devirtualization example?
> 
> Sorry, meant to delete it. you can because it's dead code, otherwise, you can't.
> 
>>> it's just that, being an attribute about control dependence, the effects relate to control dependence.
>> 
>> Yes -- IMO that's the key problem with speculatable.  Since it says "there is no control dependence", we cannot apply it in a control dependent manner.
> 
> Sure, i'd agree with that, but you really cannot work around that.

Yes!

https://reviews.llvm.org/D20116