[PATCH] D20116: Add speculatable function attribute

Mon Mar 27 19:05:16 PDT 2017

dberlin added a comment.

In https://reviews.llvm.org/D20116#711696, @sanjoy wrote:

> In https://reviews.llvm.org/D20116#710937, @dberlin wrote:
>
> > > I think that all of this is right, you can't apply some of these optimizations to call sites with the speculatable attribute.
> >
> > A lot of these (but not all of these) amount to "you cannot clone speculatable", ie if you clone the call, you must remove the attribute.
> >
> > I believe the set of conditions under which you could clone the attribute are:
> >
> > 1. The new call is CDEQ the original call (IE the set of conditions under which it executes is identical).  IF you are cloning from one function to another, it must be CDEQ using the interprocedural control dependence.
> > 2. The arguments are identical.
> > 3. The function called is identical or marked speculatable
> >
> >   Note this is not entirely shocking,.
>
>
> (Caveat: I think we're arguing some subtle points here, so apologies if I misunderstood your intent.)
>
> I think (1) //is// somewhat shocking.  I'd say normally we follow a weaker constraint:  "The new call is executed only if the old call was executed", not "The new call is executed if and only [edit: previously this incorrectly said "only if and only if"] if the old call was executed".
>
> For instance, if we unroll loops (say by a factor of 2):
>
>   for (i = 0; i < N; i++)
>     f(i) readonly;      // X
>
>
> it is reasonable that the result be:
>
>   for (i = 0; i < N / 2; i += 2) {
>     f(i) readonly;       // A
>     f(i + 1) readonly;   // B
>   }

I'm presuming you meant this loop  to go half as much but still cover the same values in the same order in calls to f?
(it doesn't)

If so, agree we allow it in practice, butt hat's in practice
Here is an, IMHO,  valid callsite attribute readonly implementation of f, not legally doable in C, but we're talking about llvm IR anyway.

  void f(int a)
  {
    // go scrobbling through instruction stream to see what the increment is
    if (increment == 2)
      write some memory
    otherwise
      do nothing
  }

Now, obviously, i'm cheating, but the point is, f can read whatever state it wants here, and detect what you've done, and not be readonly in that case.
Again, if i'm understanding things right (the langref really is somewhat confusing to me here, but i assume it's legal to have not be readonly except for that call. That's what it seems like we are talking about here)

Given they can also unwind, and read state of  other functions, i'm pretty positive you can come up with implementations easier than what i just did to detect and write memory in the two cases differently, without resorting to this trickery.

If you meant to make the loop iteration space change, i disagree with you that it's allowed even harder :)

> Assuming I correctly understood what you meant by "the set of conditions under which it executes is identical", we won't be able to keep any of the `readonly` attributes.
> 
> - `X` is executed for all `i` < `N`.
> - `A` is executed for all even `i` if `N` > `1`
> - `B` is executed for all odd `i` if `N` > `1`
> - `C` is executed if `N` is `1`.
> 
>   Of course, in the program //trace// the instances in which `f` was executed stay the same in the pre-unroll and post-unroll program; and that's related to what I was trying to say earlier:  with the `speculatable` attribute, the behavior of a program is no longer a property of its trace -- there could be a "hidden" speculatable call somewhere that influences the behavior of the program without actually being executed (in the initial program), by getting hoisted into the executable bits of the program.  In order to mark these programs as "buggy", we will have to rule such "hidden" speculatable calls (that nevertheless have side effects) as tainting the program with UB, despite not being present in the program trace.

Again, this is literally what it means to play with control dependence, so i *don't* find it shocking.

> 
> 
>> the "no cloning" is true of other attributes (you can't clone and apply readonly like is done in the devirt example above)
> 
> Why can't you apply `readonly` to the devirtualization example?

Sorry, meant to delete it. you can because it's dead code, otherwise, you can't.

>> it's just that, being an attribute about control dependence, the effects relate to control dependence.
> 
> Yes -- IMO that's the key problem with speculatable.  Since it says "there is no control dependence", we cannot apply it in a control dependent manner.

Sure, i'd agree with that, but you really cannot work around that.

> 
> 
>>> I agree, however, that we need to think carefully about how to define what speculatable means on an individual call site. Perhaps they're like convergent functions in this regard: you can't introduce new control dependencies (at least not in general).
>> 
>> Definitely true.
>> 
>> Either the CDEQ set of the call must not change , or you must be able to prove that changes cannot impact the function (IE you don't make it any less dead, etc).
> 
> What exactly do you mean by "the CDEQ set"?  I could not find anything easily on Google. [edit: by CDEQ did you mean the cdequiv (control dependence equivalent) set?  If so I think I know what you mean.]

yes, cdequiv, sorry.  it's called CDEQ by some papers, and cdequiv by the other.

It's the equivalence classes of the control dependence graph, basically.  The set of blocks/statements/etc (depends on what level you look) that execute only under the same conditions.

> Btw, is "make it any less dead" == "speculatively execute it", or something more subtle?

yes, that's what we'd call it. but it's not executed in any case, so i guess it'd be "speculatively hoist it".
:)

> 
> 
>> 
>> 
>>   const int foo = bar = fred = 0;
>>   if (foo)
>>   if (bar)
>>   if (fred)
>>      call baz() speculatable
>> 
>> 
>> You can prove hoisting into if (foo) cannot make it any less dead.

https://reviews.llvm.org/D20116