[PATCH] D26348: Allow convergent attribute for function arguments

Wed Nov 9 08:04:37 PST 2016

mehdi_amini added inline comments.

================
Comment at: include/llvm/IR/Intrinsics.td:54
+// intrinsic may not be modified in a way that makes the argument value
+// control-dependent on additional values.
+class Convergent<int argNo> : IntrinsicProperty {
----------------
nhaehnle wrote:
> mehdi_amini wrote:
> > nhaehnle wrote:
> > > mehdi_amini wrote:
> > > > nhaehnle wrote:
> > > > > jlebar wrote:
> > > > > > This definition actually seems fairly clear to me, if we s/control-//.  Is there a reason we don't want to use this as our canonical definition?
> > > > > I'd be happy with it. When I posted something like this on the mailing list initially, Mehdi asked for clarification :)
> > > > My concern on the mailing was the original definition was based on "uniformity".
> > > > 
> > > > Anyway, what about this example:
> > > > 
> > > > ```
> > > > if (cond) {
> > > >   Tmp1 = ...
> > > >   Foo(Tmp1 [convergent])
> > > > } else {
> > > >   Tmp2 = ...
> > > >   Foo(Tmp2 [convergent])
> > > > }
> > > > ```
> > > > 
> > > > I think your definition does not prevent:
> > > > 
> > > > ```
> > > > if (cond) {
> > > >   Tmp1 = ...
> > > > } else {
> > > >   Tmp2 = ...
> > > > }
> > > > 
> > > > Tmp3 = phi [Tmp1, Tmp2]
> > > > 
> > > > if (cond) {
> > > >   Foo(Tmp3 [convergent])
> > > > } else {
> > > >   Foo(Tmp3 [convergent])
> > > > }
> > > > ```
> > > > 
> > > > However it breaks uniformity.
> > > That transform is allowed. It will cause some inefficiencies in the code generation, but the semantics are indeed okay.
> > > 
> > > At each call site of Foo, the argument is "dynamically uniform" using the language of the GLSL specification, i.e. all threads that actually go through the call site will see the same value (of course assuming that this was true for the original program). So maybe it's really just a matter of finding the right wording.
> > OK let's just do one extra transformation:
> > 
> > ```
> > if (cond) {
> >   Tmp1 = ...
> > } else {
> >   Tmp2 = ...
> > }
> > 
> > Tmp3 = phi [Tmp1, Tmp2]
> > Foo(Tmp3 [convergent])
> > ```
> That transformation is incorrect, and this RFC patch already disables it (in SimplifyCFG/tail sinking). A simpler example would be just:
> 
>   if (cond) {
>     Tmp1 = Foo(v [convergent])
>   } else {
>     Tmp2 = Foo(v [convergent])
>   }
>   Tmp3 = phi [Tmp1, Tmp2]
> 
> to
> 
>   Tmp3 = Foo(v [convergent])
> 
> This transform is incorrect in general because cond could be something like `v == 0`, where v is an int that can only take values 0 or 1.
> 
> Formally, the constraint formulation with runs r1 and r2 forbids this transform because you can choose r1 = { cond = true, v = A }, r2 = { cond = false, v = B }. These runs are compatible in the original program, but incompatible in the transformed program. (This is basically what happens in the linked bug report.)
> 
> The transform would be correct if the branch condition were proved to be "uncorrelated" with the convergent argument. This can sometimes be done with target-specific knowledge. This target-specific knowledge effectively restricts which pairs of runs need to be considered in the runs-based constraint formulation.
I'm addressing the definition proposed here: " The values to this argument are convergent, and calls to the intrinsic may not be modified in a way that makes the argument value control-dependent on additional values."

I don't see how any of *this* two definition forbids the transformation I mentioned.

Or the one in LangRef: "When a function argument has the ``convergent`` attribute, a call to the function may only be transformed in a way that does not add additional sources of divergence to the argument."

It is using the concept of "divergence" without defining it, so this one is more iffy, but still not clear.

https://reviews.llvm.org/D26348