[PATCH] D26348: Allow convergent attribute for function arguments

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 12 08:58:09 PST 2016


nhaehnle added a comment.

Thank you for taking a look!



================
Comment at: docs/LangRef.rst:1147
+    In some parallel execution models, there exist operations with one or more
+    arguments that must be uniform across threads. Such arguments are called
+    ``convergent``.
----------------
sanjoy wrote:
> (Based on what I've heard about GPUs) Should you emphasize that they're executed in "lockstep"?  Or is that not a correct statement?
It is a correct statement; whether it needs to be emphasized is anyone's guess ;-) I didn't think so, but it seems that this a bit of a point of contention, so probably something like it is required. Did you see @mehdi_amini 's alternative wording for this paragraph? Copying it here:

> GPUs (and other similar hardware) are frequently based on a SIMT (or SPMD) model, where a single hardware execution unit executes identical operations simultaneously on behalf of multiple threads of execution. Some of these operations require that one or more arguments of the operation must have the same value across all simultaneously executing threads. Such arguments are called ``convergent``.

I'd be fine with it, and I think it's fairly clear.


================
Comment at: docs/LangRef.rst:1158
+
+    For every function F, transformations must ensure that every pair of
+    executions of F that is "compatible" with respect to convergent function
----------------
sanjoy wrote:
> I think you need to be specific and explicit about how you're relating "executions" between the pre-transform and the post-transform functions.  For instance if the pre-transform function is:
> 
> ```
> void f(int* val dereferenceable(8)) {
>   int k = *val
> }
> ```
> 
> and the post-transform program is:
> 
> ```
> void f(int* val dereferenceable(8)) {
>   int k0 = *(long *)val;
>   int k = k & 0xff..ff;
> }
> ```
> 
> an execution of the pre-transform program where `k = 400` may not make sense in the context of the post-transform program since you need to produce a 8 byte value for `k0`.
> 
As far as I'd understand it, the initial state of memory is part of what defines an execution. Since the parameter is dereferenceable(8), that includes 8 bytes of memory even in the original program.


================
Comment at: docs/LangRef.rst:1175
+        if (cond) {
+          Tmp1 = Foo(v [convergent])
+        } else {
----------------
sanjoy wrote:
> What if `Foo` is a function pointer that we loaded from some place (and thus we don't have a declaration that we can inspect to know for sure if the argument is marked `convergent`)?  This rule would imply we can't do this sinking transform in that case.
In that case, we may not have a way to specify that the parameter is convergent. This may be a gap in this patch (I haven't actually checked), but not a gap that needs to be filled immediately.

This is similar to the convergent attribute of functions: if you're calling through a function pointer, you may want to add the convergent attribute to the call-site. As long as that attribute isn't present, you can assume that the called function is not convergent. The same holds for the function parameter attribute.


================
Comment at: docs/LangRef.rst:1199
+    Calling a function F that contains a call-site CS with convergent function
+    arguments is only defined behavior if the only data-dependencies of the
+    convergent function arguments of CS are convergent parameters of F. If this
----------------
sanjoy wrote:
> mehdi_amini wrote:
> > sanjoy wrote:
> > > This is a little tricky -- what if the source program is:
> > > 
> > > ```
> > > void f(int convergent i);
> > > void g(int convergent j, int /* normal */ k) {
> > >   f(j - j + k);
> > > }
> > > ```
> > > 
> > > Can we simplify the call site to call `f(k)`?  If so, did we introduce UB in functions that call `g`?
> > Isn't your source program already UB? You have a call site with an argument that is data-dependent on a non-convergent parameter, `k`.
> > 
> > The kind of annoying and SSA-breaking limitation of convergent would be:
> > 
> > ```
> > void f(int convergent i);
> > void g(int convergent j, int /* normal */ k) {
> >   if (j == k)
> >      f(j);
> > }
> > ```
> > 
> > You can't replace `f(j);` with `f(k);` IIUC.
> > 
> > 
> You're right, I misread "only data-dependencies" as "there must be at least one convergent data dependencies".
> 
> The other interesting corner case here is (assuming I did not misread this second time around) around PHI nodes and selects -- if we have:
> 
> ```
> void @other_convergent(int convergent x);
> void @f(int convergent k, int /* normal */ l) {
> entry:
>   br cond label left, label merge
> 
> left:
>   br label merge
> 
> merge:
>   val = PHI [ k, entry ], [ l, left ]
>   call @other_convergent(val);
> }
> ```
> 
> Is calling the function above well defined if `cond` is is always `false`?  What happens if we convert the phi to a select?  What happens when we convert the select to arithmetic?
> 
@mehdi_amini: This example is interesting, because the transform breaks the condition of the definition, but it is correct in the target semantics. This is very unfortunate, and I don't have an immediate answer for it.

@sanjoy's second example is similar. My first intuition would actually have been to say that in all variations, calling the function is undefined. But that's not great if in the future you want to map the semantics of higher-level languages directly to LLVM, including function calls, because there the example is perfectly fine.

Anyway, @mehdi_amini's example is the bigger problem. Forcing GVN and others to keep track of the convergent-ness is clearly a bad idea. Thinking out loud, maybe the job could be done by a function attribute that indicates that any parameter may have to be treated as convergent.


https://reviews.llvm.org/D26348





More information about the llvm-commits mailing list