[Clang] Convergent Attribute

Richard Smith via cfe-commits cfe-commits at lists.llvm.org
Fri May 6 14:36:32 PDT 2016


On Fri, May 6, 2016 at 1:56 PM, Ettore Speziale via cfe-commits <
cfe-commits at lists.llvm.org> wrote:

> Hello,
>
> > In the case of foo, there could be a problem.
> > If you do not mark it convergent, the LLVM sink pass pushes the call to
> > foo to the then branch of the ternary operator, hence the program has
> > been incorrectly optimized.
> >
> > Really? It looks like the problem is that you lied to the compiler by
> > marking the function as 'pure'. The barrier is a side-effect that cannot
> > be removed or duplicated, so it's not correct to mark this function as
> > pure.
>
> I was trying to write a very small example to trick LLVM and trigger the
> optimization. It is based on Transforms/Sink/convergent.ll:
>
> define i32 @foo(i1 %arg) {
> entry:
>   %c = call i32 @bar() readonly convergent
>   br i1 %arg, label %then, label %end
>
> then:
>   ret i32 %c
>
> end:
>   ret i32 0
> }
>
> declare i32 @bar() readonly convergent
>

This example looks wrong to me. It doesn't seem meaningful for a function
to be both readonly and convergent, because convergent means the call has
some side-effect visible to other threads and readonly means the call has
no side-effects visible outside the function.

> Here is another example:
>
> void foo0(void);
> void foo1(void);
>
> __attribute__((convergent)) void baz() {
>   barrier(CLK_GLOBAL_MEM_FENCE);
> }
>
> void bar(int x, global int *y) {
>   if (x < 5)
>     foo0();
>   else
>     foo1();
>
>   baz();
>
>   if (x < 5)
>     foo0();
>   else
>     foo1();
> }
>

This one looks a lot more interesting. It looks like 'convergent' is a way
of informing LLVM that the call cannot be duplicated, yes? That being the
case, how is this attribute different from the existing
[[clang::noduplicate]] / __attribute__((noduplicate)) attribute?
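
To make the question concrete, here is a minimal host-side C sketch of the bar/baz example from this thread, next to a hand-written version of what jump threading would produce. It assumes plain C stand-ins for the OpenCL functions: the counters, the CONVERGENT and NODUPLICATE macros, reset_counts, baz_nodup and bar_threaded are all hypothetical names introduced only for illustration, and the attributes are applied only if the compiler reports support for them through __has_attribute. As far as the LLVM LangRef describes them, noduplicate forbids duplicating a call, while convergent forbids making a call control-dependent on additional values; the sketch does not prove either property, it only shows the two program shapes being compared.

```c
#include <assert.h>

/* Fall back gracefully on compilers without __has_attribute. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(convergent)
#define CONVERGENT __attribute__((convergent))
#else
#define CONVERGENT /* attribute not available: expands to nothing */
#endif

#if __has_attribute(noduplicate)
#define NODUPLICATE __attribute__((noduplicate))
#else
#define NODUPLICATE /* attribute not available: expands to nothing */
#endif

/* Hypothetical counters standing in for observable effects. */
int calls_foo0, calls_foo1, calls_baz;

void reset_counts(void) { calls_foo0 = calls_foo1 = calls_baz = 0; }

void foo0(void) { ++calls_foo0; }
void foo1(void) { ++calls_foo1; }

/* Stand-in for the barrier wrapper; on a GPU this would call barrier(). */
CONVERGENT void baz(void) { ++calls_baz; }

/* Same body, marked with the existing attribute for comparison. */
NODUPLICATE void baz_nodup(void) { ++calls_baz; }

/* Original shape: a single call site for baz between two branches. */
void bar(int x) {
  if (x < 5) foo0(); else foo1();
  baz();
  if (x < 5) foo0(); else foo1();
}

/* Hand-written equivalent of the jump-threaded output: the call to baz
   is duplicated and now sits under the x < 5 condition. */
void bar_threaded(int x) {
  assert(x == x); /* no precondition on x; kept to show both paths are total */
  if (x < 5) {
    foo0();
    baz();
    foo0();
  } else {
    foo1();
    baz();
    foo1();
  }
}
```

On a single host thread, bar and bar_threaded make the same calls in the same order for any x, which is why the optimizer considers the transformation legal unless an attribute tells it otherwise. The difference only matters when several work-items with different values of x must reach the same barrier call.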

> Based on Transforms/JumpThreading/basic.ll:
>
> define void @h_con(i32 %p) {
>   %x = icmp ult i32 %p, 5
>   br i1 %x, label %l1, label %l2
>
> l1:
>   call void @j()
>   br label %l3
>
> l2:
>   call void @k()
>   br label %l3
>
> l3:
> ; CHECK: call void @g() [[CON:#[0-9]+]]
> ; CHECK-NOT: call void @g() [[CON]]
>   call void @g() convergent
>   %y = icmp ult i32 %p, 5
>   br i1 %y, label %l4, label %l5
>
> l4:
>   call void @j()
>   ret void
>
> l5:
>   call void @k()
>   ret void
> ; CHECK: }
> }
>
> If you do not mark baz convergent, you get this:
>
> clang -x cl -emit-llvm -S -o - test.c -O0 | opt -mem2reg -jump-threading -S
>
> define void @bar(i32 %x) #0 {
> entry:
>   %cmp = icmp slt i32 %x, 5
>   br i1 %cmp, label %if.then2, label %if.else3
>
> if.then2:                                         ; preds = %entry
>   call void @foo0()
>   call void @baz()
>   call void @foo0()
>   br label %if.end4
>
> if.else3:                                         ; preds = %entry
>   call void @foo1()
>   call void @baz()
>   call void @foo1()
>   br label %if.end4
>
> if.end4:                                          ; preds = %if.else3, %if.then2
>   ret void
> }
>
> Which is illegal, as the value of x might not be the same for all
> work-items.
>
> I’ll update the patch so that:
>
> * it uses the example about jump-threading
> * it marks the attribute available in OpenCL/CUDA
> * it provides the [[clang::convergent]] attribute
>
> Thanks,
> Ettore Speziale
>
> --------------------------------------------------
> Ettore Speziale — Compiler Engineer
> speziale.ettore at gmail.com
> espeziale at apple.com
> --------------------------------------------------