[Clang] Convergent Attribute

Richard Smith via cfe-commits cfe-commits at lists.llvm.org
Fri May 6 14:53:24 PDT 2016


On Fri, May 6, 2016 at 2:42 PM, David Majnemer <david.majnemer at gmail.com>
wrote:

> On Fri, May 6, 2016 at 2:36 PM, Richard Smith via cfe-commits <
> cfe-commits at lists.llvm.org> wrote:
>
>> On Fri, May 6, 2016 at 1:56 PM, Ettore Speziale via cfe-commits <
>> cfe-commits at lists.llvm.org> wrote:
>>
>>> Hello,
>>>
>>> > In the case of foo, there could be a problem.
>>> > If you do not mark it convergent, the LLVM sink pass pushes the call to
>>> foo into the then branch of the ternary operator, hence the program has been
>>> incorrectly optimized.
>>> >
>>> > Really? It looks like the problem is that you lied to the compiler by
>>> marking the function as 'pure'. The barrier is a side-effect that cannot be
>>> removed or duplicated, so it's not correct to mark this function as pure.
>>>
>>> I was trying to write a very small example to trick LLVM and trigger the
>>> optimization. It is based on Transforms/Sink/convergent.ll:
>>>
>>> define i32 @foo(i1 %arg) {
>>> entry:
>>>   %c = call i32 @bar() readonly convergent
>>>   br i1 %arg, label %then, label %end
>>>
>>> then:
>>>   ret i32 %c
>>>
>>> end:
>>>   ret i32 0
>>> }
>>>
>>> declare i32 @bar() readonly convergent
>>>
>>
>> This example looks wrong to me. It doesn't seem meaningful for a function
>> to be both readonly and convergent, because convergent means the call has
>> some side-effect visible to other threads and readonly means the call has
>> no side-effects visible outside the function.
>>
>>> Here is another example:
>>>
>>> void foo0(void);
>>> void foo1(void);
>>>
>>> __attribute__((convergent)) void baz() {
>>>   barrier(CLK_GLOBAL_MEM_FENCE);
>>> }
>>>
>>> void bar(int x, global int *y) {
>>>   if (x < 5)
>>>     foo0();
>>>   else
>>>     foo1();
>>>
>>>   baz();
>>>
>>>   if (x < 5)
>>>     foo0();
>>>   else
>>>     foo1();
>>> }
>>>
>>
>> This one looks a lot more interesting. It looks like 'convergent' is a
>> way of informing LLVM that the call cannot be duplicated, yes? That being
>> the case, how is this attribute different from the existing
>> [[clang::noduplicate]] / __attribute__((noduplicate)) attribute?
>>
>
> I think it has more to do with LLVM's definition of convergent: that you
> really do not want control dependencies changing for a callsite.
>

Hmm, so we can't transform:

  %a = complex_pure_operation1
  %b = complex_pure_operation2
  %c = select i1 %x, i32 %a, i32 %b
  call void @foo(i32 %c) convergent

... into ...

  br i1 %x, label %aa, label %bb

aa:
  %a = complex_pure_operation1
  br label %cont

bb:
  %b = complex_pure_operation2
  br label %cont

cont:
  %c = phi i32 [ %a, %aa ], [ %b, %bb ]
  call void @foo(i32 %c) convergent

?

It looks like we added the noduplicate attribute to clang to support
OpenCL's barrier function. Did we get the semantics for it wrong for its
intended use case?


> http://llvm.org/docs/LangRef.html#function-attributes
>
>
>>
>>> Based on Transforms/JumpThreading/basic.ll:
>>>
>>> define void @h_con(i32 %p) {
>>>   %x = icmp ult i32 %p, 5
>>>   br i1 %x, label %l1, label %l2
>>>
>>> l1:
>>>   call void @j()
>>>   br label %l3
>>>
>>> l2:
>>>   call void @k()
>>>   br label %l3
>>>
>>> l3:
>>> ; CHECK: call void @g() [[CON:#[0-9]+]]
>>> ; CHECK-NOT: call void @g() [[CON]]
>>>   call void @g() convergent
>>>   %y = icmp ult i32 %p, 5
>>>   br i1 %y, label %l4, label %l5
>>>
>>> l4:
>>>   call void @j()
>>>   ret void
>>>
>>> l5:
>>>   call void @k()
>>>   ret void
>>> ; CHECK: }
>>> }
>>>
>>> If you do not mark baz convergent, you get this:
>>>
>>> clang -x cl -emit-llvm -S -o - test.c -O0 | opt -mem2reg -jump-threading -S
>>>
>>> define void @bar(i32 %x) #0 {
>>> entry:
>>>   %cmp = icmp slt i32 %x, 5
>>>   br i1 %cmp, label %if.then2, label %if.else3
>>>
>>> if.then2:                                         ; preds = %entry
>>>   call void @foo0()
>>>   call void @baz()
>>>   call void @foo0()
>>>   br label %if.end4
>>>
>>> if.else3:                                         ; preds = %entry
>>>   call void @foo1()
>>>   call void @baz()
>>>   call void @foo1()
>>>   br label %if.end4
>>>
>>> if.end4:                                          ; preds = %if.else3, %if.then2
>>>   ret void
>>> }
>>>
>>> This is illegal, as the value of x might not be the same for all
>>> work-items.
>>>
>>> I’ll update the patch so that:
>>>
>>> * it uses the jump-threading example
>>> * it marks the attribute as available in OpenCL/CUDA
>>> * it provides the [[clang::convergent]] attribute
>>>
>>> Thanks,
>>> Ettore Speziale
>>>
>>> --------------------------------------------------
>>> Ettore Speziale — Compiler Engineer
>>> speziale.ettore at gmail.com
>>> espeziale at apple.com
>>> --------------------------------------------------
>>>
>>> _______________________________________________
>>> cfe-commits mailing list
>>> cfe-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>>>
>>
>
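
For readers following the jump-threading example quoted above, here is a rough
sketch in plain C of why the threaded output is a problem. It is only a toy
model under stated assumptions, not LLVM or OpenCL code: the function names
(barrier_site_original, barrier_site_threaded, work_group_converges, demo) are
hypothetical, and "all work-items reach the same call site of baz()" is used as
a simplified stand-in for what the barrier requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not LLVM code: each function returns an id for the
   call site of baz() that a work-item with the given x executes. */

/* Original bar(): a single call to baz() between the two if/else blocks. */
static int barrier_site_original(int x) {
    (void)x;            /* every work-item reaches the same call */
    return 0;
}

/* After jump threading: baz() has been cloned into each branch. */
static int barrier_site_threaded(int x) {
    return (x < 5) ? 1 : 2;
}

/* Assumption of this sketch: the work-group is fine only if every
   work-item executes the same call site of baz(). */
static bool work_group_converges(int (*site)(int), const int *xs, size_t n) {
    for (size_t i = 1; i < n; ++i) {
        if (site(xs[i]) != site(xs[0]))
            return false;
    }
    return true;
}

/* x differs across work-items, as in the quoted example. */
static void demo(void) {
    const int xs[] = {1, 7, 3, 9};
    const size_t n = sizeof xs / sizeof xs[0];
    assert(work_group_converges(barrier_site_original, xs, n));
    assert(!work_group_converges(barrier_site_threaded, xs, n));
}
```

With a uniform x (for example, every work-item has x < 5) the threaded version
still converges in this model; it only goes wrong once x varies across the
work-group, which is the case the quoted message is worried about.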