[PATCH] D25343: [OpenCL] Mark group functions as convergent in opencl-c.h

Ettore Speziale via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 25 15:12:09 PDT 2016


Hello,

> As far as I understand the whole problem is that the optimized functions are marked by __attribute__((pure)). If the attribute is removed from your example, we get LLVM dump preserving correctness:
> 
> define i32 @bar(i32 %x) local_unnamed_addr #0 {
> entry:
>  %call = tail call i32 @foo() #2
>  %tobool = icmp eq i32 %x, 0
>  %.call = select i1 %tobool, i32 0, i32 %call
>  ret i32 %.call
> }

I used __attribute__((pure)) only to force LLVM to apply the transformation and to show you an example of incorrect behavior.

This is another example:

void foo();
int baz();

int bar(int x) {
  int y;
  if (x) 
    y = baz();
  foo();
  if (x) 
    y = baz();
  return y;
} 

This gets lowered to:

define i32 @bar(i32) #0 {
  %2 = icmp eq i32 %0, 0
  br i1 %2, label %3, label %4

; <label>:3                                       ; preds = %1
  tail call void (...) @foo() #2
  br label %7

; <label>:4                                       ; preds = %1
  %5 = tail call i32 (...) @baz() #2
  tail call void (...) @foo() #2
  %6 = tail call i32 (...) @baz() #2
  br label %7

; <label>:7                                       ; preds = %3, %4
  %8 = phi i32 [ %6, %4 ], [ undef, %3 ]
  ret i32 %8
}

As you can see, the call sites of foo in the optimized IR are not control-equivalent to the single call site of foo in the unoptimized IR. Now imagine that foo is implemented in another module and contains a call to a convergent function, e.g. barrier(). In that case you are going to generate incorrect code.
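
To make the point concrete, here is a rough single-threaded sketch in plain C. It is only an illustration under a few assumptions: the bodies of foo and baz and the two counters are hypothetical stand-ins (the real functions are external), bar_optimized is my hand-written C rendering of the optimized IR above, and y is initialized to 0 so that the x == 0 path returns a defined value instead of undef. The run helper and the trace struct are likewise made up for the example.

```c
/* Hypothetical stand-ins for the external foo() and baz(). The counters
   exist only so that the calls can be observed. */
static int foo_calls = 0;
static int baz_calls = 0;

static void foo(void) { foo_calls++; }
static int baz(void) { return ++baz_calls; }

/* bar() as written in the source: a single call site of foo(),
   executed regardless of the value of x. */
static int bar_source(int x) {
  int y = 0; /* assumption: initialized, unlike the original, to avoid undef */
  if (x)
    y = baz();
  foo();
  if (x)
    y = baz();
  return y;
}

/* Hand-written C equivalent of the optimized IR: the call to foo() has
   been duplicated into both arms of the branch on x. */
static int bar_optimized(int x) {
  if (x) {
    baz();
    foo();
    return baz();
  }
  foo();
  return 0;
}

/* Observable behavior of one invocation. */
struct trace {
  int result;
  int foo_calls;
  int baz_calls;
};

/* Reset the counters, run f(x), and record what happened. */
static struct trace run(int (*f)(int), int x) {
  struct trace t;
  foo_calls = 0;
  baz_calls = 0;
  t.result = f(x);
  t.foo_calls = foo_calls;
  t.baz_calls = baz_calls;
  return t;
}
```

Comparing run(bar_source, x) with run(bar_optimized, x) gives the same trace for any x, which is why the transformation looks legal for a single thread of execution. What the sketch cannot show is the multi-work-item case: in bar_optimized, work-items with different values of x reach different call sites of foo, and that is exactly what breaks if foo ends up calling barrier().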

Bye

--------------------------------------------------
Ettore Speziale — Compiler Engineer
speziale.ettore at gmail.com
espeziale at apple.com
--------------------------------------------------


