[PATCH] D26437: Use -fno-unit-at-a-time and -funit-at-a-time
Justin Lebar via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 11 16:59:17 PST 2016
jlebar added a comment.
> This statement confuses me, that is not what I observe:
Sorry, you're right. This is what clang does in CUDA mode specifically.
You may want to do the same thing in OpenCL mode. In CUDA, if you have
void wrapper() { my_convergent_barrier_fn(); }
we treat wrapper() as convergent. This is important because otherwise you cannot meaningfully write wrappers around convergent functions (short of using __attribute__((convergent)), I guess, but that doesn't exist in CUDA; it's a clang thing). I don't know whether you want the same behavior in OpenCL.
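To make the wrapper case concrete, here is a minimal host-compilable sketch of the two ways a wrapper can end up convergent: implicitly, as described above for clang's CUDA mode, and explicitly, through the clang-specific attribute. The macro SKETCH_CONVERGENT, the counter barrier_calls, and annotated_wrapper() are names made up for illustration, and the barrier body is only a stand-in. On a host build the attribute has no observable effect; it matters only for how the optimizer may move calls on a SIMT target.

```cpp
// Use the clang-specific attribute when the compiler knows it;
// otherwise expand to nothing so the sketch still builds.
#if defined(__has_attribute)
#  if __has_attribute(convergent)
#    define SKETCH_CONVERGENT __attribute__((convergent))
#  endif
#endif
#ifndef SKETCH_CONVERGENT
#  define SKETCH_CONVERGENT
#endif

// Assumption: a counter stands in for real barrier behavior so that
// calls are observable on the host.
static int barrier_calls = 0;

// Stand-in for a convergent barrier function.
SKETCH_CONVERGENT void my_convergent_barrier_fn() { ++barrier_calls; }

// Unannotated wrapper: per the comment above, clang's CUDA mode treats
// this as convergent without any annotation.
void wrapper() { my_convergent_barrier_fn(); }

// Explicitly annotated wrapper: what a language mode without that
// implicit treatment would have to write instead.
SKETCH_CONVERGENT void annotated_wrapper() { my_convergent_barrier_fn(); }
```

Both wrappers simply forward to the barrier; the only difference is whether the convergent marking is implied by the language mode or spelled out in the source.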
> But since that is a subgroup barrier and we have SIMT we do not need to issue any instructions to make sure execution will come to the convergence point simultaneously, so barrier is omitted.
OK. What you want, then, is to call an LLVM intrinsic that is a no-op but is convergent. Then this pass will do the right thing.
Perhaps the semantics in OpenCL are that clang must insert a call to this special intrinsic at the beginning of every function that is annotated as __attribute__((convergent)).
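As a rough source-level picture of that suggestion, the sketch below shows what a function could look like after the compiler has inserted the marker call. Everything here is hypothetical: convergent_nop_marker() stands in for an intrinsic that does not exist under that name, my_subgroup_barrier() and marker_calls are illustrative, and a real intrinsic would be a declaration lowered to no instructions rather than a function with a body. The marker has a body and a counter here only so that a host build links and the call is observable.

```cpp
// Use the clang-specific attribute when the compiler knows it;
// otherwise expand to nothing so the sketch still builds.
#if defined(__has_attribute)
#  if __has_attribute(convergent)
#    define SKETCH_CONVERGENT __attribute__((convergent))
#  endif
#endif
#ifndef SKETCH_CONVERGENT
#  define SKETCH_CONVERGENT
#endif

// Assumption: counts marker calls so the insertion is visible on the host.
static int marker_calls = 0;

// Hypothetical stand-in for the proposed intrinsic: convergent, and
// intended to produce no instructions on the real target.
SKETCH_CONVERGENT void convergent_nop_marker() { ++marker_calls; }

// A subgroup barrier that needs no instructions of its own. The marker
// call at the top models what the compiler would insert, so the body
// still contains a convergent call even though the barrier is omitted.
SKETCH_CONVERGENT void my_subgroup_barrier() {
  convergent_nop_marker();  // modeled compiler-inserted call
  // No barrier instructions: execution is already converged here.
}
```

The point of the marker is that the function body keeps a convergent call that optimization passes must respect, even when the barrier itself compiles to nothing.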
Repository:
rL LLVM
https://reviews.llvm.org/D26437