[PATCH] D38134: [OpenCL] Emit enqueued block as kernel

Yaxun Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Oct 6 08:43:04 PDT 2017


yaxunl marked 6 inline comments as done.
yaxunl added inline comments.


================
Comment at: lib/CodeGen/CGOpenCLRuntime.cpp:144
+  if (auto *I = dyn_cast<llvm::Instruction>(V)) {
+    // If the block literal is emitted as an instruction, it is an alloca
+    // and the block invoke function is stored to GEP of this alloca.
----------------
Anastasia wrote:
> Why do we need to replace original block calls with the kernels? I think in case of calling a block we could use the original block function and only for enqueue use the kernel that would call the block function inside. The pointer to the kernel wrapper could be passed as an additional parameter to `enqueue_kernel` calls. We won't need to iterate through all IR then.
`CGF.EmitScalarExpr(Block)` returns the block literal structure which contains the size/align/invoke_function/captures. The block invoke function is stored to the struct by a `StoreInst`. To create the wrapper kernel, we need to get the block invoke function, therefore we have to iterate through IR.

Since we need to find the store instruction any way, it is simpler to just replace the stored function with the kernel and pass the block literal struct, instead of passing the kernel separately.


================
Comment at: lib/CodeGen/TargetInfo.cpp:8927
+llvm::Function *
+TargetCodeGenInfo::createEnqueuedBlockKernel(CodeGenFunction &CGF,
+                                             llvm::Function *Invoke,
----------------
Anastasia wrote:
> Could you add some comments please?
Will do.


================
Comment at: lib/CodeGen/TargetInfo.cpp:8949
+  Builder.restoreIP(IP);
+  return F;
+}
----------------
Anastasia wrote:
> Wondering if we should add the kernel metadata (w/o args) since it was used for long time to indicate the kernel.
Currently (before this change), clang already does not generate kernel metadata if there is no vec_type_hint, work_group_size_hint, reqd_work_group_size. Remember last time we made the change to use function metadata to represent these attributes. Whether a function is a kernel can be determined by its calling convention.


================
Comment at: lib/CodeGen/TargetInfo.h:35
 class Decl;
+class ASTContext;
 
----------------
Anastasia wrote:
> Do we need this?
Will remove it.


================
Comment at: test/CodeGenOpenCL/cl20-device-side-enqueue.cl:9
 
-// N.B. The check here only exists to set BL_GLOBAL
-// COMMON: @block_G =  addrspace(1) constant void (i8 addrspace(3)*) addrspace(4)* addrspacecast (void (i8 addrspace(3)*) addrspace(1)* bitcast ({ i32, i32, i8 addrspace(4)* } addrspace(1)* [[BL_GLOBAL:@__block_literal_global(\.[0-9]+)?]] to void (i8 addrspace(3)*) addrspace(1)*) to void (i8 addrspace(3)*) addrspace(4)*)
+// COMMON: %struct.__opencl_block_literal_generic = type { i32, i32, i8 addrspace(4)* }
+
----------------
Anastasia wrote:
> Can we check generated kernel function too?
will do.


https://reviews.llvm.org/D38134





More information about the cfe-commits mailing list