[PATCH] D38134: [OpenCL] Emit enqueued block as kernel

Anastasia Stulova via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 10 09:42:50 PDT 2017

Anastasia added inline comments.

Comment at: lib/CodeGen/CGOpenCLRuntime.cpp:144
+  if (auto *I = dyn_cast<llvm::Instruction>(V)) {
+    // If the block literal is emitted as an instruction, it is an alloca
+    // and the block invoke function is stored to GEP of this alloca.
yaxunl wrote:
> Anastasia wrote:
> > Why do we need to replace original block calls with the kernels? I think in case of calling a block we could use the original block function and only for enqueue use the kernel that would call the block function inside. The pointer to the kernel wrapper could be passed as an additional parameter to `enqueue_kernel` calls. We won't need to iterate through all IR then.
> `CGF.EmitScalarExpr(Block)` returns the block literal structure which contains the size/align/invoke_function/captures. The block invoke function is stored to the struct by a `StoreInst`. To create the wrapper kernel, we need to get the block invoke function, therefore we have to iterate through IR.
> Since we need to find the store instruction any way, it is simpler to just replace the stored function with the kernel and pass the block literal struct, instead of passing the kernel separately.
So we cann't get the invoke function from the block literal structure passed into the kernel wrapper directly knowing its offset? Iterating through IR adds extra time and also I am not sure how reliable this is wrt different corner cases of IR.

Comment at: lib/CodeGen/TargetInfo.cpp:8949
+  Builder.restoreIP(IP);
+  return F;
yaxunl wrote:
> Anastasia wrote:
> > Wondering if we should add the kernel metadata (w/o args) since it was used for long time to indicate the kernel.
> Currently (before this change), clang already does not generate kernel metadata if there is no vec_type_hint, work_group_size_hint, reqd_work_group_size. Remember last time we made the change to use function metadata to represent these attributes. Whether a function is a kernel can be determined by its calling convention.
Ok, let's leave it for now. We can always add it in on request. 

Comment at: test/CodeGenOpenCL/cl20-device-side-enqueue.cl:297
+// COMMON: define internal spir_kernel void [[INVG5]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define internal spir_kernel void [[INVG6]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}}, i8 addrspace(3)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define internal spir_kernel void [[INVG7]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
Perhaps we could check the body of this one too since it has a different prototype.


More information about the cfe-commits mailing list