[cfe-dev] [OpenCL] clang can't compile a simple enqueue_kernel with default opt level

Liu, Yaxun (Sam) via cfe-dev cfe-dev at lists.llvm.org
Tue Feb 13 08:06:29 PST 2018


I have a fix for this issue https://reviews.llvm.org/D43240

Thanks.

Sam

From: Anastasia Stulova [mailto:Anastasia.Stulova at arm.com]
Sent: Monday, February 12, 2018 12:56 PM
To: Liu, Yaxun (Sam) <Yaxun.Liu at amd.com>; Bessonova, Kristina <kristina.bessonova at intel.com>; cfe-dev at lists.llvm.org
Cc: Sumner, Brian <Brian.Sumner at amd.com>; nd <nd at arm.com>
Subject: Re: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level


>The wrapper kernel needs to call the block invoke function, which is created by emitOpenCLEnqueuedBlock.

Yes, but do we actually need the block definition for emitting the call (i.e. llvm::Value for the block invoke function)?
auto *V = CGF.EmitBlockLiteral(cast<BlockExpr>(Block), &Invoke);

Could we just recreate the function prototype only, while emitting the kernel body inside createEnqueuedBlockKernel (https://clang.llvm.org/doxygen/classclang_1_1CodeGen_1_1TargetCodeGenInfo.html#ad3d9243e5f91aa7bc815241461b136bd)?
________________________________
From: Liu, Yaxun (Sam) <Yaxun.Liu at amd.com>
Sent: 09 February 2018 21:31
To: Anastasia Stulova; Bessonova, Kristina; cfe-dev at lists.llvm.org
Cc: Sumner, Brian; nd
Subject: RE: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level


Thanks Anastasia for investigating the issue.



The wrapper kernel needs to call the block invoke function, which is created by emitOpenCLEnqueuedBlock.



Sam



From: Anastasia Stulova [mailto:Anastasia.Stulova at arm.com]
Sent: Thursday, February 08, 2018 4:05 PM
To: Liu, Yaxun (Sam) <Yaxun.Liu at amd.com>; Bessonova, Kristina <kristina.bessonova at intel.com>; cfe-dev at lists.llvm.org
Cc: Sumner, Brian <Brian.Sumner at amd.com>; nd <nd at arm.com>
Subject: Re: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level





It seems that one of the assumptions in emitOpenCLEnqueuedBlock, namely that E is a BlockExpr, does not hold, and therefore the assertion is triggered inside the cast<BlockExpr>() call. This can be easily fixed by calling the IgnoreImplicit() method.
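
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is a toy model only: the Expr, BlockExpr and ImplicitCastExpr structs and the helpers isBlockExpr, castToBlockExpr and ignoreImplicit are hypothetical stand-ins written for this example, not clang's real classes or APIs. It assumes the block expression is hidden behind one or more implicit wrapper nodes, so a checked cast on the outer node fails while a cast after stripping the wrappers succeeds.

```cpp
#include <cassert>

// Toy stand-ins for AST nodes (hypothetical, NOT clang's real classes).
struct Expr {
  enum Kind { BlockKind, ImplicitCastKind };
  explicit Expr(Kind K) : kind(K) {}
  virtual ~Expr() = default;
  Kind kind;
};

struct BlockExpr : Expr {
  BlockExpr() : Expr(BlockKind) {}
};

// An implicit node that wraps another expression.
struct ImplicitCastExpr : Expr {
  explicit ImplicitCastExpr(const Expr *Sub)
      : Expr(ImplicitCastKind), sub(Sub) {}
  const Expr *sub;
};

// Type test in the spirit of isa<BlockExpr>(E).
bool isBlockExpr(const Expr *E) { return E->kind == Expr::BlockKind; }

// Checked downcast in the spirit of cast<BlockExpr>(E): it asserts when
// the node is not a BlockExpr, which is the kind of failure reported.
const BlockExpr *castToBlockExpr(const Expr *E) {
  assert(isBlockExpr(E) && "argument of incompatible type!");
  return static_cast<const BlockExpr *>(E);
}

// Hypothetical helper: peel off implicit wrappers until a
// non-implicit node is reached.
const Expr *ignoreImplicit(const Expr *E) {
  while (E->kind == Expr::ImplicitCastKind)
    E = static_cast<const ImplicitCastExpr *>(E)->sub;
  return E;
}
```

In this model, calling castToBlockExpr directly on a wrapped node trips the assertion, while castToBlockExpr(ignoreImplicit(E)) returns the inner block node.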

But then there is another issue, caused by the block being generated both during the AST parsing of the block declaration and on the enqueue_kernel builtin generation path. I think we should avoid generating the block literal from the enqueue_kernel generation path, since this doesn't really match the original compilation flow.



@Sam, I was just wondering whether we could avoid generating the literal inside emitOpenCLEnqueuedBlock and instead pass the name of the block and the number of its parameters to createEnqueuedBlockKernel, or perhaps pass the BlockExpr? The wrapper kernel function is pretty simple in the end and doesn't require the block invoke function itself (it can just rebuild the prototype). What do you think?
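
As a rough analogy for the "rebuild the prototype" idea, the sketch below shows a wrapper that is written against a declaration only, with the definition supplied separately. Everything here is hypothetical and chosen for illustration (demo_block_invoke, demo_block_invoke_kernel, DemoContext); it is plain C++, not the code clang generates, and it assumes the invoke function takes a single pointer to the captured context.

```cpp
// Hypothetical prototype of a block invoke function. The wrapper below
// only needs this declaration, not the function body.
extern "C" void demo_block_invoke(void *context);

// Hypothetical wrapper "kernel": it simply forwards the captured
// context to the invoke function.
void demo_block_invoke_kernel(void *context) {
  demo_block_invoke(context);
}

// Hypothetical captured state, mirroring the variables the block in the
// test case captures (id and out).
struct DemoContext {
  int id;
  int *out;
};

// The definition is provided separately, after the wrapper has already
// been written against the prototype alone.
extern "C" void demo_block_invoke(void *context) {
  auto *ctx = static_cast<DemoContext *>(context);
  ctx->out[ctx->id] = ctx->id;
}
```

The point of the sketch is only that the wrapper depends on the name and signature of the invoke function, which is what would allow the literal to be emitted once on the declaration path.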



Anastasia

________________________________

From: Liu, Yaxun (Sam) <Yaxun.Liu at amd.com>
Sent: 05 February 2018 18:22
To: Bessonova, Kristina; cfe-dev at lists.llvm.org
Cc: Sumner, Brian; Anastasia Stulova
Subject: RE: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level



Right. I am taking a look.



Thanks.



Sam



From: Bessonova, Kristina [mailto:kristina.bessonova at intel.com]
Sent: Monday, February 05, 2018 12:15 PM
To: cfe-dev at lists.llvm.org
Cc: Sumner, Brian <Brian.Sumner at amd.com>; Liu, Yaxun (Sam) <Yaxun.Liu at amd.com>; anastasia.stulova at arm.com
Subject: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level



Hi all,



I've noticed that clang fails to compile a simple OpenCL kernel with an enqueue_kernel() call if the optimization level isn't set:



$ cat test.cl
void foo(size_t id, __global int* out) {
  out[id] = id;
}

kernel void enqueue_foo(__global int* out) {
  size_t id = get_global_id(0);

  void (^fooBlock)(void) = ^{ foo(id, out); };

  queue_t queue = get_default_queue();
  ndrange_t ndrange = ndrange_1D(1);
  enqueue_kernel(queue, CLK_ENQUEUE_FLAGS_WAIT_KERNEL, ndrange, fooBlock);
}



$ build/bin/clang --version
clang version 7.0.0 (https://git.llvm.org/git/clang.git/ 08712fff7fba84b88e2e57b3c739d53b1aab1ed6)

$ clang -cc1 -emit-llvm -x cl -triple spir64-unknown-unknown -finclude-default-header -cl-std=CL2.0 test.cl



clang: ./llvm/include/llvm/Support/Casting.h:255: typename llvm::cast_retty<X, Y*>::ret_type llvm::cast(Y*) [with X = clang::BlockExpr; Y = const clang::Expr; typename llvm::cast_retty<X, Y*>::ret_type = const clang::BlockExpr*]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
...
#9 llvm::cast_retty<clang::BlockExpr, clang::Expr const*>::ret_type llvm::cast<clang::BlockExpr, clang::Expr const>(clang::Expr const*) ./llvm/include/llvm/Support/Casting.h:257:0
#10 clang::CodeGen::CGOpenCLRuntime::emitOpenCLEnqueuedBlock(clang::CodeGen::CodeGenFunction&, clang::Expr const*) ./llvm/tools/clang/lib/CodeGen/CGOpenCLRuntime.cpp:125:0
#11 clang::CodeGen::CodeGenFunction::EmitBuiltinExpr(clang::FunctionDecl const*, unsigned int, clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) ./llvm/tools/clang/lib/CodeGen/CGBuiltin.cpp:3017:0
#12 clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) ./llvm/tools/clang/lib/CodeGen/CGExpr.cpp:4218:0
#13 (anonymous namespace)::ScalarExprEmitter::VisitCallExpr(clang::CallExpr const*) ./llvm/tools/clang/lib/CodeGen/CGExprScalar.cpp:517:0
#14 clang::StmtVisitorBase<clang::make_ptr, (anonymous namespace)::ScalarExprEmitter, llvm::Value*>::Visit(clang::Stmt*) ./llvm/build/tools/clang/include/clang/AST/StmtNodes.inc:329:0
#15 (anonymous namespace)::ScalarExprEmitter::Visit(clang::Expr*) ./llvm/tools/clang/lib/CodeGen/CGExprScalar.cpp:358:0



Clang compiles OpenCL sources with the '-O2' optimization level by default. However, the tests for enqueue_kernel() in clang are compiled with '-O0'.

So, it seems we have a bug here. Am I right?



Thanks,

Kristina
