[cfe-dev] [OpenCL] clang can't compile a simple enqueue_kernel with default opt level

Anastasia Stulova via cfe-dev cfe-dev at lists.llvm.org
Thu Feb 8 13:05:25 PST 2018


It seems the assumption in emitOpenCLEnqueuedBlock that E is a BlockExpr does not hold, so the assert inside the cast is triggered. This can be fixed easily by calling the IgnoreImplicit() method.
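
To illustrate why stripping implicit nodes avoids the failed cast, here is a minimal, self-contained sketch. It does not use clang's real classes: Expr, BlockExpr, ImplicitCastExpr, ignoreImplicit and asBlockExpr below are simplified, hypothetical stand-ins, and the assumption is that the block literal is reached through one or more implicit wrapper nodes.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for clang's AST nodes (not the real API).
struct Expr {
  enum Kind { BlockKind, ImplicitCastKind };
  explicit Expr(Kind K) : TheKind(K) {}
  virtual ~Expr() = default;
  Kind getKind() const { return TheKind; }

private:
  Kind TheKind;
};

struct BlockExpr : Expr {
  BlockExpr() : Expr(BlockKind) {}
};

// Models an implicit node that wraps another expression.
struct ImplicitCastExpr : Expr {
  explicit ImplicitCastExpr(const Expr *Sub)
      : Expr(ImplicitCastKind), Sub(Sub) {}
  const Expr *getSubExpr() const { return Sub; }

private:
  const Expr *Sub;
};

// Peels off every implicit wrapper, in the spirit of IgnoreImplicit().
const Expr *ignoreImplicit(const Expr *E) {
  while (E->getKind() == Expr::ImplicitCastKind)
    E = static_cast<const ImplicitCastExpr *>(E)->getSubExpr();
  return E;
}

// Checked downcast: returns nullptr where an asserting cast would abort.
const BlockExpr *asBlockExpr(const Expr *E) {
  return E->getKind() == Expr::BlockKind ? static_cast<const BlockExpr *>(E)
                                         : nullptr;
}
```

With a block wrapped in two implicit nodes, asBlockExpr on the outer node yields nullptr, while asBlockExpr(ignoreImplicit(...)) returns the block itself.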

But then there is another issue: the block is generated both during the AST parsing of the block declaration and again on the enqueue_kernel builtin generation path. I think we should avoid generating the block literal from the enqueue_kernel generation, since this doesn't really match the original compilation flow.

@Sam, I was just wondering whether we could avoid generating the literal inside emitOpenCLEnqueuedBlock and instead pass the name of the block and the number of its params to createEnqueuedBlockKernel, or perhaps pass the BlockExpr? The wrapper kernel function is pretty simple in the end and doesn't require the block invoke function itself (it can just rebuild the prototype). What do you think?
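
As a rough illustration of the idea (not clang's actual code), the sketch below rebuilds a textual prototype from nothing but a name and a parameter count. EnqueuedBlockInfo and rebuildInvokePrototype are hypothetical names, and the parameter types in the generated string are placeholders I am assuming for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of an enqueued block; clang's real data differs.
struct EnqueuedBlockInfo {
  std::string InvokeName;  // name of the block invoke function
  unsigned NumLocalParams; // parameters that follow the block context
};

// Rebuilds a textual prototype from the name and parameter count alone,
// without needing the invoke function itself. Types are placeholders.
std::string rebuildInvokePrototype(const EnqueuedBlockInfo &Info) {
  std::string Proto = "void " + Info.InvokeName + "(void *ctx";
  for (unsigned I = 0; I != Info.NumLocalParams; ++I)
    Proto += ", local void *p" + std::to_string(I);
  return Proto + ")";
}
```

The point is only that the name and parameter count carry enough information to reconstruct the signature; a real implementation would build an LLVM function type rather than a string.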

Anastasia

________________________________
From: Liu, Yaxun (Sam) <Yaxun.Liu at amd.com>
Sent: 05 February 2018 18:22
To: Bessonova, Kristina; cfe-dev at lists.llvm.org
Cc: Sumner, Brian; Anastasia Stulova
Subject: RE: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level


Right. I am taking a look.

Thanks.

Sam



From: Bessonova, Kristina [mailto:kristina.bessonova at intel.com]
Sent: Monday, February 05, 2018 12:15 PM
To: cfe-dev at lists.llvm.org
Cc: Sumner, Brian <Brian.Sumner at amd.com>; Liu, Yaxun (Sam) <Yaxun.Liu at amd.com>; anastasia.stulova at arm.com
Subject: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level



Hi all,



I’ve noticed that clang fails to compile a simple OpenCL kernel with an enqueue_kernel() call if the optimization level isn’t set:



$ cat test.cl
void foo(size_t id, __global int* out) {
  out[id] = id;
}

kernel void enqueue_foo(__global int* out) {
  size_t id = get_global_id(0);

  void (^fooBlock)(void) = ^{ foo(id, out); };

  queue_t queue = get_default_queue();
  ndrange_t ndrange = ndrange_1D(1);
  enqueue_kernel(queue, CLK_ENQUEUE_FLAGS_WAIT_KERNEL, ndrange, fooBlock);
}



$ build/bin/clang --version

clang version 7.0.0 (https://git.llvm.org/git/clang.git/ 08712fff7fba84b88e2e57b3c739d53b1aab1ed6)



$ clang -cc1 -emit-llvm -x cl -triple spir64-unknown-unknown -finclude-default-header -cl-std=CL2.0 test.cl



clang: ./llvm/include/llvm/Support/Casting.h:255: typename llvm::cast_retty<X, Y*>::ret_type llvm::cast(Y*) [with X = clang::BlockExpr; Y = const clang::Expr; typename llvm::cast_retty<X, Y*>::ret_type = const clang::BlockExpr*]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.

…

#9 llvm::cast_retty<clang::BlockExpr, clang::Expr const*>::ret_type llvm::cast<clang::BlockExpr, clang::Expr const>(clang::Expr const*) ./llvm/include/llvm/Support/Casting.h:257:0
#10 clang::CodeGen::CGOpenCLRuntime::emitOpenCLEnqueuedBlock(clang::CodeGen::CodeGenFunction&, clang::Expr const*) ./llvm/tools/clang/lib/CodeGen/CGOpenCLRuntime.cpp:125:0
#11 clang::CodeGen::CodeGenFunction::EmitBuiltinExpr(clang::FunctionDecl const*, unsigned int, clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) ./llvm/tools/clang/lib/CodeGen/CGBuiltin.cpp:3017:0
#12 clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) ./llvm/tools/clang/lib/CodeGen/CGExpr.cpp:4218:0
#13 (anonymous namespace)::ScalarExprEmitter::VisitCallExpr(clang::CallExpr const*) ./llvm/tools/clang/lib/CodeGen/CGExprScalar.cpp:517:0
#14 clang::StmtVisitorBase<clang::make_ptr, (anonymous namespace)::ScalarExprEmitter, llvm::Value*>::Visit(clang::Stmt*) ./llvm/build/tools/clang/include/clang/AST/StmtNodes.inc:329:0
#15 (anonymous namespace)::ScalarExprEmitter::Visit(clang::Expr*) ./llvm/tools/clang/lib/CodeGen/CGExprScalar.cpp:358:0



Clang compiles OpenCL sources at the ‘-O2’ optimization level by default. However, the tests for enqueue_kernel() in clang are compiled with ‘-O0’.

So, it seems we have a bug here. Am I right?



Thanks,

Kristina
