[cfe-dev] About OpenCL 2.x Dynamic Parallelism
Anastasia Stulova via cfe-dev
cfe-dev at lists.llvm.org
Tue Mar 22 10:40:46 PDT 2016
> I like this idea, but it might not be so easy to allocate space in the global or local address space and tell the sub-kernel to access it without exhausting the available storage. Maybe a memory manager is required.
Yes, I think some sort of simplified global/local memory heap support would be required in the implementation of enqueue_kernel on the device.
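To make the "simplified heap" idea concrete, here is a minimal host-side sketch in C++ of a bump allocator that copies a block's captured data into a fixed pool. It is only an illustration under stated assumptions: BumpHeap, heap_alloc, copy_captures and heap_reset are hypothetical names, the pool stands in for a region of global or local memory, and nothing here reflects how any actual enqueue_kernel implementation manages storage.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical fixed-size pool standing in for a region of global or
// local memory reserved for child-kernel captures.
struct BumpHeap {
    unsigned char *base;   // start of the pool (assumed suitably aligned)
    std::size_t capacity;  // total bytes in the pool
    std::size_t used;      // bytes handed out so far
};

// Returns `size` bytes at an offset aligned to `align` (a power of two),
// or nullptr when the pool cannot satisfy the request.
inline void *heap_alloc(BumpHeap &h, std::size_t size, std::size_t align) {
    std::size_t start = (h.used + align - 1) & ~(align - 1);
    if (start > h.capacity || size > h.capacity - start)
        return nullptr;
    h.used = start + size;
    return h.base + start;
}

// Copies a block's captured state into the pool so that it can outlive
// the enqueuing work-item. Returns nullptr if the pool is exhausted.
inline void *copy_captures(BumpHeap &h, const void *src,
                           std::size_t size, std::size_t align) {
    void *dst = heap_alloc(h, size, align);
    if (dst)
        std::memcpy(dst, src, size);
    return dst;
}

// Releases everything at once, e.g. after all child kernels have finished.
inline void heap_reset(BumpHeap &h) { h.used = 0; }
```

A bump allocator cannot free individual allocations, which is the "draining" concern raised above: the pool only recovers space on a full reset, so a real implementation would need either a bound on outstanding enqueues or a more capable allocator.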
From: Bekket McClane [mailto:bekket.mcclane at gmail.com]
Sent: 21 March 2016 13:48
To: David Chisnall
Cc: cfe-dev at lists.llvm.org; Anastasia Stulova
Subject: Re: [cfe-dev] About OpenCL 2.x Dynamic Parallelism
In [Objective-]C, if the block is expected to persist beyond the lifetime of the caller, then the callee is expected to call _Block_copy to promote it to the heap. The compiler emits copy helpers (and descriptors for captured variables that have trivial copy semantics) that allow this to work with a little bit of support from the blocks runtime library.
For OpenCL, you may want to generalise this slightly to provide different target address spaces for the copy,
I like this idea, but it might not be so easy to allocate space in the global or local address space and tell the sub-kernel to access it without exhausting the available storage. Maybe a memory manager is required.
but note that for __block to work correctly the target address space must be readable (and writeable) in the context of the caller. If you do not support __block variables then this is not an issue.
Fortunately, OpenCL C 2.x doesn't allow the __block attribute :)
It sounds as if OpenCL's requirements are much simpler than [Objective-]C's. I looked at implementing flattening for blocks a few years ago, but it becomes quite complex when a single variable is bound to multiple blocks and the potential performance improvements did not justify the increased complexity. This is not an issue for you though.
It sounds as if OpenCL's blocks are actually far closer to C++ lambdas with a default copy capture than they are to [Objective-]C blocks. It might be cleaner and simpler to treat them as special syntax for lambdas than as special semantics for blocks.
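As a small illustration of the comparison with lambdas, the following C++ sketch shows what a default copy capture ([=]) means in practice: the closure holds its own copy of each captured variable, so it remains valid after the creating function returns and does not observe later writes to the original. The helper names make_scaler and capture_snapshot_demo are hypothetical and exist only for this example; this is plain C++ and says nothing about how Clang actually lowers OpenCL blocks.

```cpp
#include <functional>

// Hypothetical helper: returns a callable that captured `scale` by copy.
inline std::function<int(int)> make_scaler(int scale) {
    // [=] copies `scale` into the closure object, so the closure is still
    // valid after make_scaler returns and its parameter is gone.
    return [=](int x) { return x * scale; };
}

// Shows that the closure sees the value at capture time, not later writes.
inline int capture_snapshot_demo() {
    int offset = 10;
    auto add_offset = [=](int x) { return x + offset; };
    offset = 99;           // does not affect the copy held by the closure
    return add_offset(1);  // uses the captured value 10
}
```

Because every capture is a self-contained copy, the closure can be moved to other storage with a plain byte or member-wise copy, which is why the absence of __block variables simplifies things so much.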