[cfe-dev] Question about arrays in blocks

David Chisnall David.Chisnall at cl.cam.ac.uk
Thu Jun 20 02:48:16 PDT 2013


On 20 Jun 2013, at 07:45, "Adler, Arik" <arik.adler at intel.com> wrote:

> Another alternative is to copy the pointer and not the whole array (as it is done in C)

This is problematic, because it will work fine if the block is only passed down the stack, but cause stack corruption when the block is captured, which is a clear POLA (principle of least astonishment) violation.  If the pointer (not the array) is used in the block, then it's safe to assume that the user knows what she is doing in terms of memory management (or, at least, doesn't get to complain if she doesn't).
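
To make the distinction concrete, here is a minimal sketch in plain C that simulates the two capture strategies with hand-written structs. It does not use the real blocks ABI: the types `pointer_block` and `array_block` and all function names are hypothetical stand-ins invented for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a block that captured only a pointer to the array. */
struct pointer_block {
    int (*invoke)(const struct pointer_block *self);
    const int *captured;
};

/* Hypothetical stand-in for a block that captured the 4-element array by value. */
struct array_block {
    int (*invoke)(const struct array_block *self);
    int captured[4];
};

static int pointer_block_sum(const struct pointer_block *self) {
    int total = 0;
    for (size_t i = 0; i < 4; i++) total += self->captured[i];
    return total;
}

static int array_block_sum(const struct array_block *self) {
    int total = 0;
    for (size_t i = 0; i < 4; i++) total += self->captured[i];
    return total;
}

/* Fine: the pointer is only used while the frame that owns the array is live. */
int sum_down_the_stack(void) {
    int values[4] = {1, 2, 3, 4};
    struct pointer_block block = { pointer_block_sum, values };
    return block.invoke(&block);
}

/* Fine: the heap copy carries its own copy of the array, so it may outlive
 * this frame. Doing the same with pointer_block would leave `captured`
 * pointing at dead stack memory. The caller frees the result. */
struct array_block *make_escaping_block(void) {
    int values[4] = {1, 2, 3, 4};
    struct array_block on_stack = { array_block_sum, {0} };
    memcpy(on_stack.captured, values, sizeof values);

    struct array_block *on_heap = malloc(sizeof *on_heap);
    if (on_heap != NULL) memcpy(on_heap, &on_stack, sizeof *on_heap);
    return on_heap;
}
```

The pointer variant is only shown being used down the stack; returning it from `make_escaping_block` instead would be exactly the silent failure described above.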

Turning this into a warning, perhaps with a configurable size for the smallest thing to complain about, might be interesting.  It seems odd to me that we can put a 64 element array inside a stack-allocated C++ object and have that transparently moved to the heap if required, but we can't have a 4-element array by itself.
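
The same asymmetry can be sketched in C, assuming Clang with -fblocks: wrapping the array in a struct makes it an ordinary copyable value that a block can capture. The type `int4` and both function names are hypothetical, and the block part is guarded so the file still compiles where blocks are unavailable. On non-Apple platforms, linking the guarded part typically also needs a blocks runtime library.

```c
/* Hypothetical wrapper type: a struct is copied by value, a bare array is not. */
struct int4 {
    int v[4];
};

/* Takes the wrapper by value, so the whole array is copied on the call. */
int sum_int4(struct int4 w) {
    int total = 0;
    for (int i = 0; i < 4; i++) total += w.v[i];
    return total;
}

#ifdef __BLOCKS__ /* defined by Clang when compiling with -fblocks */
int sum_through_block(void) {
    struct int4 wrapped = { {1, 2, 3, 4} };
    /* The block captures a const copy of `wrapped`, array included. */
    int (^sum)(void) = ^{ return sum_int4(wrapped); };
    return sum();
}
#endif
```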

It's also worth noting that each time you move a block from the stack to the heap you are doing at least two heap allocations[1], which are likely to be significantly more expensive than a short memcpy to move this array to the new allocation.  The extra overhead when on the stack of indirecting via the block pointer is constant, not proportional to the size of the allocation.

I'm a bit confused by the original premise of using blocks within OpenCL, however, as much of the hardware that OpenCL is intended to target does not permit dynamic allocations.  Many of these constraints would go away if blocks never had to be copied to the heap.  We'd also be able to eliminate the second level of indirection in the blocks ABI and much of the metadata, as this is only required to permit copying blocks to the heap.  There are other issues related to the ABI that require dereferencing arbitrary pointers to stack memory, which are not permitted by some GPU architectures and are definitely something that you'd want to disallow for WebCL (because they make validation of the IR a nightmare - not an impossible task, but certainly a task of sufficient complexity that I wouldn't want security to be dependent on a correct implementation of it).

David

[1] The blocks ABI was carefully designed to allow multiple captured objects to be stored in a single byref structure, but I believe that we currently only store a single one.  If we have multiple objects that are captured by the same set of blocks, then we can put them in the same byref structure and save some memory allocations (and fragmentation), but I don't believe that we do this.
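
As a rough illustration of the saving, the sketch below counts allocations for one box per captured variable versus one shared box. The layouts are deliberately simplified and hypothetical (the real byref structure carries more header fields), and `counted_alloc` and the other names are invented for this example.

```c
#include <stddef.h>
#include <stdlib.h>

static size_t allocation_count = 0; /* number of calls to counted_alloc */

static void *counted_alloc(size_t size) {
    allocation_count++;
    return malloc(size);
}

/* Simplified, hypothetical byref-style boxes: one per captured variable. */
struct byref_int    { struct byref_int *forwarding;    int value; };
struct byref_double { struct byref_double *forwarding; double value; };

/* Simplified, hypothetical shared box holding both captured variables. */
struct byref_shared { struct byref_shared *forwarding; int count; double total; };

/* Moves two captured variables to the heap separately: two allocations. */
size_t allocations_separate(void) {
    allocation_count = 0;
    struct byref_int *count = counted_alloc(sizeof *count);
    struct byref_double *total = counted_alloc(sizeof *total);
    if (count != NULL) { count->forwarding = count; count->value = 0; }
    if (total != NULL) { total->forwarding = total; total->value = 0.0; }
    free(count);
    free(total);
    return allocation_count;
}

/* Moves the same two variables in one shared box: one allocation. */
size_t allocations_shared(void) {
    allocation_count = 0;
    struct byref_shared *both = counted_alloc(sizeof *both);
    if (both != NULL) { both->forwarding = both; both->count = 0; both->total = 0.0; }
    free(both);
    return allocation_count;
}
```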


