[PATCH] D37822: [OpenCL] Clean up and add missing fields for block struct
    Yaxun Liu via Phabricator via cfe-commits 
    cfe-commits at lists.llvm.org
       
    Fri Sep 15 12:51:56 PDT 2017
    
    
  
yaxunl marked 2 inline comments as done.
yaxunl added a comment.
In https://reviews.llvm.org/D37822#872291, @Anastasia wrote:
> Could you please explain a bit more why the alignment have to be put explicitly in the struct? I am just not very convinced this is general enough.
The captured variables are fields of the block literal struct. Due to alignment requirement of these fields, there is alignment requirement of
the block literal struct. The ISA of the block invoke function is generated with the assumption of these alignments. If the block literal is
allocated at a memory address not satisfying the alignment requirement, the kernel behavior is undefined.
Generally, __enqueue_kernel library function needs to prepare the kernel argument before launching the kernel. It usually does this by copying
the block literal to some buffer then pass the address of the buffer to the kernel. Then the address of the buffer has to satisfy the alignment
requirement.
If this block literal struct is not general enough, how about add another field as target reserved size, and leave the remaining space of header for
target specific use. And add a target hook to allow target fill the reserved space, e.g.
  struct __opencl_block_literal {
    int total_size;
    int align;
    __generic void *invoke;
    int target_reserved_size; /* round up to 4 bytes */
    int target_reserved[];
    /* captures */
  };
================
Comment at: lib/CodeGen/CGBlocks.cpp:314
 
   assert(elementTypes.empty());
-  elementTypes.push_back(CGM.VoidPtrTy);
----------------
Anastasia wrote:
> Why removing this?
It is not removed. It is moved to line 307.
================
Comment at: test/CodeGenOpenCL/blocks.cl:17
   int i;
-// Checking for null instead of @_NSConcreteStackBlock symbol
-// COMMON: store i8* null, i8** %block.isa
+  // COMMON-NOT: store i8* null, i8** %block.isa
+  // COMMON: %[[block_size:.*]] = getelementptr inbounds <{ i32, i32, i8 addrspace(4)*, i32 }>, <{ i32, i32, i8 addrspace(4)*, i32 }>* %block, i32 0, i32 0
----------------
Anastasia wrote:
> We don't need to check other fields too?
will add checks.
https://reviews.llvm.org/D37822
    
    
More information about the cfe-commits
mailing list