[LLVMdev] Add a new llvm intrinsic?

Owen Anderson resistor at mac.com
Mon Nov 11 11:09:00 PST 2013


Hi Jeff,

It’s not really meaningful to talk about threads being created in the context of an OpenCL kernel.  The other threads are always present.

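void kernel(int * restrict array, int * restrict array2) {
	// Every thread reads array[0] before the first barrier.
	int value = array[0] + get_thread_id() + 1;
	barrier();
	// Thread 0 overwrites array[0] here, between the two barriers.
	array[get_thread_id()] = value;
	barrier();
	// This read must observe thread 0's store to array[0].
	array2[get_thread_id()] = array[0];
}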

In this example code, the kernel is well synchronized; there are no data races on any elements of either array.  However, the results will differ if we CSE the later read of array[0] with the earlier one.  Executed as written, the final value of array2[0] will be the original value of array[0] plus 1, because thread 0 stores that value to array[0] between the two barriers.  If we perform the CSE, the result will be just the original value of array[0].

—Owen

On Nov 10, 2013, at 9:16 PM, Jeffrey Yasskin <jyasskin at googlers.com> wrote:

> Sorry for the delay in getting back to you. I don't know if anything
> came out of this, since Xiaoyi never wrote back. What does some of the
> affected code look like? My opinion is still that 'restrict' should
> mean that no other thread should use a pointer aliasing the restrict
> pointer, although if many threads are started after the lifetime of
> the restrict pointer starts, and they depend on the value of the
> restrict pointer, and they're joined, and then a use of the restrict
> pointer is moved ahead of the join so that it races with the other
> threads that depend on the restrict pointer, that's definitely an LLVM
> bug.
> 
> On Fri, Nov 8, 2013 at 1:50 PM, Owen Anderson <resistor at mac.com> wrote:
>> Hi Jeff,
>> 
>> Do you know if anything came of this?  I understand we may need to seek
>> clarification to get a formal answer, particularly with respect to C, but it
>> seems pretty clear to me that this is a significant QoI issue, both for C
>> and CL.  LLVM is effectively hoisting a load above a thread-join.  This may
>> or may not technically be allowed in C, but it seems generally undesirable, and
>> it’s extremely undesirable in CL where these kinds of thread joins are a
>> fundamental part of the programming model.
>> 
>> —Owen
>> 
>> 
>> On Aug 6, 2013, at 5:36 PM, Jeffrey Yasskin <jyasskin at googlers.com> wrote:
>> 
>> Chandler pointed out another interpretation of C11/6.7.3.1, in which
>> 'restrict' only addresses aliasing within a single thread. If that's
>> the right interpretation, then it's a bug in LLVM that it moves
>> noalias pointers across memory-ordering operations at all, and you
>> still don't need a new fence, just a bug fix.
>> 
>> 6.7.3.1 says "During each execution of B, ...". "During" could either
>> mean just within the same thread or within any segment of a thread
>> that doesn't happen-before or happen-after B.
>> 
>> It's a defect in C that this is ambiguous. Anyone want to volunteer to
>> send it to the committee? (I'll be happy to proofread, etc., just not
>> be in charge of finding the right email target)
>> 
>> On Tue, Aug 6, 2013 at 5:01 PM, Jeffrey Yasskin <jyasskin at googlers.com>
>> wrote:
>> 
>> This sounds a lot like the question at
>> http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/064462.html. It
>> sounds like you have a pointer marked 'restrict', but it's actually
>> aliased in another thread. That would be undefined behavior even with
>> a stronger fence.
>> 
>> On Tue, Aug 6, 2013 at 4:56 PM, Guo, Xiaoyi <Xiaoyi.Guo at amd.com> wrote:
>> 
>> Hi,
>> 
>> In OpenCL, the "barrier()" function, as well as various target specific
>> memory fence intrinsics, should prevent loads/stores of the relevant address
>> space from being moved across them.
>> Kernel pointers with "restrict" attributes are implemented by marking the
>> pointer "noalias" in the LLVM IR. However, in LLVM, "noalias" pointers are
>> not affected by llvm memory fence instructions.
>> 
>> To make sure all loads/stores, including those accessing "restrict" pointers,
>> are not moved across the barrier/fence intrinsics, we have considered using
>> customized alias analysis passes. However, we would like to move away from
>> using customized passes and would like to use standard llvm mechanisms as
>> much as possible.
>> 
>> What do people think about adding an llvm intrinsic, something like
>> llvm.opencl.mem_fence(i32) (or named something that doesn't have opencl in
>> the name, llvm.addrspace_fence?), which acts as a fence for a single given
>> address space (assuming again that there's no problem with implementing
>> these things as a series of different functions to get the full effect), and
>> which prevents even noalias pointers from being moved across it?
>> 
>> Alternatively (possibly nicer) would be something that looks like the memset
>> intrinsic, which can work for any address space.
>> llvm.addrspace_fence.p1.p2(void)
>> llvm.addrspace_fence.p1(void) ...
>> 
>> Thanks,
>> Xiaoyi
>> 
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
