[LLVMdev] Loads moving across barriers

Fri Dec 6 14:17:35 PST 2013

On 12/04/2013 05:25 PM, Andrew Trick wrote:
>
> On Dec 4, 2013, at 5:19 PM, Matt Arsenault <Matthew.Arsenault at amd.com>
> wrote:
>
>> On 12/04/2013 04:29 PM, Andrew Trick wrote:
>>> On Dec 4, 2013, at 3:33 PM, Matt Arsenault 
>>> <Matthew.Arsenault at amd.com> wrote:
>>>
>>>> On 11/11/2013 03:13 PM, Andrew Trick wrote:
>>>>> On Nov 9, 2013, at 1:39 PM, Matt Arsenault <arsenm2 at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On Nov 9, 2013, at 3:14 AM, Chandler Carruth 
>>>>>> <chandlerc at google.com> wrote:
>>>>>>
>>>>>>> Perhaps you're instead trying to say that with certain address 
>>>>>>> spaces "noalias" (and by inference, "restrict" at the language 
>>>>>>> level) has a different semantic model than other address spaces? 
>>>>>>> While it's less worrisome than the first interpretation, I still 
>>>>>>> don't really like it.
>>>>>>>
>>>>>> This sounds right. With the constant address space, anything you 
>>>>>> do is OK since it’s constant. Private address space is supposed 
>>>>>> to be totally inaccessible from other workitems, so parallel 
>>>>>> modifications aren’t a concern. The others require explicit 
>>>>>> synchronization which noalias would need to be aware of.
>>>>> FWIW, it seems generally useful to me to have a nomemfence 
>>>>> function attribute and intrinsic property. We should avoid memory 
>>>>> optimization (and possibly other optimization) across these 
>>>>> regardless of alias analysis.
>>>>>
>>>> I think I'll try implementing this. Ideally it would be 
>>>> parameterized over the address space, so it makes more sense for it 
>>>> to be a memfence attribute rather than a nomemfence. You would then 
>>>> have an arbitrary number of memfence(N) attributes for each 
>>>> required address space.
>>> So for correctness, would we need to tag all functions with 
>>> memfence(0..M) until we can prove otherwise? That seems heinous.
>> I was thinking the absence of it would mean no memfence in any 
>> address space, which is the current behavior. This adds the option of 
>> fencing.
>>> Better to have an optional attribute that can be added to expose 
>>> optimization. Is it important in practice to optimize the case of 
>>> memfence(I) + nomemfence(J)?
>> I think it would be important for the GPU case. You never need a 
>> memfence for private address space / addrspace 0, but you frequently 
>> want them for local or global. The local or global writes can't be 
>> reordered, but it could be very beneficial to move the private 
>> accesses across fences which might help reduce register usage.
>>
>>>  If so, is there a problem with nomemfence(N)?
>> nomemfence is the current assumption made on an arbitrary call, and 
>> it's the common case. Specifying the absence of a fence seems 
>> backwards from how this is used and more cumbersome to deal with. To 
>> match the current behavior, it would require littering nomemfence for 
>> any possible address space everywhere. In OpenCL you specify your 
>> fences, so it would be more straightforward to map that. If I have a 
>> memfence intrinsic, I just need to mark it with the fence attribute, 
>> and then propagate it to its callers. There would generally only be a 
>> few of them in any program compared to fenceless calls. To implement 
>> this with nomemfence, I would have to mark every function with at 
>> least 4 nomemfences, and remove them when encountering the memfence 
>> intrinsic.
>
> Sure, but the program still needs to be correct if you skip attribute 
> propagation.
> -Andy
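
To make the quoted exchange concrete, here is a minimal sketch of the opt-in scheme: a memfence(N) marking that starts at a fence intrinsic and is propagated to callers, plus the query an optimizer would ask before moving an access across a call. This is a toy model, not an existing LLVM API. The dict-based call graph, the function names, and the address space numbers (0 = private as stated above, 1 and 3 standing in for global and local) are assumptions made for illustration.

```python
from collections import defaultdict


def propagate_memfence(direct_fences, calls):
    """Return {function: set of address spaces it may fence}.

    direct_fences: {function: address spaces fenced directly} (hypothetical input)
    calls:         {caller: set of callees}                   (hypothetical input)
    """
    fences = defaultdict(set)
    for fn, spaces in direct_fences.items():
        fences[fn] |= spaces
    changed = True
    while changed:  # iterate to a fixed point so call cycles are handled
        changed = False
        for caller, callees in calls.items():
            for callee in callees:
                missing = fences[callee] - fences[caller]
                if missing:
                    fences[caller] |= missing
                    changed = True
    return dict(fences)


def may_move_optin(access_space, callee, fences):
    """Opt-in reading: a function with no marking is assumed to fence nothing."""
    return access_space not in fences.get(callee, set())


# Hypothetical program: kernel -> helper -> barrier_intrinsic
direct = {"barrier_intrinsic": {1, 3}}
calls = {"helper": {"barrier_intrinsic"}, "kernel": {"helper"}}
fences = propagate_memfence(direct, calls)

print(may_move_optin(0, "helper", fences))  # True: private access may cross
print(may_move_optin(3, "helper", fences))  # False: local access is pinned
# If propagation is skipped, helper carries no marking and looks fence-free:
print(may_move_optin(3, "helper", direct))  # True, which is unsound
```

The last line is the case raised in the quoted reply: under an opt-in attribute, the answer is only trustworthy if propagation has actually run.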

Is that actually a real concern? My main problem with nomemfence is 
this: how do you mark a function as not fencing any of the other 
address spaces you might care about at its call sites? I suppose 
nomemfence without an address space could mean nomemfence for every 
address space, but that only reduces the problem to the case where a 
few address spaces are fenced. How do you know which other address 
spaces need to be marked? Do you add a nomemfence for every address 
space encountered in functions with call sites? What if those are in 
another module?
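
A minimal sketch of the opt-out reading may help show where the enumeration problem comes from. It is a toy model, not an existing LLVM API. The `ALL_SPACES` marker (standing in for a bare nomemfence with no address space), the function names, and the address space numbers are all assumptions made for illustration.

```python
ALL_SPACES = "all"  # hypothetical marker: bare nomemfence, no address space given


def may_move_optout(access_space, callee, nomemfence):
    """Opt-out reading: a call may fence every space it is not cleared for.

    nomemfence: {function: set of cleared address spaces, or ALL_SPACES}
    """
    marked = nomemfence.get(callee, set())
    if marked == ALL_SPACES:
        return True
    return access_space in marked


# helper was cleared for the address spaces known when it was compiled.
nomemfence = {"helper": {0, 1, 2, 3}, "pure_math": ALL_SPACES}

print(may_move_optout(0, "helper", nomemfence))        # True
# Address space 5 only appears in some other module and was never listed:
print(may_move_optout(5, "helper", nomemfence))        # False
print(may_move_optout(5, "pure_math", nomemfence))     # True
# An unmarked call blocks everything, including private accesses:
print(may_move_optout(0, "unknown_call", nomemfence))  # False
```

In this model a missing marking is always safe but pessimistic. A function that fences only a few address spaces cannot use the bare form, so it has to list every other space explicitly, and that list is only complete if every address space in use is known.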