[llvm-dev] Memory barrier problem
    Johannes Doerfert via llvm-dev 
    llvm-dev at lists.llvm.org
       
    Fri Feb 12 11:41:32 PST 2021
    
    
  
On 2/4/21 2:04 AM, Jeroen Dobbelaere wrote:
>>> So a weaker `noalias` or a way to mark uses seems therefore required for
>>> `noalias` deduction.
>> Appears to be that way. Can we do that w/o having a weaker restrict in the
>> language spec?
> The full restrict[0] implementation does not depend on the 'noalias' attribute on
> function arguments. The attribute is even too strong for just mapping a
> 'C99 restrict pointer argument' to a 'LLVM-IR noalias pointer argument'.
> For backwards compatibility, I kept the default mapping of restrict pointer arguments
> onto 'noalias' and provided the '-fno-noalias-arguments' option to disable this mapping.
> For some code, this can result in a wrong 'based on' deduction.[1]
>
> Given that, IMHO, it still makes sense to have a strong and a weaker version of the
> 'noalias argument attribute'. At least, the stronger (current 'noalias') version can be
> converted to the noalias scope/intrinsics mapping during inlining, keeping the strong guarantees.
> Converting a weaker version likely will need some more tweaking.
>
> The noalias attribute is also used for a struct pointer argument when a function returns a struct.
Interesting. I guess we would not keep it for restrict if it is too strong, but
I believe there are other uses where the guarantees are useful.
Given that full restrict will make the `__restrict` problem go away, let's look
at the deduction one.
What if we make `nosync` a value/pointer attribute as well and then have:

   `noalias`
   Does not alias other pointers in scope, but synchronizing events might still
   change the value because other threads might have the same "expression".
   That is, we declare the deductions as correct by weakening `noalias`.

   `nosync`
   The value is not modified in this scope by another thread.

   `noalias` + `nosync`
   Matches the `__restrict` guarantee that nothing not based on the pointer can
   modify it, so this is "stronger" than synchronization events and you can
   forward over fences/barriers.
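
A minimal C sketch of what the `noalias` + `nosync` combination would permit,
assuming both end up attached to a `restrict` argument as proposed here. The
function name `sum_across_fence` is hypothetical, and I am not claiming any
compiler forwards this load today; the point is only that forwarding would be
legal:

```c
#include <stdatomic.h>

/* Hypothetical example: `p` is restrict-qualified, so under the proposal it
   would carry both `noalias` and `nosync`. */
int sum_across_fence(int *restrict p) {
    int first = *p;                            /* initial load */
    atomic_thread_fence(memory_order_seq_cst); /* synchronizing event */
    int second = *p; /* could be forwarded from `first`: nothing that is not
                        based on `p` may modify *p in this scope */
    return first + second;
}
```

With only the weakened `noalias` (no `nosync`), the second load would have to
stay, since another thread could have changed the value across the fence.
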
WDYT?
~ Johannes
>
> Greetings,
>
> Jeroen Dobbelaere
> [0] Full Restrict patches: https://reviews.llvm.org/D68484
> [1] 'clang/test/CodeGen/restrict/arg_reuse.c' testcase in: https://reviews.llvm.org/D68521
>
>> -----Original Message-----
>> From: Johannes Doerfert <johannesdoerfert at gmail.com>
>> Sent: Wednesday, February 3, 2021 10:52 AM
>> To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>; Saito, Hideki
>> <hideki.saito at intel.com>; Kaylor, Andrew <andrew.kaylor at intel.com>
>> Cc: llvm-dev at lists.llvm.org
>> Subject: Re: [llvm-dev] Memory barrier problem
>>
>>
>> On 2/3/21 12:44 PM, Jeroen Dobbelaere wrote:
>>>>>> W.r.t. restrict, I'd like to hear more from the language lawyers on
>>>>>> their original intent when the language construct was born and the
>>>>>> current interpretation of it in the presence of threading.
>>>>> I would have assumed `__restrict` predates "common" multi-processing in C.
>>>>> Since the language of restrict is to this day implying other threads
>>>>> cannot access those pointers, I would not dare to argue we should
>>>>> weaken it in order to deduce `noalias`.
>>>>> ~ Johannes
>>>>>
>>> Having interacted recently with wg14 to get a better understanding
>>> about some of the corner cases around restrict, I can add the following:
>>>
>>> One way to look at a restrict pointer[1] is as if you get a local array.
>>> That means that the following code:
>>>
>>>     void foo_a(int *restrict rpDest, int *restrict rpSrc, int n) {
>>>        for (int i=0; i<n; ++i)
>>>          rpDest[i] = rpSrc[i]+1;
>>>     }
>>>
>>> is allowed to behave as if it was written as follows:
>>>     void foo_b(int *pDest, int *pSrc, int n) {
>>>       int localDest[n];
>>>       int localSrc[n];
>>>       memcpy(&localDest[0], pDest, n*sizeof(int));
>>>       memcpy(&localSrc[0], pSrc, n*sizeof(int));
>>>       for (int i=0; i<n; ++i)
>>>          localDest[i] = localSrc[i]+1;
>>>       memcpy(pDest, &localDest[0], n*sizeof(int));
>>>     }
>>>
>>> Calling foo_a and foo_b with overlapping arrays can show different
>>> results, depending on how the loop was optimized. That is an
>>> indication that this usage of 'foo_a' is triggering undefined behavior and
>>> should not be done.
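
To make the difference concrete, here is a small sketch. It does not call
`foo_a` with overlapping arrays (that would be UB); instead it compares the
copy-in/copy-out model of `foo_b` against a hypothetical `foo_seq`, which is
`foo_a` without `restrict`, executed strictly in order:

```c
#include <string.h>

/* Hypothetical: foo_a without restrict, so overlap is well defined. */
void foo_seq(int *pDest, int *pSrc, int n) {
    for (int i = 0; i < n; ++i)
        pDest[i] = pSrc[i] + 1;
}

/* The copy-in/copy-out model (n must be > 0 because of the VLAs). */
void foo_b(int *pDest, int *pSrc, int n) {
    int localDest[n];
    int localSrc[n];
    memcpy(&localDest[0], pDest, n * sizeof(int));
    memcpy(&localSrc[0], pSrc, n * sizeof(int));
    for (int i = 0; i < n; ++i)
        localDest[i] = localSrc[i] + 1;
    memcpy(pDest, &localDest[0], n * sizeof(int));
}
```

With `int buf[5] = {0}` and the arguments `(buf + 1, buf, 4)`, `foo_seq` leaves
{0, 1, 2, 3, 4} in `buf`, while `foo_b` leaves {0, 1, 1, 1, 1}. A compiler is
free to pick either behavior for `foo_a`.
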
>>
>> The way I interpret this is consistent with Eli's opinion and what we
>> basically do so far: restrict is stronger than synchronization, since the
>> local arrays are not synchronized across threads. If two threads access the
>> same memory (even well synchronized), it breaks the restrict requirement and
>> is therefore UB.
>>
>> So a weaker `noalias` or a way to mark uses seems therefore required for
>> `noalias` deduction.
>>
>> ~ Johannes
>>
>>
>>> Wrt threading: as long as the restrict pointer (rpDest, rpSrc;
>>> localDest, localSrc) is not escaping, a different thread should not be able
>>> to access the memory, as there is no way it can get a pointer 'based on'
>>> the restrict pointer.
>>>
>>> Note [1]: things get more interesting when having a 'pointer to a restrict
>>> pointer' (aka int *restrict *prp).
>>>
>>> Greetings,
>>>
>>> Jeroen Dobbelaere
>>>
    
    