[PATCH 3/8] R600/SI: Add global atomicrmw and

Matt Arsenault Matthew.Arsenault at amd.com
Tue Oct 14 20:37:00 PDT 2014


On 10/14/2014 08:23 PM, Aaron Watry wrote:
> You are correct that the tests are named as if they use offsets, but
> they don't (copy/paste propagation from the existing atomic add
> tests).
>
> I can:
> 1) Change the tests in this series (and the existing global atomicrmw
> add) to all use offsets
> 2) Remove _offset from the test names.
> 3) Clone all of the tests in this series so that everything has both
> an offset and a no-offset version, giving us 8 tests per atomicrmw
> operation (ret/no-ret * offset/no-offset * i32/i64 addressing).
>
> I can do any of the above; just let me know which you'd prefer, and
> I'll whip up a v2 of the patches in this series.
>
> --Aaron
I'm fine with either option 1 or option 3.
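
For option 1, I'd expect the offset variants to look roughly like the
following (only a sketch -- the 4-element gep and the exact printed
offset syntax are illustrative, not taken from the patch):

; FUNC-LABEL: {{^}}atomic_and_i32_offset:
; SI: BUFFER_ATOMIC_AND v{{[0-9]+}}, s[{{[0-9]+}}:{{[0-9]+}}], 0 offset:0x10{{$}}
define void @atomic_and_i32_offset(i32 addrspace(1)* %out, i32 %in) {
entry:
  %gep = getelementptr i32 addrspace(1)* %out, i64 4
  %0 = atomicrmw volatile and i32 addrspace(1)* %gep, i32 %in seq_cst
  ret void
}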


>
> On Wed, Oct 8, 2014 at 5:50 PM, Matt Arsenault
> <Matthew.Arsenault at amd.com> wrote:
>> On 10/08/2014 03:05 PM, Aaron Watry wrote:
>>> Signed-off-by: Aaron Watry <awatry at gmail.com>
>>> ---
>>>    lib/Target/R600/AMDGPUInstructions.td |  1 +
>>>    lib/Target/R600/SIInstructions.td     |  4 +++-
>>>    test/CodeGen/R600/global_atomics.ll   | 38 +++++++++++++++++++++++++++++++++++
>>>    3 files changed, 42 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/Target/R600/AMDGPUInstructions.td
>>> b/lib/Target/R600/AMDGPUInstructions.td
>>> index ef656b9..972ef1d 100644
>>> --- a/lib/Target/R600/AMDGPUInstructions.td
>>> +++ b/lib/Target/R600/AMDGPUInstructions.td
>>> @@ -387,6 +387,7 @@ class global_binary_atomic_op<SDNode atomic_op> : PatFrag<
>>>    >;
>>>      def atomic_add_global : global_binary_atomic_op<atomic_load_add>;
>>> +def atomic_and_global : global_binary_atomic_op<atomic_load_and>;
>>>    def atomic_sub_global : global_binary_atomic_op<atomic_load_sub>;
>>>
>>> //===----------------------------------------------------------------------===//
>>> diff --git a/lib/Target/R600/SIInstructions.td
>>> b/lib/Target/R600/SIInstructions.td
>>> index 0c8ad65..86dde46 100644
>>> --- a/lib/Target/R600/SIInstructions.td
>>> +++ b/lib/Target/R600/SIInstructions.td
>>> @@ -908,7 +908,9 @@ defm BUFFER_ATOMIC_SUB : MUBUF_Atomic <
>>>    //def BUFFER_ATOMIC_UMIN : MUBUF_ <0x00000036, "BUFFER_ATOMIC_UMIN", []>;
>>>    //def BUFFER_ATOMIC_SMAX : MUBUF_ <0x00000037, "BUFFER_ATOMIC_SMAX", []>;
>>>    //def BUFFER_ATOMIC_UMAX : MUBUF_ <0x00000038, "BUFFER_ATOMIC_UMAX", []>;
>>> -//def BUFFER_ATOMIC_AND : MUBUF_ <0x00000039, "BUFFER_ATOMIC_AND", []>;
>>> +defm BUFFER_ATOMIC_AND : MUBUF_Atomic <
>>> +  0x00000039, "BUFFER_ATOMIC_AND", VReg_32, i32, atomic_and_global
>>> +>;
>>>    //def BUFFER_ATOMIC_OR : MUBUF_ <0x0000003a, "BUFFER_ATOMIC_OR", []>;
>>>    //def BUFFER_ATOMIC_XOR : MUBUF_ <0x0000003b, "BUFFER_ATOMIC_XOR", []>;
>>>    //def BUFFER_ATOMIC_INC : MUBUF_ <0x0000003c, "BUFFER_ATOMIC_INC", []>;
>>> diff --git a/test/CodeGen/R600/global_atomics.ll
>>> b/test/CodeGen/R600/global_atomics.ll
>>> index a676109..09a039b 100644
>>> --- a/test/CodeGen/R600/global_atomics.ll
>>> +++ b/test/CodeGen/R600/global_atomics.ll
>>> @@ -38,6 +38,44 @@ entry:
>>>      ret void
>>>    }
>>>
>>> +; FUNC-LABEL: {{^}}atomic_and_i32_offset:
>>> +; SI: BUFFER_ATOMIC_AND v{{[0-9]+}}, s[{{[0-9]+}}:{{[0-9]+}}], 0{{$}}
>>> +define void @atomic_and_i32_offset(i32 addrspace(1)* %out, i32 %in) {
>>> +entry:
>>> +  %0  = atomicrmw volatile and i32 addrspace(1)* %out, i32 %in seq_cst
>>> +  ret void
>>> +}
>> These tests have offset in the name, but they aren't using an offset. The
>> local atomics tests have offset and no-offset versions. I suppose only one
>> version is really needed, although I would prefer that they use offsets to
>> make sure the constant-offset addressing mode folding also works for the
>> atomics. This also applies to all of the other tests in the series.
>>
>>
>>> +
>>> +; FUNC-LABEL: {{^}}atomic_and_i32_ret_offset:
>>> +; SI: BUFFER_ATOMIC_AND [[RET:v[0-9]+]], s[{{[0-9]+}}:{{[0-9]+}}], 0 glc
>>> +; SI: BUFFER_STORE_DWORD [[RET]]
>>> +define void @atomic_and_i32_ret_offset(i32 addrspace(1)* %out, i32 addrspace(1)* %out2, i32 %in) {
>>> +entry:
>>> +  %0  = atomicrmw volatile and i32 addrspace(1)* %out, i32 %in seq_cst
>>> +  store i32 %0, i32 addrspace(1)* %out2
>>> +  ret void
>>> +}
>>> +
>>> +; FUNC-LABEL: {{^}}atomic_and_i32_addr64:
>>> +; SI: BUFFER_ATOMIC_AND v{{[0-9]+}}, v[{{[0-9]+}}:{{[0-9]+}}], s[{{[0-9]+}}:{{[0-9]+}}], 0 addr64{{$}}
>>> +define void @atomic_and_i32_addr64(i32 addrspace(1)* %out, i32 %in, i64 %index) {
>>> +entry:
>>> +  %ptr = getelementptr i32 addrspace(1)* %out, i64 %index
>>> +  %0  = atomicrmw volatile and i32 addrspace(1)* %ptr, i32 %in seq_cst
>>> +  ret void
>>> +}
>>> +
>>> +; FUNC-LABEL: {{^}}atomic_and_i32_ret_addr64:
>>> +; SI: BUFFER_ATOMIC_AND [[RET:v[0-9]+]], v[{{[0-9]+}}:{{[0-9]+}}], s[{{[0-9]+}}:{{[0-9]+}}], 0 addr64 glc{{$}}
>>> +; SI: BUFFER_STORE_DWORD [[RET]]
>>> +define void @atomic_and_i32_ret_addr64(i32 addrspace(1)* %out, i32 addrspace(1)* %out2, i32 %in, i64 %index) {
>>> +entry:
>>> +  %ptr = getelementptr i32 addrspace(1)* %out, i64 %index
>>> +  %0  = atomicrmw volatile and i32 addrspace(1)* %ptr, i32 %in seq_cst
>>> +  store i32 %0, i32 addrspace(1)* %out2
>>> +  ret void
>>> +}
>>> +
>>>    ; FUNC-LABEL: {{^}}atomic_sub_i32_offset:
>>>    ; SI: BUFFER_ATOMIC_SUB v{{[0-9]+}}, s[{{[0-9]+}}:{{[0-9]+}}], 0{{$}}
>>>    define void @atomic_sub_i32_offset(i32 addrspace(1)* %out, i32 %in) {
>>
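
For the addr64 cases, the offset versions would just layer a constant
gep on top of the %index gep, along these lines (again only a sketch,
with an illustrative test name, offset, and printed offset syntax):

; FUNC-LABEL: {{^}}atomic_and_i32_addr64_offset:
; SI: BUFFER_ATOMIC_AND v{{[0-9]+}}, v[{{[0-9]+}}:{{[0-9]+}}], s[{{[0-9]+}}:{{[0-9]+}}], 0 addr64 offset:0x10{{$}}
define void @atomic_and_i32_addr64_offset(i32 addrspace(1)* %out, i32 %in, i64 %index) {
entry:
  %ptr = getelementptr i32 addrspace(1)* %out, i64 %index
  %gep = getelementptr i32 addrspace(1)* %ptr, i64 4
  %0 = atomicrmw volatile and i32 addrspace(1)* %gep, i32 %in seq_cst
  ret void
}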



