[PATCH] AMDGPU: don't match vgpr loads for constant loads

Matt Arsenault arsenm2 at gmail.com
Fri Jul 24 15:27:02 PDT 2015


> On Jul 23, 2015, at 6:44 PM, Dave Airlie <airlied at redhat.com> wrote:
> 
> In order to implement indirect sampler loads, we don't
> want to match on a VGPR load but an SGPR one for constants,
> as we cannot feed VGPRs to the sampler only SGPRs.
> 
> this should be applicable for llvm 3.7 as well.
> ---
> lib/Target/AMDGPU/SIInstructions.td       |  3 ---
> test/CodeGen/AMDGPU/gv-const-addrspace.ll | 12 +++---------
> test/CodeGen/AMDGPU/smrd.ll               |  8 +-------
> 3 files changed, 4 insertions(+), 19 deletions(-)
> 
> diff --git a/lib/Target/AMDGPU/SIInstructions.td b/lib/Target/AMDGPU/SIInstructions.td
> index de8675e..8c281a2 100644
> --- a/lib/Target/AMDGPU/SIInstructions.td
> +++ b/lib/Target/AMDGPU/SIInstructions.td
> @@ -2910,9 +2910,6 @@ defm : MUBUFLoad_Pattern <BUFFER_LOAD_SBYTE_ADDR64, i32, sextloadi8_constant>;
> defm : MUBUFLoad_Pattern <BUFFER_LOAD_UBYTE_ADDR64, i32, az_extloadi8_constant>;
> defm : MUBUFLoad_Pattern <BUFFER_LOAD_SSHORT_ADDR64, i32, sextloadi16_constant>;
> defm : MUBUFLoad_Pattern <BUFFER_LOAD_USHORT_ADDR64, i32, az_extloadi16_constant>;
> -defm : MUBUFLoad_Pattern <BUFFER_LOAD_DWORD_ADDR64, i32, constant_load>;
> -defm : MUBUFLoad_Pattern <BUFFER_LOAD_DWORDX2_ADDR64, v2i32, constant_load>;
> -defm : MUBUFLoad_Pattern <BUFFER_LOAD_DWORDX4_ADDR64, v4i32, constant_load>;
> } // End Predicates = [isSICI]
> 
> class MUBUFScratchLoadPat <MUBUF Instr, ValueType vt, PatFrag ld> : Pat <
> diff --git a/test/CodeGen/AMDGPU/gv-const-addrspace.ll b/test/CodeGen/AMDGPU/gv-const-addrspace.ll
> index 3c1fc6c..d4d1312 100644
> --- a/test/CodeGen/AMDGPU/gv-const-addrspace.ll
> +++ b/test/CodeGen/AMDGPU/gv-const-addrspace.ll
> @@ -8,9 +8,7 @@
> @float_gv = internal unnamed_addr addrspace(2) constant [5 x float] [float 0.0, float 1.0, float 2.0, float 3.0, float 4.0], align 4
> 
> ; FUNC-LABEL: {{^}}float:
> -; FIXME: We should be using s_load_dword here.
> -; SI: buffer_load_dword
> -; VI: s_load_dword
> +; GCN: s_load_dword
> 
> ; EG-DAG: MOV {{\** *}}T2.X
> ; EG-DAG: MOV {{\** *}}T3.X
> @@ -31,9 +29,7 @@ entry:
> 
> ; FUNC-LABEL: {{^}}i32:
> 
> -; FIXME: We should be using s_load_dword here.
> -; SI: buffer_load_dword
> -; VI: s_load_dword
> +; GCN: s_load_dword

Can you add a couple of test cases that use non-uniform indexing? One that is indexed by llvm.r600.read.tidig.x() and another with an immediate offset off of that. These still need to work and use buffer_load_dword.
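
A rough sketch of what such tests might look like, assuming the @float_gv global already defined at the top of gv-const-addrspace.ll. The function names, the attribute group number, and the exact check lines are hypothetical and would need to be adjusted against real llc output:

```llvm
; Hypothetical test names; @float_gv is assumed to be the [5 x float]
; addrspace(2) global defined earlier in this test file.
declare i32 @llvm.r600.read.tidig.x() #1

; Index depends on the work-item id, so the address is non-uniform and
; cannot be kept in SGPRs.
; FUNC-LABEL: {{^}}float_tid_index:
; SI: buffer_load_dword
define void @float_tid_index(float addrspace(1)* %out) {
entry:
  %tid = call i32 @llvm.r600.read.tidig.x()
  %gep = getelementptr [5 x float], [5 x float] addrspace(2)* @float_gv, i32 0, i32 %tid
  %val = load float, float addrspace(2)* %gep, align 4
  store float %val, float addrspace(1)* %out, align 4
  ret void
}

; Same non-uniform index, plus a constant offset of one element.
; FUNC-LABEL: {{^}}float_tid_index_imm_offset:
; SI: buffer_load_dword
define void @float_tid_index_imm_offset(float addrspace(1)* %out) {
entry:
  %tid = call i32 @llvm.r600.read.tidig.x()
  %idx = add i32 %tid, 1
  %gep = getelementptr [5 x float], [5 x float] addrspace(2)* @float_gv, i32 0, i32 %idx
  %val = load float, float addrspace(2)* %gep, align 4
  store float %val, float addrspace(1)* %out, align 4
  ret void
}

attributes #1 = { nounwind readnone }
```

Only SI check lines are shown here; what VI should select for the non-uniform case is left open in this sketch.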


> 
> ; EG-DAG: MOV {{\** *}}T2.X
> ; EG-DAG: MOV {{\** *}}T3.X
> @@ -71,9 +67,7 @@ define void @struct_foo_gv_load(i32 addrspace(1)* %out, i32 %index) {
>                                                                 <1 x i32> <i32 4> ]
> 
> ; FUNC-LABEL: {{^}}array_v1_gv_load:
> -; FIXME: We should be using s_load_dword here.
> -; SI: buffer_load_dword
> -; VI: s_load_dword
> +; GCN: s_load_dword
> define void @array_v1_gv_load(<1 x i32> addrspace(1)* %out, i32 %index) {
>   %gep = getelementptr inbounds [4 x <1 x i32>], [4 x <1 x i32>] addrspace(2)* @array_v1_gv, i32 0, i32 %index
>   %load = load <1 x i32>, <1 x i32> addrspace(2)* %gep, align 4
> diff --git a/test/CodeGen/AMDGPU/smrd.ll b/test/CodeGen/AMDGPU/smrd.ll
> index b0c18ca..0598208 100644
> --- a/test/CodeGen/AMDGPU/smrd.ll
> +++ b/test/CodeGen/AMDGPU/smrd.ll
> @@ -43,13 +43,7 @@ entry:
> ; GCN-LABEL: {{^}}smrd3:
> ; FIXME: There are too many copies here because we don't fold immediates
> ;        through REG_SEQUENCE
> -; SI: s_mov_b32 s[[SLO:[0-9]+]], 0 ;
> -; SI: s_mov_b32 s[[SHI:[0-9]+]], 4
> -; SI: s_mov_b32 s[[SSLO:[0-9]+]], s[[SLO]]
> -; SI-DAG: v_mov_b32_e32 v[[VLO:[0-9]+]], s[[SSLO]]
> -; SI-DAG: v_mov_b32_e32 v[[VHI:[0-9]+]], s[[SHI]]
> -; FIXME: We should be able to use s_load_dword here
> -; SI: buffer_load_dword v{{[0-9]+}}, v{{\[}}[[VLO]]:[[VHI]]{{\]}}, s[{{[0-9]+:[0-9]+}}], 0 addr64
> +; SI: s_load_dwordx2 s[{{[0-9]:[0-9]}}], s[{{[0-9]:[0-9]}}], 0xb ; encoding: [0x0b
> ; TODO: Add VI checks
> ; GCN: s_endpgm
> define void @smrd3(i32 addrspace(1)* %out, i32 addrspace(2)* %ptr) {
> -- 
> 2.4.3
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits <http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits>