PATCHES: R600/SI: mubuf addressing improvements

Matt Arsenault Matthew.Arsenault at amd.com
Thu Jan 29 17:24:54 PST 2015


On 01/29/2015 04:59 PM, Tom Stellard wrote:
> Hi,
>
> The attached patches optimize buffer addressing for mubuf instructions.
> The main change is that we now use the soffset field for
> non-inline immediate offsets.
>
> -Tom
>
> 0001-R600-SI-Add-soffset-operand-to-mubuf-addr64-instruct.patch
>
>
>  From 4dc4d5e525ae9de0a14c28fd09291daed21b2411 Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Tue, 6 Jan 2015 17:06:27 -0500
> Subject: [PATCH 1/4] R600/SI: Add soffset operand to mubuf addr64 instruction
>
> We were previously hard-coding soffset to 0.
> ---
>   lib/Target/R600/AMDGPUISelDAGToDAG.cpp | 15 +++++++-------
>   lib/Target/R600/SIISelLowering.cpp     |  1 +
>   lib/Target/R600/SIInstrInfo.cpp        |  5 ++---
>   lib/Target/R600/SIInstrInfo.td         | 36 +++++++++++++++++++---------------
>   lib/Target/R600/SIInstructions.td      |  4 ++--
>   5 files changed, 33 insertions(+), 28 deletions(-)
>
LGTM
>
> 0002-R600-SI-Store-immediate-offsets-12-bits-in-soffset.patch
>
>
>  From 701ba28cd225ce73e0baa1c9d9e8ada1c223102c Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Thu, 15 Jan 2015 11:06:58 -0500
> Subject: [PATCH 2/4] R600/SI: Store immediate offsets > 12-bits in soffset
>
> This will save us from having to extend these offsets to 64-bits
> and storing them in a pair of vgprs.
> ---
>   lib/Target/R600/AMDGPUISelDAGToDAG.cpp | 34 ++++++++++++++++++++--------------
>   test/CodeGen/R600/mubuf.ll             |  9 ++++++---
>   2 files changed, 26 insertions(+), 17 deletions(-)
>

> -      return;
> +    }
> +
> +    if (isLegalMUBUFImmOffset(C1)) {
> +        Offset = CurDAG->getTargetConstant(C1->getZExtValue(), MVT::i16);
> +        return;
> +    } else if (isUInt<32>(C1->getZExtValue())) {
> +        // Illegal offset, store it in soffset.
> +        Offset = CurDAG->getTargetConstant(0, MVT::i16);
> +        SOffset = SDValue(CurDAG->getMachineNode(AMDGPU::S_MOV_B32, DL, MVT::i32,
> +                    CurDAG->getTargetConstant(C1->getZExtValue(), MVT::i32)),0);
> +        return;
>       }

LGTM, except this looks like this isn't 2 space indent

>
> 0003-R600-SI-Split-large-MUBUF-immediate-offsets-so-part-.patch
>
>
>  From 27603c79ab1259306ad4df7f08bdd83330d78590 Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Wed, 14 Jan 2015 16:21:56 -0500
> Subject: [PATCH 3/4] R600/SI: Split large MUBUF immediate offsets so part can
>   be stored in the offset field
>
> For example, if we have two loads that use offsets of 4096 and 5000,
> we can store 4096 in the vaddr register for both and then store the rest
> of the offset in the offset field.  For example:
>
>    v_movk_i32 s4, 0x1000
>    buffer_load_dword v2, s[0:3], s4
>    buffer_load_dword v0, s[0:3], s4 offset:4
>
> This reduces code size by eliminating an extra immediate move and it
> also improves load/store clustering by making it easier to detect that
> these two loads have a common base:
> ---
>   lib/Target/R600/AMDGPUISelDAGToDAG.cpp | 32 +++++++++++++++++++++++++-------
>   test/CodeGen/R600/mubuf.ll             | 20 ++++++++++++++++++++
>   2 files changed, 45 insertions(+), 7 deletions(-)
>
LGTM

>       } else if (isUInt<32>(C1->getZExtValue())) {
> -        // Illegal offset, store it in soffset.
> -        Offset = CurDAG->getTargetConstant(0, MVT::i16);
> -        SOffset = SDValue(CurDAG->getMachineNode(AMDGPU::S_MOV_B32, DL, MVT::i32,
> -                    CurDAG->getTargetConstant(C1->getZExtValue(), MVT::i32)),0);
> -        return;
> +      unsigned LegalPart, IllegalPart;
> +      splitIllegalImmOffset(C1->getZExtValue(), getMUBUFImmOffsetWidth(),
> +                            LegalPart, IllegalPart);
> +      // Illegal offset, store it in soffset.
> +      Offset = CurDAG->getTargetConstant(LegalPart, MVT::i16);
> +      SOffset = SDValue(CurDAG->getMachineNode(AMDGPU::S_MOV_B32, DL, MVT::i32,
> +                  CurDAG->getTargetConstant(IllegalPart, MVT::i32)),0);
Missing space after ,
> +      return;
>       }
>     }
>   
> diff --git a/test/CodeGen/R600/mubuf.ll b/test/CodeGen/R600/mubuf.ll
> index c8a3105..3ad4ba4 100644
> --- a/test/CodeGen/R600/mubuf.ll
> +++ b/test/CodeGen/R600/mubuf.ll
> @@ -92,6 +92,26 @@ declare void @llvm.SI.tbuffer.store.i32(<16 x i8>, i32, i32, i32, i32, i32, i32,
>   attributes #1 = { "ShaderType"="2" "unsafe-fp-math"="true" }
>   attributes #3 = { nounwind readonly }
>   
> +; Check that multiple loads with immediate offsets that are too big still use
> +; the immediate field for part of the offset
> +
> +; CHECK-LABEL: {{^}}mubuf_multiple_max_offset:
> +; FIXME: The offset should be folded into VGPRs.
> +; CHECK: s_movk_i32 [[SOFFSET:s[0-9]+]], 0x1000
> +; CHECK: buffer_load_dword v{{[0-9]+}}, s[{{[0-9]+:[0-9]+}}], [[SOFFSET]] ;
> +; CHECK: buffer_load_dword v{{[0-9]+}}, s[{{[0-9]+:[0-9]+}}], [[SOFFSET]] offset:4 ;
> +
> +define void @mubuf_multiple_max_offset(i32 addrspace(1)* %out, i32 addrspace(1)* %in) {
> +entry:
> +  %0 = getelementptr i32 addrspace(1)* %in, i64 1024
> +  %1 = load i32 addrspace(1)* %0
> +  %2 = getelementptr i32 addrspace(1)* %in, i64 1025
> +  %3 = load i32 addrspace(1)* %2
> +  %4 = add i32 %1, %3
> +  store i32 %4, i32 addrspace(1)* %out
> +  ret void
> +}
> +
Test functions should go before the attribute groups

>
> 0004-R600-SI-Optimize-offsets-for-the-SI.buffer.load.dwor.patch
>
>
>  From ce11a756b35d6b45ac9d653509f056cf31e84845 Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Thu, 15 Jan 2015 12:00:21 -0500
> Subject: [PATCH 4/4] R600/SI: Optimize offsets for the SI.buffer.load.dword
>   intrinsics.
>
> In the cases when an immediate offset is stored in the intrinsic's
> soffset field, we can split off the legal part of the offset and
> store it in the MUBUF's offset field.
>
> This MUBUF instructions using similar addresses can use the same
> soffset value, which reduces register usage.
> ---
>   lib/Target/R600/AMDGPUISelDAGToDAG.cpp | 44 +++++++++++++++++++++-------------
>   lib/Target/R600/SIInstrInfo.td         |  1 +
>   lib/Target/R600/SIInstructions.td      |  9 +++++++
>   test/CodeGen/R600/mubuf.ll             | 23 ++++++++++++++----
>   4 files changed, 57 insertions(+), 20 deletions(-)
>

Should getLdStBaseRegImmOfs be updated to account for the immediate in 
soffset?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150129/63d1fe2f/attachment.html>


More information about the llvm-commits mailing list