PATCHES: R600/SI: Enable machine scheduler

Matt Arsenault Matthew.Arsenault at amd.com
Fri Dec 5 16:59:45 PST 2014


On 12/05/2014 07:38 PM, Tom Stellard wrote:
> On Tue, Dec 02, 2014 at 05:52:32PM -0500, Matt Arsenault wrote:
>> >
> Hi Matt,
>
> Here are some updated patches.
>
> -Tom
>
> 0001-R600-SI-Move-setting-of-the-lds-bit-to-the-base-MUBU.patch
>
>
>  From 57edec3cb6061a6ec794c5db6937c30b8c669973 Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Thu, 4 Dec 2014 15:21:31 +0000
> Subject: [PATCH 1/6] R600/SI: Move setting of the lds bit to the base MUBUF
>   class
>
> ---
>   lib/Target/R600/SIInstrFormats.td | 2 ++
>   lib/Target/R600/SIInstrInfo.td    | 7 +++----
>   2 files changed, 5 insertions(+), 4 deletions(-)
LGTM
>
> 0002-R600-SI-Set-MayStore-0-on-MUBUF-loads.patch
>
>
>  From 7401e81c0113ad5c2928ef6d84f26500f604a967 Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Mon, 24 Nov 2014 20:56:28 +0000
> Subject: [PATCH 2/6] R600/SI: Set MayStore = 0 on MUBUF loads
>
> ---
>   lib/Target/R600/SIInstrInfo.td | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/Target/R600/SIInstrInfo.td b/lib/Target/R600/SIInstrInfo.td
> index 5e479bc..23c4781 100644
> --- a/lib/Target/R600/SIInstrInfo.td
> +++ b/lib/Target/R600/SIInstrInfo.td
> @@ -1191,7 +1191,7 @@ multiclass MUBUF_Load_Helper <bits<7> op, string asm, RegisterClass regClass,
>                                 ValueType load_vt = i32,
>                                 SDPatternOperator ld = null_frag> {
>   
> -  let mayLoad = 1 in {
> +  let mayLoad = 1, mayStore = 0 in {
>   
>       let addr64 = 0 in {
>   
> -- 2.0.4

LGTM
>
> 0003-R600-SI-Be-more-conservative-about-inserting-s_waitc.patch
>
>
>  From c0f83f4b60d4cb5bf8998297e1401ce142a9386d Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Fri, 5 Dec 2014 22:10:28 +0000
> Subject: [PATCH 3/6] R600/SI: Be more conservative about inserting s_waitcnt
>
> Based on testing it seems that a ds_read instruction may
> overwrite the address of a previous ds_read.  Here is an
> example:
>
> ds_read_b32 v18, v19
> ds_read_b32 v19, v20
>
> So we need to make sure we insert an s_waitcnt between these
> two instructions.  This changes the SIInsertWait pass to
> handle this scenario for all instruction types.  This is likely
> overly conservative, but more information is needed to implement
> this in an optimal way.
> ---
>   lib/Target/R600/SIInsertWaits.cpp | 22 ++++++++++------------
>   1 file changed, 10 insertions(+), 12 deletions(-)

LGTM but I think the comment could be clearer.
>
> diff --git a/lib/Target/R600/SIInsertWaits.cpp b/lib/Target/R600/SIInsertWaits.cpp
> index 712d97d..5bdd8d7 100644
> --- a/lib/Target/R600/SIInsertWaits.cpp
> +++ b/lib/Target/R600/SIInsertWaits.cpp
> @@ -172,18 +172,16 @@ bool SIInsertWaits::isOpRelevant(MachineOperand &Op) {
>     if (MI.getOpcode() == AMDGPU::EXP)
>       return true;
>   
> -  // For stores the stored value is also relevant
> -  if (!MI.getDesc().mayStore())
> -    return false;
> -
> -  for (MachineInstr::mop_iterator I = MI.operands_begin(),
> -       E = MI.operands_end(); I != E; ++I) {
> -
> -    if (I->isReg() && I->isUse())
> -      return Op.isIdenticalTo(*I);
> -  }
> -
> -  return false;
> +  // We used to not consider address operands as relevant.  However testing
> +  // has shown that a ds_read_b32 may overwrite the address of a previous
> +  // ds_read_b32 instruction.  Therefore, we need to consider address
> +  // registers relevant.
To be more clear, how about:

However, testing has shown that a ds instruction may require its address 
register to be preserved until the operation is completed. If the 
address register is overwritten before the operation is complete, it may 
not work correctly.

> +  //
> +  // FIXME: We are being conservative and doing this for all instruction
> +  // types.  Is this necessary.
Missing question mark
> +  //
> +  //
> +  return Op.isReg() && Op.isUse();
>   }
>   
>   RegInterval SIInsertWaits::getRegInterval(MachineOperand &Op) {
> -- 2.0.4
>
> 0004-R600-SI-Spill-VGPRs-to-scratch-space-for-compute-sha.patch
>
>
>  From 218db2e2540f3391fc9af28948960e119bb6dcc8 Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Thu, 27 Nov 2014 21:13:48 +0000
> Subject: [PATCH 4/6] R600/SI: Spill VGPRs to scratch space for compute shaders
>
> ---
>   lib/Target/R600/AMDGPU.h                  |   1 +
>   lib/Target/R600/AMDGPUTargetMachine.cpp   |   1 +
>   lib/Target/R600/CMakeLists.txt            |   1 +
>   lib/Target/R600/SIInstrInfo.cpp           |  26 ++--
>   lib/Target/R600/SIInstrInfo.td            |   2 +
>   lib/Target/R600/SIInstructions.td         |  51 ++++----
>   lib/Target/R600/SIMachineFunctionInfo.cpp |   3 +-
>   lib/Target/R600/SIMachineFunctionInfo.h   |   3 +
>   lib/Target/R600/SIPrepareScratchRegs.cpp  | 196 ++++++++++++++++++++++++++++++
>   lib/Target/R600/SIRegisterInfo.cpp        | 156 ++++++++++++++----------
>   lib/Target/R600/SIRegisterInfo.h          |   9 +-
>   11 files changed, 354 insertions(+), 95 deletions(-)
>   create mode 100644 lib/Target/R600/SIPrepareScratchRegs.cpp

LGTM
> +
> +    for (MachineBasicBlock::iterator I = MBB.begin(), E = MBB.end();
> +         I != E; ++I) {
> +      MachineInstr &MI = *I;
> +      DebugLoc DL = MI.getDebugLoc();
> +      switch(MI.getOpcode()) {
> +        default: break;;
> +        case AMDGPU::SI_SPILL_V512_SAVE:
> +        case AMDGPU::SI_SPILL_V256_SAVE:
> +        case AMDGPU::SI_SPILL_V128_SAVE:
> +        case AMDGPU::SI_SPILL_V96_SAVE:
> +        case AMDGPU::SI_SPILL_V64_SAVE:
> +        case AMDGPU::SI_SPILL_V32_SAVE:
> +        case AMDGPU::SI_SPILL_V32_RESTORE:
> +        case AMDGPU::SI_SPILL_V64_RESTORE:
> +        case AMDGPU::SI_SPILL_V128_RESTORE:
> +        case AMDGPU::SI_SPILL_V256_RESTORE:
> +        case AMDGPU::SI_SPILL_V512_RESTORE:
> +

Maybe these should get an isSpillSave bit?

> 0006-R600-SI-Define-a-schedule-model-and-enable-the-gener.patch
>
>
>  From 0375ddd7acffe50c650454766c04229877f42908 Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Fri, 19 Jul 2013 11:50:00 -0700
> Subject: [PATCH 6/6] R600/SI: Define a schedule model and enable the generic
>   machine scheduler
>
> The schedule model is not complete yet, and could be improved.
> ---
>   lib/Target/R600/AMDGPUSubtarget.cpp              | 21 ++++++-
>   lib/Target/R600/AMDGPUSubtarget.h                | 13 +++-
>   lib/Target/R600/Processors.td                    | 22 +++----
>   lib/Target/R600/SIInstrFormats.td                | 10 ++-
>   lib/Target/R600/SIInstructions.td                | 48 +++++++++++++-
>   lib/Target/R600/SIRegisterInfo.cpp               | 55 +++++++++++++++-
>   lib/Target/R600/SIRegisterInfo.h                 | 12 +++-
>   lib/Target/R600/SISchedule.td                    | 80 +++++++++++++++++++++++-
>   test/CodeGen/R600/atomic_cmp_swap_local.ll       |  6 +-
>   test/CodeGen/R600/ctpop.ll                       |  4 +-
>   test/CodeGen/R600/ds_read2st64.ll                |  4 +-
>   test/CodeGen/R600/fceil64.ll                     |  4 +-
>   test/CodeGen/R600/ffloor.ll                      |  4 +-
>   test/CodeGen/R600/fmax3.ll                       |  6 +-
>   test/CodeGen/R600/fmin3.ll                       |  6 +-
>   test/CodeGen/R600/fneg-fabs.f64.ll               |  2 +-
>   test/CodeGen/R600/ftrunc.f64.ll                  |  4 +-
>   test/CodeGen/R600/llvm.memcpy.ll                 | 34 +++++-----
>   test/CodeGen/R600/local-atomics.ll               |  4 +-
>   test/CodeGen/R600/local-atomics64.ll             |  2 +-
>   test/CodeGen/R600/local-memory-two-objects.ll    |  4 +-
>   test/CodeGen/R600/si-triv-disjoint-mem-access.ll |  2 +-
>   test/CodeGen/R600/smrd.ll                        | 10 +--
>   test/CodeGen/R600/valu-i1.ll                     |  3 +-
>   test/CodeGen/R600/wait.ll                        |  5 +-
>   test/CodeGen/R600/xor.ll                         |  2 +-
>   test/CodeGen/R600/zero_extend.ll                 |  2 +-
>   27 files changed, 290 insertions(+), 79 deletions(-)

LGTM

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141205/f67d4133/attachment.html>


More information about the llvm-commits mailing list