PATCHES: R600/SI: Enable machine scheduler
Matt Arsenault
Matthew.Arsenault at amd.com
Fri Dec 5 16:59:45 PST 2014
On 12/05/2014 07:38 PM, Tom Stellard wrote:
> On Tue, Dec 02, 2014 at 05:52:32PM -0500, Matt Arsenault wrote:
>> >
> Hi Matt,
>
> Here are some updated patches.
>
> -Tom
>
> 0001-R600-SI-Move-setting-of-the-lds-bit-to-the-base-MUBU.patch
>
>
> From 57edec3cb6061a6ec794c5db6937c30b8c669973 Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Thu, 4 Dec 2014 15:21:31 +0000
> Subject: [PATCH 1/6] R600/SI: Move setting of the lds bit to the base MUBUF
> class
>
> ---
> lib/Target/R600/SIInstrFormats.td | 2 ++
> lib/Target/R600/SIInstrInfo.td | 7 +++----
> 2 files changed, 5 insertions(+), 4 deletions(-)
LGTM
>
> 0002-R600-SI-Set-MayStore-0-on-MUBUF-loads.patch
>
>
> From 7401e81c0113ad5c2928ef6d84f26500f604a967 Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Mon, 24 Nov 2014 20:56:28 +0000
> Subject: [PATCH 2/6] R600/SI: Set MayStore = 0 on MUBUF loads
>
> ---
> lib/Target/R600/SIInstrInfo.td | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/Target/R600/SIInstrInfo.td b/lib/Target/R600/SIInstrInfo.td
> index 5e479bc..23c4781 100644
> --- a/lib/Target/R600/SIInstrInfo.td
> +++ b/lib/Target/R600/SIInstrInfo.td
> @@ -1191,7 +1191,7 @@ multiclass MUBUF_Load_Helper <bits<7> op, string asm, RegisterClass regClass,
> ValueType load_vt = i32,
> SDPatternOperator ld = null_frag> {
>
> - let mayLoad = 1 in {
> + let mayLoad = 1, mayStore = 0 in {
>
> let addr64 = 0 in {
>
> -- 2.0.4
LGTM
>
> 0003-R600-SI-Be-more-conservative-about-inserting-s_waitc.patch
>
>
> From c0f83f4b60d4cb5bf8998297e1401ce142a9386d Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Fri, 5 Dec 2014 22:10:28 +0000
> Subject: [PATCH 3/6] R600/SI: Be more conservative about inserting s_waitcnt
>
> Based on testing it seems that a ds_read instruction may
> overwrite the address of a previous ds_read. Here is an
> example:
>
> ds_read_b32 v18, v19
> ds_read_b32 v19, v20
>
> So we need to make sure we insert an s_waitcnt between these
> two instructions. This changes the SIInsertWait pass to
> handle this scenario for all instruction types. This is likely
> overly conservative, but more information is needed to implement
> this in an optimal way.
> ---
> lib/Target/R600/SIInsertWaits.cpp | 22 ++++++++++------------
> 1 file changed, 10 insertions(+), 12 deletions(-)
LGTM but I think the comment could be clearer.
>
> diff --git a/lib/Target/R600/SIInsertWaits.cpp b/lib/Target/R600/SIInsertWaits.cpp
> index 712d97d..5bdd8d7 100644
> --- a/lib/Target/R600/SIInsertWaits.cpp
> +++ b/lib/Target/R600/SIInsertWaits.cpp
> @@ -172,18 +172,16 @@ bool SIInsertWaits::isOpRelevant(MachineOperand &Op) {
> if (MI.getOpcode() == AMDGPU::EXP)
> return true;
>
> - // For stores the stored value is also relevant
> - if (!MI.getDesc().mayStore())
> - return false;
> -
> - for (MachineInstr::mop_iterator I = MI.operands_begin(),
> - E = MI.operands_end(); I != E; ++I) {
> -
> - if (I->isReg() && I->isUse())
> - return Op.isIdenticalTo(*I);
> - }
> -
> - return false;
> + // We used to not consider address operands as relevant. However testing
> + // has shown that a ds_read_b32 may overwrite the address of a previous
> + // ds_read_b32 instruction. Therefore, we need to consider address
> + // registers relevant.
To be more clear, how about:
However, testing has shown that a ds instruction may require its address
register to be preserved until the operation is completed. If the
address register is overwritten before the operation is complete, it may
not work correctly.
> + //
> + // FIXME: We are being conservative and doing this for all instruction
> + // types. Is this necessary.
Missing question mark
> + //
> + //
> + return Op.isReg() && Op.isUse();
> }
>
> RegInterval SIInsertWaits::getRegInterval(MachineOperand &Op) {
> -- 2.0.4
>
> 0004-R600-SI-Spill-VGPRs-to-scratch-space-for-compute-sha.patch
>
>
> From 218db2e2540f3391fc9af28948960e119bb6dcc8 Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Thu, 27 Nov 2014 21:13:48 +0000
> Subject: [PATCH 4/6] R600/SI: Spill VGPRs to scratch space for compute shaders
>
> ---
> lib/Target/R600/AMDGPU.h | 1 +
> lib/Target/R600/AMDGPUTargetMachine.cpp | 1 +
> lib/Target/R600/CMakeLists.txt | 1 +
> lib/Target/R600/SIInstrInfo.cpp | 26 ++--
> lib/Target/R600/SIInstrInfo.td | 2 +
> lib/Target/R600/SIInstructions.td | 51 ++++----
> lib/Target/R600/SIMachineFunctionInfo.cpp | 3 +-
> lib/Target/R600/SIMachineFunctionInfo.h | 3 +
> lib/Target/R600/SIPrepareScratchRegs.cpp | 196 ++++++++++++++++++++++++++++++
> lib/Target/R600/SIRegisterInfo.cpp | 156 ++++++++++++++----------
> lib/Target/R600/SIRegisterInfo.h | 9 +-
> 11 files changed, 354 insertions(+), 95 deletions(-)
> create mode 100644 lib/Target/R600/SIPrepareScratchRegs.cpp
LGTM
> +
> + for (MachineBasicBlock::iterator I = MBB.begin(), E = MBB.end();
> + I != E; ++I) {
> + MachineInstr &MI = *I;
> + DebugLoc DL = MI.getDebugLoc();
> + switch(MI.getOpcode()) {
> + default: break;;
> + case AMDGPU::SI_SPILL_V512_SAVE:
> + case AMDGPU::SI_SPILL_V256_SAVE:
> + case AMDGPU::SI_SPILL_V128_SAVE:
> + case AMDGPU::SI_SPILL_V96_SAVE:
> + case AMDGPU::SI_SPILL_V64_SAVE:
> + case AMDGPU::SI_SPILL_V32_SAVE:
> + case AMDGPU::SI_SPILL_V32_RESTORE:
> + case AMDGPU::SI_SPILL_V64_RESTORE:
> + case AMDGPU::SI_SPILL_V128_RESTORE:
> + case AMDGPU::SI_SPILL_V256_RESTORE:
> + case AMDGPU::SI_SPILL_V512_RESTORE:
> +
Maybe these should get an isSpillSave bit?
> 0006-R600-SI-Define-a-schedule-model-and-enable-the-gener.patch
>
>
> From 0375ddd7acffe50c650454766c04229877f42908 Mon Sep 17 00:00:00 2001
> From: Tom Stellard<thomas.stellard at amd.com>
> Date: Fri, 19 Jul 2013 11:50:00 -0700
> Subject: [PATCH 6/6] R600/SI: Define a schedule model and enable the generic
> machine scheduler
>
> The schedule model is not complete yet, and could be improved.
> ---
> lib/Target/R600/AMDGPUSubtarget.cpp | 21 ++++++-
> lib/Target/R600/AMDGPUSubtarget.h | 13 +++-
> lib/Target/R600/Processors.td | 22 +++----
> lib/Target/R600/SIInstrFormats.td | 10 ++-
> lib/Target/R600/SIInstructions.td | 48 +++++++++++++-
> lib/Target/R600/SIRegisterInfo.cpp | 55 +++++++++++++++-
> lib/Target/R600/SIRegisterInfo.h | 12 +++-
> lib/Target/R600/SISchedule.td | 80 +++++++++++++++++++++++-
> test/CodeGen/R600/atomic_cmp_swap_local.ll | 6 +-
> test/CodeGen/R600/ctpop.ll | 4 +-
> test/CodeGen/R600/ds_read2st64.ll | 4 +-
> test/CodeGen/R600/fceil64.ll | 4 +-
> test/CodeGen/R600/ffloor.ll | 4 +-
> test/CodeGen/R600/fmax3.ll | 6 +-
> test/CodeGen/R600/fmin3.ll | 6 +-
> test/CodeGen/R600/fneg-fabs.f64.ll | 2 +-
> test/CodeGen/R600/ftrunc.f64.ll | 4 +-
> test/CodeGen/R600/llvm.memcpy.ll | 34 +++++-----
> test/CodeGen/R600/local-atomics.ll | 4 +-
> test/CodeGen/R600/local-atomics64.ll | 2 +-
> test/CodeGen/R600/local-memory-two-objects.ll | 4 +-
> test/CodeGen/R600/si-triv-disjoint-mem-access.ll | 2 +-
> test/CodeGen/R600/smrd.ll | 10 +--
> test/CodeGen/R600/valu-i1.ll | 3 +-
> test/CodeGen/R600/wait.ll | 5 +-
> test/CodeGen/R600/xor.ll | 2 +-
> test/CodeGen/R600/zero_extend.ll | 2 +-
> 27 files changed, 290 insertions(+), 79 deletions(-)
LGTM
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141205/f67d4133/attachment.html>
More information about the llvm-commits
mailing list