[llvm] r371671 - AMDGPU: Move m0 initializations earlier

Galina Kistanova via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 12 11:11:30 PDT 2019


Hello Austin,

It looks like your commit broke more tests on this builder:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/19619
. . .
Failing Tests (71):
    LLVM :: CodeGen/AMDGPU/amdgpu.private-memory.ll
    LLVM :: CodeGen/AMDGPU/array-ptr-calc-i32.ll
    LLVM :: CodeGen/AMDGPU/atomic_load_local.ll
    LLVM :: CodeGen/AMDGPU/atomic_store_local.ll
    LLVM :: CodeGen/AMDGPU/captured-frame-index.ll
    LLVM :: CodeGen/AMDGPU/cf-loop-on-constant.ll
    LLVM :: CodeGen/AMDGPU/cgp-addressing-modes.ll
    LLVM :: CodeGen/AMDGPU/drop-mem-operand-move-smrd.ll
    LLVM :: CodeGen/AMDGPU/ds-combine-large-stride.ll
    LLVM :: CodeGen/AMDGPU/ds-sub-offset.ll
    LLVM :: CodeGen/AMDGPU/ds_read2.ll
    LLVM :: CodeGen/AMDGPU/ds_read2_superreg.ll
    LLVM :: CodeGen/AMDGPU/ds_read2st64.ll
    LLVM :: CodeGen/AMDGPU/ds_write2.ll
    LLVM :: CodeGen/AMDGPU/ds_write2st64.ll
    LLVM :: CodeGen/AMDGPU/extload.ll
    LLVM :: CodeGen/AMDGPU/fence-barrier.ll
    LLVM :: CodeGen/AMDGPU/flat-for-global-subtarget-feature.ll
    LLVM :: CodeGen/AMDGPU/frame-index-elimination.ll
    LLVM :: CodeGen/AMDGPU/function-args.ll
    LLVM :: CodeGen/AMDGPU/function-returns.ll
    LLVM :: CodeGen/AMDGPU/gep-address-space.ll
    LLVM :: CodeGen/AMDGPU/hsa-group-segment.ll
    LLVM :: CodeGen/AMDGPU/indirect-addressing-si.ll
    LLVM :: CodeGen/AMDGPU/indirect-private-64.ll
    LLVM :: CodeGen/AMDGPU/insert-subvector-unused-scratch.ll
    LLVM :: CodeGen/AMDGPU/lds-alignment.ll
    LLVM :: CodeGen/AMDGPU/lds-bounds.ll
    LLVM :: CodeGen/AMDGPU/llvm.amdgcn.atomic.inc.ll
    LLVM :: CodeGen/AMDGPU/llvm.amdgcn.buffer.load.ll
    LLVM :: CodeGen/AMDGPU/llvm.amdgcn.ds.gws.barrier.ll
    LLVM :: CodeGen/AMDGPU/llvm.amdgcn.image.dim.ll
    LLVM :: CodeGen/AMDGPU/llvm.amdgcn.raw.buffer.load.ll
    LLVM :: CodeGen/AMDGPU/llvm.amdgcn.struct.buffer.load.ll
    LLVM :: CodeGen/AMDGPU/llvm.memcpy.ll
    LLVM :: CodeGen/AMDGPU/load-hi16.ll
    LLVM :: CodeGen/AMDGPU/load-lo16.ll
    LLVM :: CodeGen/AMDGPU/load-local-f32-no-ds128.ll
    LLVM :: CodeGen/AMDGPU/load-local-f32.ll
    LLVM :: CodeGen/AMDGPU/load-local-f64.ll
    LLVM :: CodeGen/AMDGPU/load-local-i1.ll
    LLVM :: CodeGen/AMDGPU/load-local-i16.ll
    LLVM :: CodeGen/AMDGPU/load-local-i32.ll
    LLVM :: CodeGen/AMDGPU/load-local-i64.ll
    LLVM :: CodeGen/AMDGPU/load-local-i8.ll
    LLVM :: CodeGen/AMDGPU/load-select-ptr.ll
    LLVM :: CodeGen/AMDGPU/local-64.ll
    LLVM :: CodeGen/AMDGPU/local-atomics-fp.ll
    LLVM :: CodeGen/AMDGPU/local-memory.amdgcn.ll
    LLVM :: CodeGen/AMDGPU/local-memory.ll
    LLVM :: CodeGen/AMDGPU/merge-m0.mir
    LLVM :: CodeGen/AMDGPU/merge-stores.ll
    LLVM :: CodeGen/AMDGPU/nested-loop-conditions.ll
    LLVM :: CodeGen/AMDGPU/phi-elimination-end-cf.mir
    LLVM :: CodeGen/AMDGPU/private-memory-atomics.ll
    LLVM :: CodeGen/AMDGPU/promote-alloca-globals.ll
    LLVM :: CodeGen/AMDGPU/promote-alloca-no-opts.ll
    LLVM :: CodeGen/AMDGPU/reduce-store-width-alignment.ll
    LLVM :: CodeGen/AMDGPU/reorder-stores.ll
    LLVM :: CodeGen/AMDGPU/schedule-ilp.ll
    LLVM :: CodeGen/AMDGPU/schedule-regpressure-limit.ll
    LLVM :: CodeGen/AMDGPU/schedule-regpressure-limit2.ll
    LLVM :: CodeGen/AMDGPU/schedule-regpressure-limit3.ll
    LLVM :: CodeGen/AMDGPU/shl_add_ptr.ll
    LLVM :: CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll
    LLVM :: CodeGen/AMDGPU/store-barrier.ll
    LLVM :: CodeGen/AMDGPU/store-local.ll
    LLVM :: CodeGen/AMDGPU/store-v3i64.ll
    LLVM :: CodeGen/AMDGPU/store-weird-sizes.ll
    LLVM :: CodeGen/AMDGPU/unaligned-load-store.ll
    LLVM :: CodeGen/AMDGPU/vectorize-global-local.ll
    . . .

The builder was already red, so it did not send a notification for these failures.
Please have a look ASAP.

Thanks

Galina

On Wed, Sep 11, 2019 at 2:26 PM Austin Kerbow via llvm-commits <llvm-commits at lists.llvm.org> wrote:

> Author: kerbowa
> Date: Wed Sep 11 14:28:41 2019
> New Revision: 371671
>
> URL: http://llvm.org/viewvc/llvm-project?rev=371671&view=rev
> Log:
> AMDGPU: Move m0 initializations earlier
>
> Summary:
> After hoisting and merging m0 initializations schedule them as early as
> possible in the MBB. This helps the scheduler avoid hazards in some
> cases.
>
> Reviewers: rampitec, arsenm
>
> Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr,
> t-tye, hiraditya, arphaman, llvm-commits
>
> Tags: #llvm
>
> Differential Revision: https://reviews.llvm.org/D67450
>
> Modified:
>     llvm/trunk/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
>     llvm/trunk/test/CodeGen/AMDGPU/frame-index-elimination.ll
>     llvm/trunk/test/CodeGen/AMDGPU/merge-m0.mir
>
> Modified: llvm/trunk/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/SIFixSGPRCopies.cpp?rev=371671&r1=371670&r2=371671&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/SIFixSGPRCopies.cpp (original)
> +++ llvm/trunk/lib/Target/AMDGPU/SIFixSGPRCopies.cpp Wed Sep 11 14:28:41 2019
> @@ -466,6 +466,7 @@ getFirstNonPrologue(MachineBasicBlock *M
>  // executioon.
>  static bool hoistAndMergeSGPRInits(unsigned Reg,
>                                     const MachineRegisterInfo &MRI,
> +                                   const TargetRegisterInfo *TRI,
>                                     MachineDominatorTree &MDT,
>                                     const TargetInstrInfo *TII) {
>    // List of inits by immediate value.
> @@ -480,7 +481,7 @@ static bool hoistAndMergeSGPRInits(unsig
>
>    for (auto &MI : MRI.def_instructions(Reg)) {
>      MachineOperand *Imm = nullptr;
> -    for (auto &MO: MI.operands()) {
> +    for (auto &MO : MI.operands()) {
>        if ((MO.isReg() && ((MO.isDef() && MO.getReg() != Reg) || !MO.isDef())) ||
>            (!MO.isImm() && !MO.isReg()) || (MO.isImm() && Imm)) {
>          Imm = nullptr;
> @@ -585,8 +586,41 @@ static bool hoistAndMergeSGPRInits(unsig
>      }
>    }
>
> -  for (auto MI : MergedInstrs)
> -    MI->removeFromParent();
> +  // Remove initializations that were merged into another.
> +  for (auto &Init : Inits) {
> +    auto &Defs = Init.second;
> +    for (auto I = Defs.begin(); I != Defs.end(); ++I)
> +      if (MergedInstrs.count(*I)) {
> +        (*I)->eraseFromParent();
> +        I = Defs.erase(I);
> +      }
> +  }
> +
> +  // Try to schedule SGPR initializations as early as possible in the MBB.
> +  for (auto &Init : Inits) {
> +    auto &Defs = Init.second;
> +    for (auto MI : Defs) {
> +      auto MBB = MI->getParent();
> +      MachineInstr &BoundaryMI = *getFirstNonPrologue(MBB, TII);
> +      MachineBasicBlock::reverse_iterator B(BoundaryMI);
> +      // Check if B should actually be a bondary. If not set the previous
> +      // instruction as the boundary instead.
> +      if (!TII->isBasicBlockPrologue(*B))
> +        B++;
> +
> +      auto R = std::next(MI->getReverseIterator());
> +      const unsigned Threshold = 50;
> +      // Search until B or Threashold for a place to insert the initialization.
> +      for (unsigned I = 0; R != B && I < Threshold; ++R, ++I)
> +        if (R->readsRegister(Reg, TRI) || R->definesRegister(Reg, TRI) ||
> +            TII->isSchedulingBoundary(*R, MBB, *MBB->getParent()))
> +          break;
> +
> +      // Move to directly after R.
> +      if (&*--R != MI)
> +        MBB->splice(*R, MBB, MI);
> +    }
> +  }
>
>    if (Changed)
>      MRI.clearKillFlags(Reg);
> @@ -755,7 +789,7 @@ bool SIFixSGPRCopies::runOnMachineFuncti
>    }
>
>    if (MF.getTarget().getOptLevel() > CodeGenOpt::None && EnableM0Merge)
> -    hoistAndMergeSGPRInits(AMDGPU::M0, MRI, *MDT, TII);
> +    hoistAndMergeSGPRInits(AMDGPU::M0, MRI, TRI, *MDT, TII);
>
>    return true;
>  }
>
> Modified: llvm/trunk/test/CodeGen/AMDGPU/frame-index-elimination.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/frame-index-elimination.ll?rev=371671&r1=371670&r2=371671&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/AMDGPU/frame-index-elimination.ll (original)
> +++ llvm/trunk/test/CodeGen/AMDGPU/frame-index-elimination.ll Wed Sep 11 14:28:41 2019
> @@ -26,12 +26,12 @@ define void @func_mov_fi_i32() #0 {
>
>  ; CI: s_sub_u32 [[SUB0:s[0-9]+|vcc_lo|vcc_hi]], s32, s33
>  ; CI-NEXT: s_sub_u32 [[SUB1:s[0-9]+|vcc_lo|vcc_hi]], s32, s33
> -; CI-NEXT: v_lshr_b32_e64 [[SCALED:v[0-9]+]], [[SUB1]], 6
> -; CI-NEXT: v_lshr_b32_e64 v0, [[SUB0]], 6
> -; CI-NEXT: v_add_i32_e64 v1, s{{\[[0-9]+:[0-9]+\]}}, 4, [[SCALED]]
> +; CI-DAG: v_lshr_b32_e64 v0, [[SUB0]], 6
> +; CI-DAG: v_lshr_b32_e64 [[SCALED:v[0-9]+]], [[SUB1]], 6
>  ; CI-NOT: v_mov
>  ; CI: ds_write_b32 v0, v0
> -; CI-NEXT: ds_write_b32 v0, v1
> +; CI-NEXT: v_add_i32_e64 v0, s{{\[[0-9]+:[0-9]+\]}}, 4, [[SCALED]]
> +; CI-NEXT: ds_write_b32 v0, v0
>
>  ; GFX9: s_sub_u32 [[SUB0:s[0-9]+|vcc_lo|vcc_hi]], s32, s33
>  ; GFX9-NEXT: s_sub_u32 [[SUB1:s[0-9]+|vcc_lo|vcc_hi]], s32, s33
>
> Modified: llvm/trunk/test/CodeGen/AMDGPU/merge-m0.mir
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/merge-m0.mir?rev=371671&r1=371670&r2=371671&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/AMDGPU/merge-m0.mir (original)
> +++ llvm/trunk/test/CodeGen/AMDGPU/merge-m0.mir Wed Sep 11 14:28:41 2019
> @@ -1,7 +1,10 @@
>  # RUN: llc -march=amdgcn -amdgpu-enable-merge-m0 -verify-machineinstrs -run-pass si-fix-sgpr-copies %s -o - | FileCheck -check-prefix=GCN %s
>
> +# GCN-LABEL: name: merge-m0-many-init
>  # GCN:    bb.0.entry:
>  # GCN:      SI_INIT_M0 -1
> +# GCN-NEXT: IMPLICIT_DEF
> +# GCN-NEXT: IMPLICIT_DEF
>  # GCN-NEXT: DS_WRITE_B32
>  # GCN-NEXT: DS_WRITE_B32
>  # GCN-NEXT: SI_INIT_M0 65536
> @@ -45,9 +48,8 @@
>  # GCN-NEXT: DS_WRITE_B32
>  # GCN-NEXT: SI_INIT_M0 -1
>  # GCN-NEXT: DS_WRITE_B32
> -
>  ---
> -name:            merge-m0-many-init
> +name: merge-m0-many-init
>  registers:
>    - { id: 0, class: vgpr_32 }
>    - { id: 1, class: vgpr_32 }
> @@ -124,22 +126,24 @@ body:             |
>
>  ...
>
> +# GCN-LABEL: name: merge-m0-dont-hoist-past-init-with-different-initializer
>  # GCN:    bb.0.entry:
>  # GCN:      SI_INIT_M0 65536
> +# GCN-NEXT: IMPLICIT_DEF
> +# GCN-NEXT: IMPLICIT_DEF
>  # GCN-NEXT: DS_WRITE_B32
>
> -#GCN:     bb.1:
> -#GCN-NOT:   SI_INIT_M0 65536
> -#GCN-NOT:   SI_INIT_M0 -1
> -
> -#GCN:     bb.2:
> -#GCN:       SI_INIT_M0 -1
> +# GCN:    bb.1:
> +# GCN-NOT:  SI_INIT_M0 65536
> +# GCN-NOT:  SI_INIT_M0 -1
>
> -#GCN:     bb.3:
> -#GCN:       SI_INIT_M0 -1
> +# GCN:    bb.2:
> +# GCN:      SI_INIT_M0 -1
>
> +# GCN:    bb.3:
> +# GCN:      SI_INIT_M0 -1
>  ---
> -name:            merge-m0-dont-hoist-past-init-with-different-initializer
> +name: merge-m0-dont-hoist-past-init-with-different-initializer
>  registers:
>    - { id: 0, class: vgpr_32 }
>    - { id: 1, class: vgpr_32 }
> @@ -179,19 +183,19 @@ body:             |
>      S_ENDPGM 0
>  ...
>
> +# GCN-LABEL: name: merge-m0-after-prologue
>  # GCN:    bb.0.entry:
>  # GCN-NOT:  SI_INIT_M0
>  # GCN:      S_OR_B64
>  # GCN-NEXT: SI_INIT_M0
>
> -#GCN:     bb.1:
> -#GCN-NOT:   SI_INIT_M0 -1
> -
> -#GCN:     bb.2:
> -#GCN-NOT:   SI_INIT_MO -1
> +# GCN:     bb.1:
> +# GCN-NOT:   SI_INIT_M0 -1
>
> +# GCN:     bb.2:
> +# GCN-NOT:   SI_INIT_MO -1
>  ---
> -name:            merge-m0-after-prologue
> +name: merge-m0-after-prologue
>  registers:
>    - { id: 0, class: vgpr_32 }
>    - { id: 1, class: vgpr_32 }
> @@ -223,3 +227,71 @@ body:             |
>    bb.3:
>      S_ENDPGM 0
>  ...
> +
> +# GCN-LABEL: name: move-m0-avoid-hazard
> +# GCN: $m0 = S_MOV_B32 -1
> +# GCN-NEXT: $vgpr0 = V_MOV_B32_e32 0, implicit $exec
> +# GCN-NEXT: DS_GWS_INIT $vgpr0, 0, 1, implicit $m0, implicit $exec
> +---
> +name: move-m0-avoid-hazard
> +body:             |
> +  bb.0:
> +    $vgpr0 = V_MOV_B32_e32 0, implicit $exec
> +    $m0 = S_MOV_B32 -1
> +    DS_GWS_INIT $vgpr0, 0, 1, implicit $m0, implicit $exec
> +...
> +
> +# GCN-LABEL: name: move-m0-with-prologue
> +# GCN $exec = S_OR_B64 $exec, killed $sgpr0_sgpr1, implicit-def $scc
> +# GCN: $m0 = S_MOV_B32 -1
> +# GCN-NEXT: $vgpr0 = V_MOV_B32_e32 0, implicit $exec
> +# GCN-NEXT: DS_GWS_INIT $vgpr0, 0, 1, implicit $m0, implicit $exec
> +---
> +name: move-m0-with-prologue
> +body:             |
> +  bb.0:
> +    liveins: $sgpr0_sgpr1
> +
> +    $exec = S_OR_B64 $exec, killed $sgpr0_sgpr1, implicit-def $scc
> +    $vgpr0 = V_MOV_B32_e32 0, implicit $exec
> +    $m0 = S_MOV_B32 -1
> +    DS_GWS_INIT $vgpr0, 0, 1, implicit $m0, implicit $exec
> +...
> +
> +# GCN-LABEL: name: move-m0-different-initializer
> +# GCN: SI_INIT_M0 -1
> +# GCN-NEXT: %0:vgpr_32 = IMPLICIT_DEF
> +# GCN: SI_INIT_M0 65536
> +# GCN-NEXT: S_NOP
> +---
> +name: move-m0-different-initializer
> +registers:
> +  - { id: 0, class: vgpr_32 }
> +  - { id: 1, class: vgpr_32 }
> +body:             |
> +  bb.0:
> +    %0 = IMPLICIT_DEF
> +    %1 = IMPLICIT_DEF
> +    SI_INIT_M0 -1, implicit-def $m0
> +    DS_WRITE_B32 %0, %1, 0, 0, implicit $m0, implicit $exec
> +    S_NOP 0
> +    SI_INIT_M0 65536, implicit-def $m0
> +    DS_WRITE_B32 %0, %1, 0, 0, implicit $m0, implicit $exec
> +...
> +
> +# GCN-LABEL: name: move-m0-schedule-boundary
> +# GCN: S_SETREG
> +# GCN-NEXT: SI_INIT_M0 -1
> +---
> +name: move-m0-schedule-boundary
> +registers:
> +  - { id: 0, class: vgpr_32 }
> +  - { id: 1, class: vgpr_32 }
> +body:             |
> +  bb.0:
> +    %0 = IMPLICIT_DEF
> +    %1 = IMPLICIT_DEF
> +    S_SETREG_IMM32_B32 0, 1
> +    SI_INIT_M0 -1, implicit-def $m0
> +    DS_WRITE_B32 %0, %1, 0, 0, implicit $m0, implicit $exec
> +...
>
>