[all-commits] [llvm/llvm-project] 555d8f: [AMDGPU] Bundle loads before post-RA scheduler

Stanislav Mekhanoshin via All-commits all-commits at lists.llvm.org
Fri Jan 24 11:33:54 PST 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 555d8f4ef5ebb2cdce2636af5102ff944da5fef8
      https://github.com/llvm/llvm-project/commit/555d8f4ef5ebb2cdce2636af5102ff944da5fef8
  Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
  Date:   2020-01-24 (Fri, 24 Jan 2020)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPU.h
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
    M llvm/lib/Target/AMDGPU/CMakeLists.txt
    M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
    M llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
    A llvm/lib/Target/AMDGPU/SIPostRABundler.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.atomic.dec.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.atomic.inc.ll
    M llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
    M llvm/test/CodeGen/AMDGPU/byval-frame-setup.ll
    M llvm/test/CodeGen/AMDGPU/call-argument-types.ll
    M llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll
    M llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
    M llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll
    M llvm/test/CodeGen/AMDGPU/ds_write2st64.ll
    M llvm/test/CodeGen/AMDGPU/global-saddr.ll
    M llvm/test/CodeGen/AMDGPU/half.ll
    M llvm/test/CodeGen/AMDGPU/idot2.ll
    M llvm/test/CodeGen/AMDGPU/idot4s.ll
    M llvm/test/CodeGen/AMDGPU/idot4u.ll
    M llvm/test/CodeGen/AMDGPU/idot8s.ll
    M llvm/test/CodeGen/AMDGPU/idot8u.ll
    M llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll
    M llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll
    M llvm/test/CodeGen/AMDGPU/llvm.maxnum.f16.ll
    M llvm/test/CodeGen/AMDGPU/llvm.minnum.f16.ll
    M llvm/test/CodeGen/AMDGPU/llvm.round.f64.ll
    M llvm/test/CodeGen/AMDGPU/load-lo16.ll
    M llvm/test/CodeGen/AMDGPU/local-memory.amdgcn.ll
    M llvm/test/CodeGen/AMDGPU/lshr.v2i16.ll
    M llvm/test/CodeGen/AMDGPU/max.i16.ll
    M llvm/test/CodeGen/AMDGPU/memory-legalizer-load.ll
    M llvm/test/CodeGen/AMDGPU/memory_clause.ll
    M llvm/test/CodeGen/AMDGPU/merge-store-crash.ll
    A llvm/test/CodeGen/AMDGPU/postra-bundle-memops.mir
    M llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
    M llvm/test/CodeGen/AMDGPU/saddo.ll
    M llvm/test/CodeGen/AMDGPU/salu-to-valu.ll
    M llvm/test/CodeGen/AMDGPU/scratch-simple.ll
    M llvm/test/CodeGen/AMDGPU/select.f16.ll
    M llvm/test/CodeGen/AMDGPU/shl.ll
    M llvm/test/CodeGen/AMDGPU/shl.v2i16.ll
    M llvm/test/CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll
    M llvm/test/CodeGen/AMDGPU/sign_extend.ll
    M llvm/test/CodeGen/AMDGPU/sminmax.v2i16.ll
    M llvm/test/CodeGen/AMDGPU/store-weird-sizes.ll
    M llvm/test/CodeGen/AMDGPU/v_mac_f16.ll
    M llvm/test/CodeGen/AMDGPU/v_madak_f16.ll

  Log Message:
  -----------
  [AMDGPU] Bundle loads before post-RA scheduler

We are relying on atrificial DAG edges inserted by the
MemOpClusterMutation to keep loads and stores together in the
post-RA scheduler. This does not work all the time since it
allows to schedule a completely independent instruction in the
middle of the cluster.

Removed the DAG mutation and added pass to bundle already
clustered instructions. These bundles are unpacked before the
memory legalizer because it does not work with bundles but also
because it allows to insert waitcounts in the middle of a store
cluster.

Removing artificial edges also allows a more relaxed scheduling.

Differential Revision: https://reviews.llvm.org/D72737




More information about the All-commits mailing list