[all-commits] [llvm/llvm-project] 555d8f: [AMDGPU] Bundle loads before post-RA scheduler
Stanislav Mekhanoshin via All-commits
all-commits at lists.llvm.org
Fri Jan 24 11:33:54 PST 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: 555d8f4ef5ebb2cdce2636af5102ff944da5fef8
https://github.com/llvm/llvm-project/commit/555d8f4ef5ebb2cdce2636af5102ff944da5fef8
Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
Date: 2020-01-24 (Fri, 24 Jan 2020)
Changed paths:
M llvm/lib/Target/AMDGPU/AMDGPU.h
M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
M llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
M llvm/lib/Target/AMDGPU/CMakeLists.txt
M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
M llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
A llvm/lib/Target/AMDGPU/SIPostRABundler.cpp
M llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.atomic.dec.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.atomic.inc.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
M llvm/test/CodeGen/AMDGPU/byval-frame-setup.ll
M llvm/test/CodeGen/AMDGPU/call-argument-types.ll
M llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll
M llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
M llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll
M llvm/test/CodeGen/AMDGPU/ds_write2st64.ll
M llvm/test/CodeGen/AMDGPU/global-saddr.ll
M llvm/test/CodeGen/AMDGPU/half.ll
M llvm/test/CodeGen/AMDGPU/idot2.ll
M llvm/test/CodeGen/AMDGPU/idot4s.ll
M llvm/test/CodeGen/AMDGPU/idot4u.ll
M llvm/test/CodeGen/AMDGPU/idot8s.ll
M llvm/test/CodeGen/AMDGPU/idot8u.ll
M llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll
M llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll
M llvm/test/CodeGen/AMDGPU/llvm.maxnum.f16.ll
M llvm/test/CodeGen/AMDGPU/llvm.minnum.f16.ll
M llvm/test/CodeGen/AMDGPU/llvm.round.f64.ll
M llvm/test/CodeGen/AMDGPU/load-lo16.ll
M llvm/test/CodeGen/AMDGPU/local-memory.amdgcn.ll
M llvm/test/CodeGen/AMDGPU/lshr.v2i16.ll
M llvm/test/CodeGen/AMDGPU/max.i16.ll
M llvm/test/CodeGen/AMDGPU/memory-legalizer-load.ll
M llvm/test/CodeGen/AMDGPU/memory_clause.ll
M llvm/test/CodeGen/AMDGPU/merge-store-crash.ll
A llvm/test/CodeGen/AMDGPU/postra-bundle-memops.mir
M llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
M llvm/test/CodeGen/AMDGPU/saddo.ll
M llvm/test/CodeGen/AMDGPU/salu-to-valu.ll
M llvm/test/CodeGen/AMDGPU/scratch-simple.ll
M llvm/test/CodeGen/AMDGPU/select.f16.ll
M llvm/test/CodeGen/AMDGPU/shl.ll
M llvm/test/CodeGen/AMDGPU/shl.v2i16.ll
M llvm/test/CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll
M llvm/test/CodeGen/AMDGPU/sign_extend.ll
M llvm/test/CodeGen/AMDGPU/sminmax.v2i16.ll
M llvm/test/CodeGen/AMDGPU/store-weird-sizes.ll
M llvm/test/CodeGen/AMDGPU/v_mac_f16.ll
M llvm/test/CodeGen/AMDGPU/v_madak_f16.ll
Log Message:
-----------
[AMDGPU] Bundle loads before post-RA scheduler
We are relying on atrificial DAG edges inserted by the
MemOpClusterMutation to keep loads and stores together in the
post-RA scheduler. This does not work all the time since it
allows to schedule a completely independent instruction in the
middle of the cluster.
Removed the DAG mutation and added pass to bundle already
clustered instructions. These bundles are unpacked before the
memory legalizer because it does not work with bundles but also
because it allows to insert waitcounts in the middle of a store
cluster.
Removing artificial edges also allows a more relaxed scheduling.
Differential Revision: https://reviews.llvm.org/D72737
More information about the All-commits
mailing list