[all-commits] [llvm/llvm-project] 33fd4a: [AMDGPU/MemOpsCluster] Clean-up fixme's around mem...
Mahesha S via All-commits
all-commits at lists.llvm.org
Thu Jul 30 09:11:57 PDT 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: 33fd4a18e7d373344c8af0012dd97c1c739f2916
https://github.com/llvm/llvm-project/commit/33fd4a18e7d373344c8af0012dd97c1c739f2916
Author: hsmahesha <mahesha.comp at gmail.com>
Date: 2020-07-30 (Thu, 30 Jul 2020)
Changed paths:
M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.ubfe.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/load-constant.96.ll
M llvm/test/CodeGen/AMDGPU/amdhsa-trap-num-sgprs.ll
M llvm/test/CodeGen/AMDGPU/call-argument-types.ll
M llvm/test/CodeGen/AMDGPU/cttz_zero_undef.ll
M llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll
M llvm/test/CodeGen/AMDGPU/fshr.ll
M llvm/test/CodeGen/AMDGPU/indirect-addressing-si.ll
M llvm/test/CodeGen/AMDGPU/kernel-args.ll
M llvm/test/CodeGen/AMDGPU/llvm.round.f64.ll
M llvm/test/CodeGen/AMDGPU/merge-stores.ll
M llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
M llvm/test/CodeGen/AMDGPU/store-weird-sizes.ll
M llvm/test/CodeGen/AMDGPU/token-factor-inline-limit-test.ll
M llvm/test/CodeGen/AMDGPU/udivrem.ll
Log Message:
-----------
[AMDGPU/MemOpsCluster] Clean-up fixme's around mem ops clustering logic
Get rid of all fixmes and base heuristic on `num-clustered-dwords`. The main intuition behind this is as
follows. The existing heuristic roughly summarizes as below:
* Assume, all the mem ops instructions participating in the clustering process, loads/stores same num bytes
* If num bytes loaded by each mem op is 4 bytes, then cluster at max 5 mem ops, that is at max 20 bytes
* If num bytes loaded by each mem op is 8 bytes, then cluster at max 3 mem ops, that is at max 24 bytes
* If num bytes loaded by each mem op is 16 bytes, then cluster at max 2 mem ops, that is at max 32 bytes
So, we need to make sure that the new heuristic do not completey deviate away from the above one, and it
properly handles both the sub-word loads and the wide loads.
Reviewed By: arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D84354
More information about the All-commits
mailing list