[PATCH] D84354: [AMDGPU/MemOpsCluster] Clean-up fixme's around mem ops clustering logic

Mahesha S via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 22 12:42:35 PDT 2020


hsmhsm created this revision.
hsmhsm added reviewers: foad, arsenm, cfang, rampitec, nhaehnle, hliao.
Herald added subscribers: llvm-commits, kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, wdng, jvesely, kzhuravl.
Herald added a project: LLVM.

This patch (1) stops using `num-clustered-mem-ops` in the heuristic,
(2) bases the heuristic purely on `num-clustered-bytes`, and (3) sets
max-clustered-bytes to 32 bytes. The main intuition behind this is as follows.
The existing heuristic can be roughly summarized as:

- Assume that all mem ops participating in the clustering process load/store the same number of bytes.
- If each mem op loads 4 bytes, cluster at most 5 mem ops, that is, at most 20 bytes.
- If each mem op loads 8 bytes, cluster at most 3 mem ops, that is, at most 24 bytes.
- If each mem op loads 16 bytes, cluster at most 2 mem ops, that is, at most 32 bytes.

So, by setting max-clustered-bytes to 32 bytes, we can safely approximate the
existing logic, without the possibility of any serious performance regressions.
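
To make the comparison concrete, here is a minimal sketch of the two rules as
summarized above. It is not the code in SIInstrInfo.cpp: the function names,
the `MaxClusteredBytes` constant, and the choice to model only the three
per-op sizes listed above are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical names throughout; this only models the summary above.

// Old rule: the cap on the number of clustered mem ops depends on how many
// bytes each op accesses. Only the sizes 4, 8 and 16 from the summary are
// modeled; any other size returns 0 (not covered by this sketch).
constexpr unsigned oldMaxClusteredOps(unsigned BytesPerOp) {
  return BytesPerOp == 4 ? 5 : BytesPerOp == 8 ? 3 : BytesPerOp == 16 ? 2 : 0;
}

constexpr bool oldShouldCluster(unsigned NumOps, unsigned BytesPerOp) {
  return NumOps <= oldMaxClusteredOps(BytesPerOp);
}

// Proposed rule: a single cap on the total number of clustered bytes.
constexpr unsigned MaxClusteredBytes = 32;

constexpr bool newShouldCluster(unsigned NumOps, unsigned BytesPerOp) {
  return NumOps * BytesPerOp <= MaxClusteredBytes;
}

// Checks that every cluster the old rule accepts (for the modeled sizes)
// is also accepted by the 32-byte cap.
void checkOldImpliesNew() {
  const unsigned Sizes[] = {4, 8, 16};
  for (unsigned Bytes : Sizes)
    for (unsigned N = 1; N <= oldMaxClusteredOps(Bytes); ++N)
      assert(newShouldCluster(N, Bytes));
}
```

Under this model the byte cap accepts everything the old rule accepted, and is
somewhat more permissive for small accesses: up to eight 4-byte ops or four
8-byte ops fit within 32 bytes, compared with five and three before.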


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D84354

Files:
  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.ubfe.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/load-constant.96.ll
  llvm/test/CodeGen/AMDGPU/amdhsa-trap-num-sgprs.ll
  llvm/test/CodeGen/AMDGPU/call-argument-types.ll
  llvm/test/CodeGen/AMDGPU/cttz_zero_undef.ll
  llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll
  llvm/test/CodeGen/AMDGPU/fshr.ll
  llvm/test/CodeGen/AMDGPU/global-saddr.ll
  llvm/test/CodeGen/AMDGPU/indirect-addressing-si.ll
  llvm/test/CodeGen/AMDGPU/kernel-args.ll
  llvm/test/CodeGen/AMDGPU/llvm.round.f64.ll
  llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
  llvm/test/CodeGen/AMDGPU/sgpr-control-flow.ll
  llvm/test/CodeGen/AMDGPU/store-weird-sizes.ll
  llvm/test/CodeGen/AMDGPU/udivrem.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D84354.279918.patch
Type: text/x-patch
Size: 106769 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200722/b0d5e9f7/attachment.bin>

