[PATCH] D84354: [AMDGPU/MemOpsCluster] Clean-up fixme's around mem ops clustering logic
Mahesha S via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 22 12:42:35 PDT 2020
hsmhsm created this revision.
hsmhsm added reviewers: foad, arsenm, cfang, rampitec, nhaehnle, hliao.
Herald added subscribers: llvm-commits, kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, wdng, jvesely, kzhuravl.
Herald added a project: LLVM.
(1) Stop using `num-clustered-mem-ops` within the heuristic,
(2) base the heuristic purely on `num-clustered-bytes`, and (3) set max-clustered-bytes
to 32 bytes. The main intuition behind this is as follows. The existing
heuristic can be roughly summarized as below:
- Assume that all the mem ops participating in the clustering process load/store the same number of bytes
- If each mem op loads 4 bytes, then cluster at most 5 mem ops, that is, at most 20 bytes
- If each mem op loads 8 bytes, then cluster at most 3 mem ops, that is, at most 24 bytes
- If each mem op loads 16 bytes, then cluster at most 2 mem ops, that is, at most 32 bytes
So, by setting max-clustered-bytes to 32 bytes (the largest of the totals above),
we can safely approximate the existing logic without expecting any serious
performance regressions.
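To make the comparison concrete, here is a minimal standalone sketch of the two rules as summarized above. It is not the code in SIInstrInfo.cpp: the function names `oldMaxClusteredOps`, `shouldClusterByBytes` and `newMaxClusteredOps` are hypothetical, and the old rule is reduced to the three cases listed in the summary.

```cpp
// Illustrative only: hypothetical helpers, not the actual LLVM implementation.

// Existing heuristic as summarized above: the cap on the number of clustered
// mem ops depends on how many bytes each op loads/stores.
constexpr unsigned oldMaxClusteredOps(unsigned BytesPerOp) {
  switch (BytesPerOp) {
  case 4:
    return 5; // at most 20 bytes
  case 8:
    return 3; // at most 24 bytes
  case 16:
    return 2; // at most 32 bytes
  default:
    return 0; // sizes not covered by the summary
  }
}

// Proposed heuristic: a single cap on the total number of clustered bytes.
constexpr unsigned MaxClusteredBytes = 32;

constexpr bool shouldClusterByBytes(unsigned NumClusteredBytes) {
  return NumClusteredBytes <= MaxClusteredBytes;
}

// Largest number of equally sized ops that the byte-based rule would accept.
constexpr unsigned newMaxClusteredOps(unsigned BytesPerOp) {
  if (BytesPerOp == 0)
    return 0; // guard: avoid looping forever on a zero-sized op
  unsigned Ops = 0;
  while (shouldClusterByBytes((Ops + 1) * BytesPerOp))
    ++Ops;
  return Ops;
}
```

Under these assumptions the byte-based rule matches the old cap for 16-byte ops (2 ops) and is somewhat more permissive for smaller ops: up to 8 ops of 4 bytes and up to 4 ops of 8 bytes, instead of 5 and 3.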
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D84354
Files:
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.ubfe.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/load-constant.96.ll
llvm/test/CodeGen/AMDGPU/amdhsa-trap-num-sgprs.ll
llvm/test/CodeGen/AMDGPU/call-argument-types.ll
llvm/test/CodeGen/AMDGPU/cttz_zero_undef.ll
llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll
llvm/test/CodeGen/AMDGPU/fshr.ll
llvm/test/CodeGen/AMDGPU/global-saddr.ll
llvm/test/CodeGen/AMDGPU/indirect-addressing-si.ll
llvm/test/CodeGen/AMDGPU/kernel-args.ll
llvm/test/CodeGen/AMDGPU/llvm.round.f64.ll
llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
llvm/test/CodeGen/AMDGPU/sgpr-control-flow.ll
llvm/test/CodeGen/AMDGPU/store-weird-sizes.ll
llvm/test/CodeGen/AMDGPU/udivrem.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D84354.279918.patch
Type: text/x-patch
Size: 106769 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200722/b0d5e9f7/attachment.bin>