[PATCH] D82393: [AMDGPU/MemOpsCluster] Implement new heuristic for computing max mem ops cluster size
Mahesha S via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 23 10:10:35 PDT 2020
hsmhsm created this revision.
hsmhsm added reviewers: foad, rampitec, arsenm, vpykhtin.
hsmhsm added a project: LLVM.
Use both (1) the number of clustered bytes and (2) the cluster length to decide
the maximum number of mem ops that can be clustered. When the loads are, on
average, dword-sized or smaller, use `5` as the maximum threshold; otherwise
use `4`. This heuristic is based purely on experimentation; there is no
analytical justification for these values.
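
As a rough illustration of the rule described above, the following standalone
sketch computes the threshold from the two inputs. It is not the code from the
patch: the function name `withinClusterLimit`, its parameters, and the use of
integer division for the average are assumptions made for this example only.

```cpp
#include <cassert>

// Hypothetical helper, not taken from SIInstrInfo.cpp: returns true when a
// cluster of `numLoads` mem ops covering `numBytes` bytes in total stays
// within the threshold described in the summary.
bool withinClusterLimit(unsigned numLoads, unsigned numBytes) {
  assert(numLoads > 0 && "a cluster contains at least one mem op");
  // Average width of one mem op in the cluster, in bytes. Integer division
  // rounds down; that rounding is an assumption of this sketch.
  const unsigned avgBytesPerLoad = numBytes / numLoads;
  // A dword is 4 bytes: allow up to 5 ops when the average is dword-sized or
  // smaller, otherwise allow up to 4.
  const unsigned maxNumLoads = (avgBytesPerLoad <= 4) ? 5 : 4;
  return numLoads <= maxNumLoads;
}
```

Under these assumptions, five dword loads (20 bytes in total) would be
accepted as one cluster, while five 8-byte loads (40 bytes in total) would not.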
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D82393
Files:
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.div.fmas.ll
llvm/test/CodeGen/AMDGPU/amdhsa-trap-num-sgprs.ll
llvm/test/CodeGen/AMDGPU/global-saddr.ll
llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll
llvm/test/CodeGen/AMDGPU/kernel-args.ll
llvm/test/CodeGen/AMDGPU/memory_clause.ll
llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
llvm/test/CodeGen/AMDGPU/salu-to-valu.ll
llvm/test/CodeGen/AMDGPU/sgpr-control-flow.ll
llvm/test/CodeGen/AMDGPU/shift-i128.ll
llvm/test/CodeGen/AMDGPU/store-weird-sizes.ll
llvm/test/CodeGen/AMDGPU/trunc-store-i64.ll
llvm/test/CodeGen/AMDGPU/udivrem.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D82393.272765.patch
Type: text/x-patch
Size: 56138 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200623/4b52a9e6/attachment.bin>