[all-commits] [llvm/llvm-project] 40a632: [AMDGPU/MemOpsCluster] Implement new heuristic for...

Tue Jun 9 01:39:50 PDT 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 40a632a335119fe3e8d5d500a5d2641998314ecb
      https://github.com/llvm/llvm-project/commit/40a632a335119fe3e8d5d500a5d2641998314ecb
  Author: hsmahesha <mahesha.comp at gmail.com>
  Date:   2020-06-09 (Tue, 09 Jun 2020)

  Changed paths:
    M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.div.fmas.ll
    M llvm/test/CodeGen/AMDGPU/amdhsa-trap-num-sgprs.ll
    M llvm/test/CodeGen/AMDGPU/global-saddr.ll
    M llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll
    M llvm/test/CodeGen/AMDGPU/kernel-args.ll
    M llvm/test/CodeGen/AMDGPU/memory_clause.ll
    M llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
    M llvm/test/CodeGen/AMDGPU/salu-to-valu.ll
    M llvm/test/CodeGen/AMDGPU/sgpr-control-flow.ll
    M llvm/test/CodeGen/AMDGPU/shift-i128.ll
    M llvm/test/CodeGen/AMDGPU/store-weird-sizes.ll
    M llvm/test/CodeGen/AMDGPU/trunc-store-i64.ll

  Log Message:
  -----------
  [AMDGPU/MemOpsCluster] Implement new heuristic for computing max mem ops cluster size

Summary:
Make use of both the - (1) clustered bytes and (2) cluster length, to decide on
the max number of mem ops that can be clustered. On an average, when loads
are dword or smaller, consider `5` as max threshold, otherwise `4`. This heuristic
is purely based on different experimentation conducted, and there is no analytical
logic here.

Reviewers: foad, rampitec, arsenm, vpykhtin

Reviewed By: foad, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, Anastasia, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81085