[llvm] [AMDGPU] Make max dwords of memory cluster configurable (PR #119342)
via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 16 19:23:38 PST 2024
================
@@ -559,33 +554,39 @@ bool SIInstrInfo::shouldClusterMemOps(ArrayRef<const MachineOperand *> BaseOps1,
                                       unsigned NumBytes) const {
   // If the mem ops (to be clustered) do not have the same base ptr, then they
   // should not be clustered
+  unsigned MaxMemoryClusterDWords = 8;
   if (!BaseOps1.empty() && !BaseOps2.empty()) {
     const MachineInstr &FirstLdSt = *BaseOps1.front()->getParent();
     const MachineInstr &SecondLdSt = *BaseOps2.front()->getParent();
     if (!memOpsHaveSameBasePtr(FirstLdSt, BaseOps1, SecondLdSt, BaseOps2))
       return false;
+
+    const SIMachineFunctionInfo *MFI =
+        FirstLdSt.getMF()->getInfo<SIMachineFunctionInfo>();
+    if (MFI->getMaxMemoryClusterDWords())
+      MaxMemoryClusterDWords = MFI->getMaxMemoryClusterDWords();
----------------
ruiling wrote:
It's about maintaining the default value `8`. Without an MFI, `shouldClusterMemOps()` still needs its own copy of the default value `8`, right? If we take `MFI->getMaxMemoryClusterDWords()` directly, we would need to keep the default value both in `MFI` and within `shouldClusterMemOps()`. Do we want that?
https://github.com/llvm/llvm-project/pull/119342
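
For readers following the thread, the trade-off in the comment above can be sketched outside of LLVM. This is only an illustration under stated assumptions: `FunctionInfoSketch`, `DefaultMaxMemoryClusterDWords`, and `effectiveMaxClusterDWords` are hypothetical names, not part of the patch or of LLVM. It assumes a getter result of `0` means "not configured", so the default lives in exactly one place and is used whenever there is no function info or no override.

```cpp
// Hypothetical stand-in for SIMachineFunctionInfo; not the real LLVM class.
struct FunctionInfoSketch {
  // Assumption: 0 means "no per-function override was set".
  unsigned MaxMemoryClusterDWords = 0;
  unsigned getMaxMemoryClusterDWords() const { return MaxMemoryClusterDWords; }
};

// The single copy of the default, kept next to the clustering decision.
constexpr unsigned DefaultMaxMemoryClusterDWords = 8;

// Returns the override when one is present and non-zero, else the default.
// Info may be null, which models the "without MFI" case from the comment.
unsigned effectiveMaxClusterDWords(const FunctionInfoSketch *Info) {
  if (Info && Info->getMaxMemoryClusterDWords())
    return Info->getMaxMemoryClusterDWords();
  return DefaultMaxMemoryClusterDWords;
}
```

With this shape, the function-info object never has to know the value `8`; the alternative (having the getter itself return `8` when unset) would require the same constant in both places.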