[PATCH] D23814: AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 23 14:45:36 PDT 2016
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:499
@@ -498,1 +498,3 @@
addPass(&DeadMachineInstructionElimID);
+ if (getOptLevel() > CodeGenOpt::None) {
+ // Don't do this with no optimizations since it throws away debug info by
----------------
This isn't needed here, since addMachineSSAOptimization isn't called at CodeGenOpt::None anyway.
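For reference, the generic pipeline logic this relies on looks roughly like the following (paraphrased from TargetPassConfig::addMachinePasses; surrounding details are elided and may differ between revisions):

  // The machine-SSA-optimization hook is only reached above
  // CodeGenOpt::None, so an override such as this one never runs at -O0
  // and does not need to re-check the optimization level itself.
  if (getOptLevel() != CodeGenOpt::None)
    addMachineSSAOptimization();   // the AMDGPU override is invoked here
  else {
    // The -O0 path skips machine SSA optimizations entirely.
  }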
================
Comment at: lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:547-552
@@ -541,13 +546,8 @@
if (getOptLevel() > CodeGenOpt::None) {
- // Don't do this with no optimizations since it throws away debug info by
- // merging nonadjacent loads.
-
- // This should be run after scheduling, but before register allocation. It
- // also need extra copies to the address operand to be eliminated.
-
- // FIXME: Move pre-RA and remove extra reg coalescer run.
- insertPass(&MachineSchedulerID, &SILoadStoreOptimizerID);
+ // FIXME: Remove extra reg coalescer run. This was required when
+ // we ran the SILoadStoreOptimizer here, but removing it blocks
+ // some of the optimizations in the SIShrinkInstructionsPass.
insertPass(&MachineSchedulerID, &RegisterCoalescerID);
}
----------------
The extra run should be removed now
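A sketch of what the hunk reduces to once that is done: dropping the insertPass leaves the CodeGenOpt::None guard empty, so the whole conditional disappears (function name assumed from context; anything else in the hook stays as it is):

  void GCNPassConfig::addPreRegAlloc() {
    // The extra RegisterCoalescer run after MachineScheduler was only
    // needed while SILoadStoreOptimizer ran post-scheduler; with the pass
    // moved ahead of the scheduler it is no longer inserted here.
    // ... any remaining pre-RA setup is unchanged ...
  }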
================
Comment at: lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:78
@@ -76,1 +77,3 @@
+ unsigned EltSize,
+ const std::vector<MachineInstr*> &InstsToMove);
----------------
ArrayRef
================
Comment at: lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:84
@@ -81,1 +83,3 @@
+ unsigned EltSize,
+ const std::vector<MachineInstr*> &InstsToMove);
----------------
ArrayRef
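The same suggestion applies to both declarations flagged above: take an ArrayRef so callers can hand in a std::vector, a SmallVector (see the comment at line 198 below), or a plain array without copying, since the merge helpers presumably only read the list. A sketch, with the method names and leading parameters assumed from the surrounding class and everything else elided:

  // needs llvm/ADT/ArrayRef.h
  MachineBasicBlock::iterator
  mergeRead2Pair(MachineBasicBlock::iterator I,
                 MachineBasicBlock::iterator Paired, unsigned EltSize,
                 ArrayRef<MachineInstr *> InstsToMove);

  MachineBasicBlock::iterator
  mergeWrite2Pair(MachineBasicBlock::iterator I,
                  MachineBasicBlock::iterator Paired, unsigned EltSize,
                  ArrayRef<MachineInstr *> InstsToMove);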
================
Comment at: lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:107
@@ -102,6 +106,3 @@
AU.setPreservesCFG();
- AU.addPreserved<SlotIndexes>();
- AU.addPreserved<LiveIntervals>();
- AU.addPreserved<LiveVariables>();
- AU.addRequired<LiveIntervals>();
+ AU.addRequired<AAResultsWrapperPass>();
----------------
Does this need to be required? If it's not required, is it still available in the current pass pipeline? I think there's a subtarget hook we needed to enable in order to use this.
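For what it's worth, requiring the analysis is legal for a MachineFunctionPass; the question is whether this pass could instead use AA opportunistically. A sketch of that alternative under the legacy pass manager (the subtarget hook in question is presumably something along the lines of TargetSubtargetInfo::useAA(), which gates AA use during scheduling; that attribution is a guess):

  // In getAnalysisUsage(): stop requiring the wrapper pass.
  AU.setPreservesCFG();
  MachineFunctionPass::getAnalysisUsage(AU);

  // In runOnMachineFunction(): pick AA up only if some earlier pass in the
  // pipeline already computed it, and fall back to conservative ("may
  // alias") answers when it is null.
  auto *AAWP = getAnalysisIfAvailable<AAResultsWrapperPass>();
  AliasAnalysis *AA = AAWP ? &AAWP->getAAResults() : nullptr;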
================
Comment at: lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:198
@@ -164,25 +197,3 @@
- if (MBBI == MBB.end() || MBBI->getOpcode() != I->getOpcode())
- return E;
-
- // Don't merge volatiles.
- if (MBBI->hasOrderedMemoryRef())
- return E;
-
- int AddrIdx = AMDGPU::getNamedOperandIdx(I->getOpcode(), AMDGPU::OpName::addr);
- const MachineOperand &AddrReg0 = I->getOperand(AddrIdx);
- const MachineOperand &AddrReg1 = MBBI->getOperand(AddrIdx);
-
- // Check same base pointer. Be careful of subregisters, which can occur with
- // vectors of pointers.
- if (AddrReg0.getReg() == AddrReg1.getReg() &&
- AddrReg0.getSubReg() == AddrReg1.getSubReg()) {
- int OffsetIdx = AMDGPU::getNamedOperandIdx(I->getOpcode(),
- AMDGPU::OpName::offset);
- unsigned Offset0 = I->getOperand(OffsetIdx).getImm() & 0xffff;
- unsigned Offset1 = MBBI->getOperand(OffsetIdx).getImm() & 0xffff;
-
- // Check both offsets fit in the reduced range.
- if (offsetsCanBeCombined(Offset0, Offset1, EltSize))
- return MBBI;
- }
+ std::vector<const MachineOperand *> DefsToMove;
+ addDefsToList(*I, DefsToMove);
----------------
Should be a SmallVector
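A sketch of the SmallVector form (the inline capacity of 8 is an arbitrary pick, not something from the patch); addDefsToList would then take a SmallVectorImpl reference so it stays independent of the chosen capacity:

  // needs llvm/ADT/SmallVector.h
  SmallVector<const MachineOperand *, 8> DefsToMove;
  addDefsToList(*I, DefsToMove);

  // ... and correspondingly:
  void addDefsToList(const MachineInstr &MI,
                     SmallVectorImpl<const MachineOperand *> &Defs);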
================
Comment at: lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:226-249
@@ -189,1 +225,26 @@
+
+ // When we match I with another DS instruction, we will be moving I down
+ // to the location of the matched instruction, so any uses of I will need
+ // to be moved down as well.
+ for (const MachineOperand *Def : DefsToMove) {
+ std::pair<bool, bool> ReadWrite =
+ MBBI->readsWritesVirtualRegister(Def->getReg());
+ if (ReadWrite.second)
+ // This means Def and MBBI both write to the same super register.
+ // FIXME: This is probably OK, but we may need some additional logic
+ // to make sure the sub-indices of the writes don't overlap.
+ return E;
+
+ // If ReadWrite.first is true, then there is a use of Def between I
+ // and the instruction that I will potentially be merged with, so we
+ // will need to move this instruction after the merged instructions.
+ if (ReadWrite.first) {
+ InstsToMove.push_back(&*MBBI);
+ addDefsToList(*MBBI, DefsToMove);
+ break;
+ }
+ }
+ continue;
+ }
+ // Don't merge volatiles.
----------------
It is not possible to have a subregister def in SSA, so I don't think you need any of this.
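If that holds, the scan above collapses to a plain read check. A sketch of the simplification (illustrative only, not necessarily what the patch should land with):

  for (const MachineOperand *Def : DefsToMove) {
    // In machine SSA a virtual register has a single def, so the
    // write/write case checked above should never come up; only a read of
    // a value we are about to move matters.
    if (MBBI->readsVirtualRegister(Def->getReg())) {
      InstsToMove.push_back(&*MBBI);
      addDefsToList(*MBBI, DefsToMove);
      break;
    }
  }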
https://reviews.llvm.org/D23814