[PATCH] D11885: AMDGPU/SI: Add SI Machine Scheduler
Axel Davy via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 30 01:07:52 PDT 2015
axeldavy added inline comments.
================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:2732-2734
@@ +2731,5 @@
+
+ if (isMUBUF(Opc) || isMTBUF(Opc) || isMIMG(Opc))
+ return true;
+ return false;
+}
----------------
arsenm wrote:
> This can return the condition without the if.
>
> This also isn't 100% accurate. This will miss barriers and a few other cases. Better would be to check the instruction latency maybe with a threshold value of some sort.
The instruction latency for all instructions is not filled yet, we cannot rely on it.
I don't think we need to say barriers are high latencies, because (unless I'm wrong)
they will separate the scheduling regions (the instruction is not moved).
================
Comment at: lib/Target/AMDGPU/SIMachineScheduler.cpp:128
@@ +127,3 @@
+ switch (Reason) {
+ case NoCand: return "NOCAND ";
+ case RegUsage: return "REGUSAGE ";
----------------
arsenm wrote:
> Why are these padded with spaces at the end?
looks like for now we don't need extra spacing. I'll remove them for the next patch.
================
Comment at: lib/Target/AMDGPU/SIMachineScheduler.cpp:174
@@ +173,3 @@
+
+// SIScheduleBlock //
+
----------------
arsenm wrote:
> Looks like a stray comment
I just used it to signal below are the SIScheduleBlock functions. I'm ok to remove it if you don't like it.
================
Comment at: lib/Target/AMDGPU/SIMachineScheduler.cpp:291
@@ +290,3 @@
+// Returns if the register was set between first and last
+static bool findDefBetween(unsigned Reg,
+ SlotIndex First, SlotIndex Last,
----------------
arsenm wrote:
> Function would be better named isDefBetween since it doesn't return the def.
>
> Something about this function doesn't seem right to me. I don't think you should be looking at the def_instructions, and instead checking the Segments of the LiveRange.
The function is an adaptation of RegistrePressure.cpp 's findUseBetween function, which was doing the same but with uses.
For Uses it makes sense to use use_instr_nodbg_begin because there doesn't seem to be other way with the LiveRange.
For Defs it looks like that it can be done with LiveRange.
Do you really want I try to use LiveRange instead ?
This version with def_instr_begin works and is fast enough.
================
Comment at: lib/Target/AMDGPU/SIMachineScheduler.cpp:295-297
@@ +294,5 @@
+ const LiveIntervals *LIS) {
+ for (MachineRegisterInfo::def_instr_iterator
+ UI = MRI->def_instr_begin(Reg),
+ UE = MRI->def_instr_end(); UI != UE; ++UI) {
+ const MachineInstr* MI = &*UI;
----------------
arsenm wrote:
> This can be a range loop over MRI->def_instructions()
I didn't know about range for before you advised me to use them.
Is is possible to use them here, knowing there is a cast to MachineInstr* ?
================
Comment at: lib/Target/AMDGPU/SIMachineScheduler.cpp:517
@@ +516,3 @@
+ std::map<unsigned, unsigned>::iterator I2 = NodeNum2Index.find(SU->NodeNum);
+ if (I2 != NodeNum2Index.end())
+ HasLowLatencyNonWaitedParent[I2->second] = 1;
----------------
There was a mistake here, it shouldn't be SU->NodeNum but the Successor NodeNum.
My next patch will fix it.
http://reviews.llvm.org/D11885
More information about the llvm-commits
mailing list