[PATCH] D18427: CodeGen: Add DetectDeadLanes pass.

Matthias Braun via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 29 15:16:10 PDT 2016


MatzeB added inline comments.

================
Comment at: lib/CodeGen/DetectDeadLanes.cpp:25
@@ +24,3 @@
+///             use %vreg3
+/// The %vreg0 definition is dead and %vreg3 contains an undefined value.
+//
----------------
qcolombet wrote:
> How do we end up generating this kind of code?
> 
> Although I agree we need to support it, we should try not to generate that.
The example here is artificially designed to require a minimum amount of instructions to demonstrate the issue.

It's a good question however where the real issues come from, I'll take a look around.

================
Comment at: lib/CodeGen/DetectDeadLanes.cpp:108
@@ +107,3 @@
+  // are not lowered to a COPY.
+  return MI.isCopy() || MI.isPHI() || MI.isInsertSubreg() ||
+    MI.isRegSequence() || MI.isExtractSubreg();
----------------
qcolombet wrote:
> A remark: since the goal right now is just to fix up the representation for the register coalescer, there is no value in supporting those.
> However, if we go toward teaching this pass dead code elimination, then yes, it would make sense to do that.
There is still some value as it improves the precision of detecting unused/dead lanes. However from what I can see the ARM target is the only one implementing these instructions and the pass doesn't even run for this target, so I leave the support for this as a future enhancement.

================
Comment at: lib/CodeGen/DetectDeadLanes.cpp:135
@@ +134,3 @@
+  if (DefinedByCopy[MORegIdx] && !InWorkList[MORegIdx])
+    addToWorklist(MORegIdx);
+}
----------------
qcolombet wrote:
> I would have expected the check with InWorkList to be hidden into addToWorklist.
Micro optimization because in the initial pass below I can insert without even checking if the value is already in the list. Anyway with the switch to SetVector this point is moot.

================
Comment at: lib/CodeGen/DetectDeadLanes.cpp:148
@@ +147,3 @@
+    unsigned I = MI.getNumOperands()-2;
+    for (;;) {
+      const MachineOperand &MO = MI.getOperand(I);
----------------
qcolombet wrote:
> What is the benefit of going backward?
> I found it confusing to have a "for(;;)" for this traversal.
> 
> I would have expected something like:
> for (unsigned I = 1, E = MI.getNumOperands(); I != E; I += 2)
If the subregister in the REG_SEQUENCE overlap each other (I couldn't find any evidence that this is not allowed), then later operands may override parts of the earlier operands. That's why this loop moves backwards and reduces the UsedLanes bitmask with each operand.

================
Comment at: lib/CodeGen/DetectDeadLanes.cpp:188
@@ +187,3 @@
+  const MachineInstr &MI = *Use.getParent();
+  if (MI.getDesc().getNumDefs() != 1)
+    return;
----------------
qcolombet wrote:
> qcolombet wrote:
> > Add a comment why we limit ourselves to instruction with only one def.
> > For future reference, we need to know if this is a limitation or just that it does not make sense to support more than one definition.
> What happens with inlineasm?
> I.e., we do not get the actual number of definitions via the static description.
> Explain why it is okay.
At this point we only care about lowersToCopy() instructions. They are the only ones that need subsequent transfer/updating all other operands have been handled in the first pass already.
I added a comment explaining this as the check for the copy-likeness comes later in the form of the !DefinedByCopy[X].

================
Comment at: lib/CodeGen/DetectDeadLanes.cpp:252
@@ +251,3 @@
+    DEBUG(dbgs() << "Skipping Detect dead lanes pass\n");
+    return false;
+  }
----------------
qcolombet wrote:
> The motivating example in the comment at the beginning of the file does not need sub register liveness to produce dead definitions after coalescing, does it?
> 
> I.e., is this true we do not need that pass when enableSubRegLiveness is false?
This pass does not use any sub-register liveness. My idea here was that it is probably not worth spending the compiletime on this pass for a target that does not even care to enable subregister liveness (though admittedly the pass is quite fast and more powerful than DeadMachineInstrElim because it finds the dataflow fixpoint). You think it is a good idea to enable for all targets, I measured it usually taking around 0.5% compiletime.

================
Comment at: lib/CodeGen/DetectDeadLanes.cpp:305
@@ +304,3 @@
+        } else {
+          continue;
+        }
----------------
qcolombet wrote:
> What does this else block covers?
> A register is either virtual or physical, right?
> 
> What am I missing?
The operand can also be 0 (aka. "NoRegister"). Added a comment.


Repository:
  rL LLVM

http://reviews.llvm.org/D18427





More information about the llvm-commits mailing list