[PATCH] D18572: [AArch64] Relax branches by fusing compare with conditional branch when we can infer that source register is zero/non-zero.

Balaram Makam via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 30 02:57:04 PDT 2016


bmakam added a comment.

Thanks for the comments Tim. Please see my replies inline.


================
Comment at: lib/Target/AArch64/AArch64BranchRelaxation.cpp:483
@@ +482,3 @@
+  if (!Compare->getOperand(1).isReg() || !Compare->getOperand(2).isImm() ||
+      Compare->getOperand(2).getImm() != 1)
+    return false;
----------------
t.p.northover wrote:
> What about "cmp reg, #0"/"b.le ..."?
This could be turned into a TBNZ but isn't AArch64ConditionOptimizer a better place to handle this? Although this is very similar we only fuse a compare with branch in this patch i.e.  (a>0 && a<1) -> a == 0 or (a>0 && a>= 1) -> a != 0 and so it depends on branchfolding to shape the CFG.

================
Comment at: lib/Target/AArch64/AArch64BranchRelaxation.cpp:491
@@ +490,3 @@
+  MachineBasicBlock &MBB = *Compare->getParent();
+  unsigned SrcReg = Compare->getOperand(0).getReg();
+  unsigned SrcReg2 = Compare->getOperand(1).getReg();
----------------
t.p.northover wrote:
> Operand 0 is a destination, isn't it? Usually XZR/WZR in a true compare. Actually, I think we want to make sure it's <dead> if we're going to be removing this definition.
You are correct. I agree, will update in my next patch.

================
Comment at: lib/Target/AArch64/AArch64BranchRelaxation.cpp:510
@@ +509,3 @@
+          PI.getOperand(0).isUse() && PI.getOperand(0).getReg() == SrcReg2)
+        LastUse = &PI;
+    }
----------------
t.p.northover wrote:
> This seems wrong on 2 levels. First, if there are multiple predecessors then the fact that one of them ends in a TBZ says nothing about the contents of SrcReg2 coming from any others.
> 
> But even if that wasn't the case we really wouldn't want the logic to rely on the order MBB's predecessors happen to be returned in.
I see your point now. I actually wanted to get the immediate dominator but the dom tree is not available in this pass without reconstructing. I will refactor this.

================
Comment at: lib/Target/AArch64/AArch64BranchRelaxation.cpp:568
@@ +567,3 @@
+
+    bool CompleteNZCVUsers = !isNZCVLiveOut(MBB);
+    SmallVector<MachineInstr *, 4> NZCVUsers;
----------------
t.p.northover wrote:
> Related to two other comments about this loop: this seems like an immediate continue condition. In general, it looks likely that once you prune out the extra logic `CompleteNZCVUsers` will be unnecessary.
I will fix this.


http://reviews.llvm.org/D18572





More information about the llvm-commits mailing list