[PATCH] D11393: [X86] Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed.
Cong Hou via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 28 11:32:42 PST 2016
congh added a comment.
Thanks for the review, David!
Comment at: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp:4033
@@ -4033,3 +4032,3 @@
X86::CondCode OldBranchCode = (X86::CondCode)Cond.getImm();
if (OldBranchCode == BranchCode)
> Don't you need to check TBB == NewTBB here also?
Yes, you are right.
Comment at: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp:4071
@@ +4070,3 @@
+ // See comments above for X86::COND_P_AND_NE.
+ BranchCode = X86::COND_E_AND_NP;
+ } else
> I think you need to verify here that NewTBB == FBB before making this transformation. If FBB is nullptr, you'll need to compute the fallthrough block and check that it matches NewTBB.
When FBB is nullptr, we could not safely let the fallthrough block be FBB. This is because there is a use case of AnalyzeBranch in block-placement where MBBs are reordered before this function is called, in which case the fallthrough MBB may have nothing to do with the branch.
But we can do this check if FBB is not null.
Comment at: llvm/trunk/lib/Target/X86/X86InstrInfo.h:57
@@ +56,3 @@
> The COND_NP_OR_E condition is rather pointless. Based on the comment, it is pretty clear that the intent was for this to be an artificial condition for FCMP_OEQ, but that should be COND_NP_AND_E. (AND not OR)
> In other words, I'd recommend deleting the existing COND_NP_OR_E condition and the COND_P_AND_NE one that you added. COND_NE_OR_P and its inverse COND_E_AND_NP are sufficient.
> [FWIW, COND_NP_OR_E would always evaluate to true assuming the CC's originated from an FP compare instruction like COMISS.]
This makes sense. Done.
Comment at: llvm/trunk/test/CodeGen/X86/fp-une-cmp.ll:51
@@ +50,3 @@
+; CHECK: %bb1
+; CHECK: jmp
> If we invert the compound branch at the end of the entry block and place bb1 before bb2, we can eliminate the jmp at the end of bb1. Do you know why that isn't happening?
This is because bb2 is hotter than bb1 (note that there is a branch_weights profile data), and the BlockPlacement pass will place the hotter one as a fall-through.
More information about the llvm-commits