[PATCH] D14496: X86: More efficient codegen for 64-bit compare-and-branch

Mon Nov 9 05:36:58 PST 2015

mkuper added a comment.

Hi Hans,

For the eq version, I find it a bit surprising that the new code is faster, but if benchmarking says it is, who am I to argue? Adding Dave to the review for another opinion.

For the lt version, the new code definitely looks better than what we had before. It looks like there's another option though, which is more similar in spirit to the old code than to the new, but looks much nicer. This is what ICC produces:

  test(long long, long long):
          movl      4(%esp), %eax 
          subl      12(%esp), %eax
          movl      8(%esp), %edx 
          sbbl      16(%esp), %edx
          jge       ..B1.3        
          movl      $1, %eax      
          ret                     
  ..B1.3:                         
          movl      $2, %eax      
          ret                     

Do you think it may be worth lowering the new pseudo to that, instead of the proposed sequence?
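
For reference, here is a minimal sketch of why the sub/sbb pair is enough for a signed 64-bit less-than on 32-bit halves. The helper name signedLessViaSubSbb is hypothetical and not part of the patch; it only models the flags in plain C++, under the assumption that the branch tests SF != OF after the sbb (the jl condition, i.e. the path taken when the jge above falls through).

```cpp
#include <cstdint>

// Hypothetical helper, not from the patch: models SUB on the low halves
// followed by SBB on the high halves, then evaluates the "less" condition
// (SF != OF) on the flags left by the SBB.
bool signedLessViaSubSbb(int64_t a, int64_t b) {
  const uint32_t aLo = static_cast<uint32_t>(static_cast<uint64_t>(a));
  const uint32_t aHi = static_cast<uint32_t>(static_cast<uint64_t>(a) >> 32);
  const uint32_t bLo = static_cast<uint32_t>(static_cast<uint64_t>(b));
  const uint32_t bHi = static_cast<uint32_t>(static_cast<uint64_t>(b) >> 32);

  // subl: only the borrow (CF) out of the low subtraction matters here.
  const int64_t borrow = (aLo < bLo) ? 1 : 0;

  // sbbl: high halves minus the borrow, wrapped to 32 bits.
  const uint32_t hi = aHi - bHi - static_cast<uint32_t>(borrow);

  // SF is the sign bit of the 32-bit result.
  const bool sf = (hi >> 31) != 0;

  // OF is set when the exact signed result does not fit in 32 bits.
  const int64_t wide = static_cast<int64_t>(static_cast<int32_t>(aHi)) -
                       static_cast<int64_t>(static_cast<int32_t>(bHi)) - borrow;
  const bool of = wide < INT32_MIN || wide > INT32_MAX;

  return sf != of;
}
```

The low-half result itself is never inspected; only its borrow feeds into the high-half subtraction, which is why a single conditional jump after the sbb suffices.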

Michael

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:15235
@@ +15234,3 @@
+                      SDValue &Lo2, unsigned &CC) {
+  // This function is pattern matching for the output of
+  // DAGTypeLegalizer::IntegerExpandSetCCOperands.
----------------
I'm not a huge fan of this, but producing the pseudo in a target-specific pre-legalization DAG combine would probably cause too many problems, since it would make the comparison opaque to other early combines.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:21275
@@ +21274,3 @@
+  MachineInstr *FalseJmp = ++MachineBasicBlock::iterator(MI);
+  assert(FalseJmp->getOpcode() == X86::JMP_1);
+
----------------
Why is the instruction following MI guaranteed to be a JMP_1?

http://reviews.llvm.org/D14496