[llvm-commits] [PATCH] add-carry/sub-borrow optimization (CodeGen, X86)
Manman Ren
mren at apple.com
Wed Oct 31 09:17:06 PDT 2012
Hi Shuxin,
A few comments:
+ if (CC == X86::COND_A) {
+ // Try to convert cond_a into cond_b in an attemp to facilitate
Typo attempt
+ // materializing "setb reg"; see the following code.
+ //
+ // Do not flip "e > c", where "c" is a constant, because Cmp instruction
+ // cannot take an immedidate as its first operand.
Typo immediate
+ //
+ if (EFLAGS.getOpcode() == X86ISD::SUB && EFLAGS.hasOneUse() &&
+ EFLAGS.getValueType().isInteger() &&
+ !isa<ConstantSDNode>(EFLAGS.getOperand(1))) {
+ CC = X86::COND_B;
+ SDValue NewSub = DAG.getNode(X86ISD::SUB, EFLAGS.getDebugLoc(),
+ EFLAGS.getNode()->getVTList(),
+ EFLAGS.getOperand(1), EFLAGS.getOperand(0));
+ EFLAGS = SDValue(NewSub.getNode(), EFLAGS.getResNo());
+ SDValue NewVal = DAG.getNode(X86ISD::SETCC, DL, N->getVTList(),
+ DAG.getConstant(CC, MVT::i8), EFLAGS);
+ N = NewVal.getNode ();
Why not a single "if"?
Can we directly return a SETCC_CARRY instead of falling to the "if (CC == X86::COND_B)"?
Otherwise, looks good.
Thanks,
Manman
+ }
+ }
+
On Oct 30, 2012, at 3:37 PM, Shuxin Yang wrote:
> The motivating example:
> ========================
>
> The attached patch is to fix the performance defect reported in rdar://problem/12579915.
> The motivating example is:
>
> -----------------------------------------
> int foo(unsigned x, unsigned y, int ret) {
> if (x > y)
> ++ret; /* or --ret */
> return ret;
> }
> -----------------------------------------
>
> Gcc gives:
> movl %edx, %eax // return val = 3rd parameter
> cmpl %edi, %esi // carry-bit = (y < x)
> adcl $0, %eax // return val = 0 + carry-bit.
> ret
>
> and LLVM gives:
> cmpl %esi, %edi // flags = x cmp y
> seta %al // %al = 1 iff x > y
> movzbl %al, %eax // cmp-result = zext (%al)
> addl %edx, %eax // return-val = <ret> + cmp-result
> ret
>
> unsigned-less-than (ult) cmp has a nice feature: the carry bit is set iff the cmp is satisfied.
> Code-gen can take advantage of this feature to optimize conditional updates like:
>    x = x + (a <u b)   // via adc
>    x = x - (a <u b)   // via sbb
>
>
> The Fix
> ========
>
> LLVM is already able to generate the right code if the comparison is "<" (unsigned).
> So, this patch simply flips "x >u y" to "y <u x" in PerformSETCCCombine().
>
> One test case is provided, and a "CHECK: sub" in another test case is removed
> because it is irrelevant and its position depends on scheduling.
>
> TODO:
> =====
>
> 1. With this patch, the assembly is:
> cmpl %edi, %esi
> adcl $0, %edx
> movl %edx, %eax
> Compared to gcc, the critical path has 3 instructions vs. 2 in gcc's code. As of this
> writing, I have not yet had a chance to investigate how gcc reduces the critical path.
> Maybe it is just lucky. This opt seems to be a bit difficult.
>
> 2. gcc is way better than llvm in this case:
>
> int test3(unsigned int x, unsigned int y, int res) {
> if (x > 100)
> res++;
> return res;
> }
>
> gcc gives:
> movl %edx, %eax
> cmpl $101, %edi
> sbbl $-1, %eax
> ret
>
> Gcc handles all these cases in ce_try_addcc() of the if-conversion pass in ifcvt.c,
> while in LLVM the logic for such if-conversion is scattered across many places.
>
> 3. With -m32, the instruction sequence is worse than with -m64; I have not yet had a
> chance to dig into the root cause.
>
> <diff.patch>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits