[llvm-commits] [PATCH] add-carry/sub-borrow optimization (CodeGen, X86)
Manman Ren
mren at apple.com
Wed Oct 31 09:17:06 PDT 2012
Hi Shuxin,
A few comments:
+ if (CC == X86::COND_A) {
+ // Try to convert cond_a into cond_b in an attemp to facilitate
Typo attempt
+ // materializing "setb reg"; see the following code.
+ //
+ // Do not flip "e > c", where "c" is a constant, because Cmp instruction
+ // cannot take an immedidate as its first operand.
Typo immediate
+ //
+ if (EFLAGS.getOpcode() == X86ISD::SUB && EFLAGS.hasOneUse() &&
+ EFLAGS.getValueType().isInteger() &&
+ !isa<ConstantSDNode>(EFLAGS.getOperand(1))) {
+ CC = X86::COND_B;
+ SDValue NewSub = DAG.getNode(X86ISD::SUB, EFLAGS.getDebugLoc(),
+ EFLAGS.getNode()->getVTList(),
+ EFLAGS.getOperand(1), EFLAGS.getOperand(0));
+ EFLAGS = SDValue(NewSub.getNode(), EFLAGS.getResNo());
+ SDValue NewVal = DAG.getNode(X86ISD::SETCC, DL, N->getVTList(),
+ DAG.getConstant(CC, MVT::i8), EFLAGS);
+ N = NewVal.getNode ();
Why not a single "if"?
Can we directly return a SETCC_CARRY instead of falling to the "if (CC == X86::COND_B)"?
Otherwise, looks good.
Thanks,
Manman
+ }
+ }
+
On Oct 30, 2012, at 3:37 PM, Shuxin Yang wrote:
> The motivating example:
> ========================
>
> The attached patch is to fix the performance defect reported in rdar://problem/12579915.
> The motivating example is:
>
> -----------------------------------------
> int foo(unsigned x, unsigned y, int ret) {
> if (x > y)
> ++ret; /* or --ret */
> return ret;
> }
> -----------------------------------------
>
> Gcc gives:
> movl %edx, %eax // return val = 3rd parameter
> cmpl %edi, %esi // carry-bit = (y < x)
> adcl $0, %eax // return val = 0 + carry-bit.
> ret
>
> and LLVM gives:
> cmpl %esi, %edi // flags = x cmp y
> seta %al // %al = 1 iff x > y
> movzbl %al, %eax // cmp-result = zext (%al)
> addl %edx, %eax // return-val = <ret> + cmp-result
> ret
>
> unsigned-less-than (ult) cmp has a nice feature: the carry bit is set iff the cmp is satisfied.
> Code-gen can take advantage of this feature to optimize conditional updates like:
>    x = x + (a <u b)   // via adc
>    x = x - (a <u b)   // via sbb
>
>
> The Fix
> ========
>
> LLVM is already able to generate the right code if the comparison is "<" (unsigned).
> So, this patch simply flips "x >u y" to "y <u x" in PerformSETCCCombine().
>
> One test case is provided, and a "CHECK: sub" in another test case is removed
> because it is irrelevant and its position depends on scheduling.
>
> TODO:
> =====
>
> 1. With this patch, the assembly is:
> cmpl %edi, %esi
> adcl $0, %edx
> movl %edx, %eax
> Compared to gcc, the critical path has 3 instructions vs. 2 in gcc's code. As of this
> writing, I have not yet had a chance to investigate how gcc reduces the critical path.
> Maybe it is just lucky. This opt seems to be a bit difficult.
>
> 2. gcc is way better than llvm in this case:
>
> int test3(unsigned int x, unsigned int y, int res) {
> if (x > 100)
> res++;
> return res;
> }
>
> gcc gives:
> movl %edx, %eax
> cmpl $101, %edi
> sbbl $-1, %eax
> ret
>
> Gcc handles all these cases in ce_try_addcc() of the if-conversion pass in ifcvt.c,
> while in LLVM the logic for such if-conversion is scattered across many places.
>
> 3. With -m32, the instruction sequence is worse than with -m64; I have not yet had a
> chance to dig into the root cause.
>
> <diff.patch>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits