[LLVMdev] Failure to optimize ? operator

Wed Dec 14 05:19:59 PST 2011

I don't understand your point.  Which version is better does NOT
depend on what inputs are passed to the function.  The compiled code
for (as per llvm) f1 will always take less time to execute than f2.

for x > 0 => T(f1) < T(f2)
for x <= 0 => T(f1) = T(f2)

where T() is the time to execute the given function.

So always T(f1) <= T(f2).

I would call this a missed optimization opportunity.  I think it
warrants a bug report.

If I do the same experiment with gcc I get identical code for the two functions:

==============================================
_f1:        pushl   %ebp        xorl    %eax, %eax        movl
%esp, %ebp        movl    8(%ebp), %edx        testl   %edx, %edx
  jle     L5        popl    %ebp        ret        .p2align 4,,7L5:
    movl    %edx, %ecx        imull   %edx, %ecx        popl    %ebp
     leal    3(%ecx,%ecx,4), %eax        imull   %edx, %eax
leal    1(%eax,%ecx,2), %eax        ret        .p2align 4,,15
_f2:
        pushl   %ebp        xorl    %eax, %eax        movl    %esp,
%ebp        movl    8(%ebp), %edx        testl   %edx, %edx        jle
    L9        popl    %ebp        ret        .p2align 4,,7L9:
movl    %edx, %ecx        imull   %edx, %ecx        popl    %ebp
 leal    3(%ecx,%ecx,4), %eax        imull   %edx, %eax        leal
1(%eax,%ecx,2), %eax
ret==============================================

Brent

On Wed, Dec 14, 2011 at 9:58 AM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Tue, Dec 13, 2011 at 5:59 AM, Brent Walker <brenthwalker at gmail.com> wrote:
>> The following seemingly identical functions, get compiled to quite
>> different machine code.  The first is correctly optimized (the
>> computation of var y is nicely moved into the else branch of the "if"
>> statement), which the second one is not (the full computation of var y
>> is always done).  The output was produced using the demo page on
>> llvm's web site (optimization level LTO).
>>
>> Can someone shed some light on this strange behavior?
>
> The IR you're seeing is the IR we naturally generate for the given IR,
> and LLVM doesn't have an optimization to turn your f2 into f1.  Which
> version is better depends on the values passed into the function,
> which makes it a bit tricky to write an optimization to turn one into
> the other.
>
> -Eli