Thanks.<br>

<br>

OK, I wanted to simplify the example and it turns out that the backend optimizer is able to generate code without any multiplication.<br>

But what if the backend cannot get rid of the multiplication that the optimizer introduced in the IR?

Code that was simple to begin with might end up more complicated...<br>
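
To make the concern concrete, here is a minimal C sketch of the two forms under discussion. The names f_pointer, f_indexed and check_forms_agree are only illustrative (they are not clang output), and I assume n >= 1 and an array large enough for every access. f_pointer is the loop as I wrote it; f_indexed is roughly what the canonicalized IR expresses, with a single induction variable and the address derived as ptr[incr * i].<br>

```c
#include <assert.h>

/* Pointer-bumping form, as written in the original source. */
int f_pointer(int *ptr, int incr, int n)
{
    int r = n + 1;
    do {
        if (*ptr != 0)
            r = n;
        ptr += incr;            /* one addition per iteration */
        n--;
    } while (n != 0);
    return r;
}

/* Indexed form: one induction variable i, address computed as
   ptr[incr * i], so a multiplication appears in every iteration. */
int f_indexed(int *ptr, int incr, int n)
{
    int r = n + 1;
    long i = 0;
    do {
        if (ptr[(long)incr * i] != 0)
            r = n;
        i++;
        n--;
    } while (n != 0);
    return r;
}

/* Sanity check on a small strided input (illustrative data). */
void check_forms_agree(void)
{
    int data[9] = {0, 0, 1, 0, 0, 0, 0, 0, 0};
    assert(f_pointer(data, 2, 4) == f_indexed(data, 2, 4));
}
```

Both functions compute the same result; the question is only whether the backend always turns the second form back into the first.<br>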

<br>

I will simplify my example (but not that much) and post it on the mailing list.<br>

<br>

Damien<br>

<br><br><div class="gmail_quote">On Thu, Feb 24, 2011 at 3:33 PM, John McCall <span dir="ltr"><<a href="mailto:rjmccall@apple.com">rjmccall@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

<div><div></div><div class="h5">On Feb 24, 2011, at 3:25 PM, Damien Vincent wrote:<br>

> When I compile the following piece of code to LLVM IR (using -emit-llvm -O2 -S):<br>

><br>

> int f(int *ptr, int incr, int n)<br>

> {<br>

>   int r = n+1;<br>

><br>

>   do<br>

>   {<br>

>     if(*ptr!=0)<br>

>       r = n;<br>

>     ptr += incr;<br>

>     n--;<br>

>   } while(n!=0);<br>

>   return r;<br>

> }<br>

><br>

> clang notices that the index can be computed with a multiplication ( ptr[incr * loopindex] ) and generates code that is at least as complicated: a multiplication is at least as expensive as an addition, and the generated code uses indirect addressing instead of direct addressing, which is sometimes more expensive.<br>


> I am new to clang, but I thought a compiler should perform strength reduction (and not the other way around!).<br>

><br>

> Could you explain to me why clang replaces this addition + direct addressing with a multiplication + indirect addressing?<br>

<br>

</div></div>The optimizer is canonicalizing the loop, which the backend then lowers to a more efficient form:<br>

<br>

_f:                                     ## @f<br>

Leh_func_begin0:<br>

## BB#0:                                ## %entry<br>

        pushq   %rbp<br>

Ltmp0:<br>

        movq    %rsp, %rbp<br>

Ltmp1:<br>

        leal    -1(%rdx), %ecx<br>

        movslq  %esi, %rsi<br>

        shlq    $2, %rsi<br>

        incq    %rcx<br>

        leal    1(%rdx), %eax<br>

        .align  4, 0x90<br>

LBB0_1:                                 ## %do.body<br>

                                        ## =>This Inner Loop Header: Depth=1<br>

        cmpl    $0, (%rdi)<br>

        cmovnel %edx, %eax<br>

        addq    %rsi, %rdi<br>

        decl    %edx<br>

        decq    %rcx<br>

        jne     LBB0_1<br>

## BB#2:                                ## %do.end<br>

        popq    %rbp<br>

        ret<br>

<br>

LLVM's backend optimizers are quite powerful; don't assume that the IR is the last word on the subject.<br>

<font color="#888888"><br>

John.</font></blockquote></div><br>
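
For anyone following the quoted assembly, the inner loop can be read back into C roughly as follows. This is a hand-written sketch, not compiler output: f_lowered, stride_bytes, trips and check_lowered are names I made up, and it assumes a 4-byte int (the shlq $2) and n >= 1.<br>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C rendering of the quoted x86-64 loop. */
int f_lowered(int *ptr, int incr, int n)
{
    /* movslq %esi, %rsi ; shlq $2, %rsi:
       byte stride, computed once before the loop (assumes 4-byte int). */
    intptr_t stride_bytes = (intptr_t)incr * 4;
    /* leal -1(%rdx), %ecx ; incq %rcx: separate 64-bit trip counter. */
    uint64_t trips = (uint64_t)(uint32_t)(n - 1) + 1;
    int r = n + 1;                  /* leal 1(%rdx), %eax */
    char *p = (char *)ptr;

    do {
        if (*(int *)p != 0)         /* cmpl $0, (%rdi)    */
            r = n;                  /* cmovnel %edx, %eax */
        p += stride_bytes;          /* addq %rsi, %rdi    */
        n--;                        /* decl %edx          */
        trips--;                    /* decq %rcx          */
    } while (trips != 0);           /* jne LBB0_1         */
    return r;
}

/* Sanity check on a small strided input (illustrative data). */
void check_lowered(void)
{
    int data[9] = {0, 0, 1, 0, 0, 0, 0, 0, 0};
    assert(f_lowered(data, 2, 4) == 3);
}
```

The only multiplication left is the shift that scales incr to bytes, and it sits outside the loop; inside the loop the address is advanced with a single addition.<br>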