<div dir="ltr">Hi George,<div><br></div><div>Thanks for your answer!</div><div><br></div><div>I'm asking because -fstrict-aliasing can improve benchmark performance a lot. Even on just one of the SPEC2000 benchmarks, I can see a ~6% performance improvement.</div>
<div><br></div><div>I simplified one of the cases as below,</div><div><br></div><div>{code}</div><div><div>$ cat alias.c </div><div>typedef struct {int x; int y;} S;</div><div>S **ps;</div><div>int i;</div><div>main()</div>
<div>{</div><div> do {</div><div> ps[i]->x = i;</div><div> i++;</div><div> } while (i);</div><div>}</div><div>$ ~/llvm/build/bin/clang --target=aarch64-linux-gnuabi alias.c -S -O2</div><div>alias.c:4:1: warning: type specifier missing, defaults to 'int' [-Wimplicit-int]</div>
<div>main()</div><div>^</div><div>1 warning generated.</div><div>$ cat alias.s</div><div><span class="" style="white-space:pre"> </span>.text</div><div><span class="" style="white-space:pre"> </span>.file<span class="" style="white-space:pre"> </span>"alias.c"</div>
<div><span class="" style="white-space:pre"> </span>.globl<span class="" style="white-space:pre"> </span>main</div><div><span class="" style="white-space:pre"> </span>.align<span class="" style="white-space:pre"> </span>2</div>
<div><span class="" style="white-space:pre"> </span>.type<span class="" style="white-space:pre"> </span>main,@function</div><div>main: // @main</div><div>// BB#0: // %entry</div>
<div><span class="" style="white-space:pre"> </span>adrp<span class="" style="white-space:pre"> </span>x9, ps</div><div><span class="" style="white-space:pre"> </span>adrp<span class="" style="white-space:pre"> </span>x8, i</div>
<div><span class="" style="white-space:pre"> </span>ldr<span class="" style="white-space:pre"> </span>x9, [x9, :lo12:ps]</div><div><span class="" style="white-space:pre"> </span>ldr<span class="" style="white-space:pre"> </span>w10, [x8, :lo12:i]</div>
<div>.LBB0_1: // %do.body</div><div> // =>This Inner Loop Header: Depth=1</div><div><span class="" style="white-space:pre"> </span>ldr<span class="" style="white-space:pre"> </span>x11, [x9, w10, sxtw #3]</div>
<div><span class="" style="white-space:pre"> </span>str<span class="" style="white-space:pre"> </span> w10, [x11]</div><div><b><span class="" style="white-space:pre"> </span>ldr<span class="" style="white-space:pre"> </span>w10, [x8, :lo12:i] // inside the loop</b></div>
<div><span class="" style="white-space:pre"> </span>add<span class="" style="white-space:pre"> </span>w10, w10, #1 // =1</div><div><b><span class="" style="white-space:pre"> </span>str<span class="" style="white-space:pre"> </span>w10, [x8, :lo12:i] // inside the loop </b></div>
<div><span class="" style="white-space:pre"> </span>cbnz<span class="" style="white-space:pre"> </span>w10, .LBB0_1</div><div>// BB#2: // %do.end</div><div><span class="" style="white-space:pre"> </span>mov<span class="" style="white-space:pre"> </span> w0, wzr</div>
<div><span class="" style="white-space:pre"> </span>ret</div><div>.Ltmp1:</div><div><span class="" style="white-space:pre"> </span>.size<span class="" style="white-space:pre"> </span>main, .Ltmp1-main</div><div><br></div>
<div><span class="" style="white-space:pre"> </span>.type<span class="" style="white-space:pre"> </span>i,@object // @i</div><div><span class="" style="white-space:pre"> </span>.comm<span class="" style="white-space:pre"> </span>i,4,4</div>
<div><span class="" style="white-space:pre"> </span>.type<span class="" style="white-space:pre"> </span>ps,@object // @ps</div><div><span class="" style="white-space:pre"> </span>.comm<span class="" style="white-space:pre"> </span>ps,8,8</div>
<div><br></div><div><span class="" style="white-space:pre"> </span>.ident<span class="" style="white-space:pre"> </span>"clang version 3.6.0 "</div><div>$ aarch64-linux-gnu-gcc alias.c -S -O2</div><div>$ cat alias.s</div>
<div><span class="" style="white-space:pre"> </span>.cpu generic</div><div><span class="" style="white-space:pre"> </span>.file<span class="" style="white-space:pre"> </span>"alias.c"</div><div><span class="" style="white-space:pre"> </span>.section<span class="" style="white-space:pre"> </span>.text.startup,"ax",%progbits</div>
<div><span class="" style="white-space:pre"> </span>.align<span class="" style="white-space:pre"> </span>2</div><div><span class="" style="white-space:pre"> </span>.global<span class="" style="white-space:pre"> </span>main</div>
<div><span class="" style="white-space:pre"> </span>.type<span class="" style="white-space:pre"> </span>main, %function</div><div>main:</div><div><span class="" style="white-space:pre"> </span>adrp<span class="" style="white-space:pre"> </span>x4, i</div>
<div><span class="" style="white-space:pre"> </span>adrp<span class="" style="white-space:pre"> </span>x1, ps</div><div><b><span class="" style="white-space:pre"> </span>ldr<span class="" style="white-space:pre"> </span>w0, [x4,#:lo12:i] // hoisted out loop</b></div>
<div><span class="" style="white-space:pre"> </span>ldr<span class="" style="white-space:pre"> </span>x1, [x1,#:lo12:ps]</div><div><span class="" style="white-space:pre"> </span>add<span class="" style="white-space:pre"> </span>x1, x1, x0, sxtw 3</div>
<div><span class="" style="white-space:pre"> </span>b<span class="" style="white-space:pre"> </span>.L3</div><div>.L5:</div><div><span class="" style="white-space:pre"> </span>mov<span class="" style="white-space:pre"> </span>w0, w2</div>
<div>.L3:</div><div><span class="" style="white-space:pre"> </span>ldr<span class="" style="white-space:pre"> </span>x3, [x1],8</div><div><span class="" style="white-space:pre"> </span>adds<span class="" style="white-space:pre"> </span>w2, w0, 1</div>
<div><span class="" style="white-space:pre"> </span>str<span class="" style="white-space:pre"> </span>w0, [x3]</div><div><span class="" style="white-space:pre"> </span>bne<span class="" style="white-space:pre"> </span>.L5</div>
<div><span class="" style="white-space:pre"> </span><b>str<span class="" style="white-space:pre"> </span>w2, [x4,#:lo12:i] // sink out of loop</b></div><div><span class="" style="white-space:pre"> </span>ret</div><div><span class="" style="white-space:pre"> </span>.size<span class="" style="white-space:pre"> </span>main, .-main</div>
<div><span class="" style="white-space:pre"> </span>.comm<span class="" style="white-space:pre"> </span>i,4,4</div><div><span class="" style="white-space:pre"> </span>.comm<span class="" style="white-space:pre"> </span>ps,8,8</div>
<div><span class="" style="white-space:pre"> </span>.ident<span class="" style="white-space:pre"> </span>"GCC: (Ubuntu/Linaro 4.8.1-10ubuntu7) 4.8.1"</div></div><div>{code}</div><div><br></div><div>So for this case, gcc can hoist/sink load/store out of loop, but llvm can't.</div>
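<div><br></div><div>At source level, the difference can be sketched roughly as follows. This is only an illustration, not what either compiler does internally: it assumes the int store through ps[i]->x can modify neither the global i nor ps. The names loop_as_written, loop_hoisted and run, and the four-element setup, are hypothetical and made up for the sketch; the counter runs from -4 up to 0 so that the loop terminates.</div>

```c
#include <assert.h>

typedef struct { int x; int y; } S;

S **ps;
int i;

/* Loop as written in alias.c: if the store to ps[i]->x might alias i,
   then i has to be reloaded and stored back in every iteration. */
static void loop_as_written(void)
{
    do {
        ps[i]->x = i;
        i++;
    } while (i);
}

/* Hand-transformed equivalent. Valid only under the assumption that the
   store through ps[...] can alias neither the global i nor ps itself. */
static void loop_hoisted(void)
{
    S **p = ps;   /* load of ps hoisted out of the loop */
    int t = i;    /* load of i hoisted out of the loop */
    do {
        p[t]->x = t;
        t++;
    } while (t);
    i = t;        /* store to i sunk below the loop */
}

/* Hypothetical driver: run one variant over four elements, indexing
   from -4 up to -1 so the loop ends when the counter reaches 0. */
static int run(void (*loop)(void), S out[4])
{
    S *slots[4];
    int k;
    int result;

    for (k = 0; k < 4; k++) {
        out[k].x = 0;
        out[k].y = 0;
        slots[k] = &out[k];
    }
    ps = slots + 4;   /* ps[-4] .. ps[-1] are the four valid slots */
    i = -4;
    loop();
    assert(i == 0);   /* both variants must leave i at 0 */
    result = i;
    ps = 0;           /* do not keep a pointer to the local array */
    return result;
}
```

<div>Calling run(loop_as_written, a) and run(loop_hoisted, b) on two S[4] arrays should leave the same values in a and b. The two loops can only behave differently if the store inside the loop aliases i or ps.</div>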
<div><br></div><div>Is the goal of your new design to replace the current "BasicAA+TBAA"?<br></div><div><br></div><div>Sorry for the off-topic discussion, but I want to understand what is going to happen in this area. Our stakeholders really want the performance improvement.</div>
<div><br></div><div>Based on your last reply, my understanding is that we should improve TBAA, but will it still work together with your new design? Should we put effort into the current TBAA pass to solve this problem?</div><div><br>
</div><div>Thanks,</div><div>-Jiangning</div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-08-08 2:40 GMT+08:00 George Burgess IV <span dir="ltr"><<a href="mailto:gbiv@google.com" target="_blank">gbiv@google.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi! Thanks for your comments :)<br>
<div class=""><br>
> Kevin posted this <a href="http://comments.gmane.org/gmane.comp.compilers.llvm.devel/75298" target="_blank">http://comments.gmane.org/gmane.comp.compilers.llvm.devel/75298</a>.<br>
</div>AFAICT, that's not a valid C program. The C89 standard section 3.1.2.6 says "All declarations that refer to the same object or function shall have compatible type; otherwise the behavior is undefined." In this case, the types int and `struct heap` are not compatible. So, undefined behavior results. For more on type compatibility in C89, see section 3.1.2.6 of the standard. Compiling the executable with `-fno-strict-aliasing -O2` on gcc 4.8.2 & clang 3.4.1, it seems that they both produce *very* similar assembly. In terms of pure instruction count, clang actually "wins" by a single instruction.<br>
<div class=""><br>
> Will this new design finally support "strict aliasing"?<br>
</div>This AA algorithm entirely ignores types at the moment. That being said, if we have an API that allows us to extract C/C++ types from LLVM IR, I don't see adding type sensitivity as a necessarily difficult task. (If you're only looking for a pass that deals with types, then you may find the TypeBasedAliasAnalysis pass interesting.)<br>
<br>
Thanks again,<br>
George<br>
<br>
<a href="http://reviews.llvm.org/D4551" target="_blank">http://reviews.llvm.org/D4551</a><br>
<br>
<br>
</blockquote></div><br></div>