Hi,<br><br>Indeed, the complex add is more expensive on all Cortex cores I know of.<br><br>However, there is an important point here: the code sequence we generate requires two registers live instead of one. In loops with high register pressure, we're probably losing performance.<br><br>James<br><div class="gmail_quote"><div dir="ltr">On Wed, 11 Nov 2015 at 21:09, Tim Northover via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 11 November 2015 at 11:57, Meador Inge <<a href="mailto:meadori@gmail.com" target="_blank">meadori@gmail.com</a>> wrote:<br>

> Why wouldn't it consider the number of uses in any operation?  The<br>

> "expected" code is easy to get by checking the number of uses.  This<br>

> may be desirable on some micro-architectures depending on the cost of<br>

> the various loads and stores.<br>

<br>

As you say, very microarchitecture-dependent. The code produced is<br>

probably optimal for Cyclone ("[x0, x8]" is no more expensive than<br>

"[x8]" and the "lsl" is slightly cheaper than the complicated "add").<br>

If I'm reading the Cortex-A57 optimisation guide correctly, the same<br>

reasoning applies there too.<br>

<br>

Cheers.<br>

<br>

Tim.<br>


</blockquote></div>