[llvm-dev] [AArch64] Address computation folding

Wed Nov 11 14:23:51 PST 2015

Meador,
If you have a patch, I would be interested in experimenting with it.

 Chad

> Hi,
>
> Indeed, the complex add is more expensive on all Cortex cores I know of.
>
> However, there is an important point here: the code sequence we generate
> requires two live registers instead of one. In loops with high register
> pressure, we're probably losing performance.
>
> James
> On Wed, 11 Nov 2015 at 21:09, Tim Northover via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> On 11 November 2015 at 11:57, Meador Inge <meadori at gmail.com> wrote:
>> > Why wouldn't it consider the number of uses in any operation?  The
>> > "expected" code is easy to get by checking the number of uses.  This
>> > may be desirable on some micro-architectures depending on the cost of
>> > the various loads and stores.
>>
>> As you say, very microarchitecture-dependent. The code produced is
>> probably optimal for Cyclone ("[x0, x8]" is no more expensive than
>> "[x8]" and the "lsl" is slightly cheaper than the complicated "add").
>> If I'm reading the Cortex-A57 optimisation guide correctly, the same
>> reasoning applies there too.
>>
>> Cheers.
>>
>> Tim.
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>