[PATCH][ARM] Improve the instruction selection of vector loads.

Quentin Colombet qcolombet at apple.com
Wed Jul 3 14:43:22 PDT 2013


Thanks Jim.

Committed as r185587.

-Quentin

On Jul 3, 2013, at 1:42 PM, Jim Grosbach <grosbach at apple.com> wrote:

> LGTM, thanks!
> 
> -Jim
> On Jul 3, 2013, at 10:09 AM, Quentin Colombet <qcolombet at apple.com> wrote:
> 
>> Ping?
>> 
>> -Quentin
>> 
>> On Jul 1, 2013, at 5:42 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>> 
>>> Hi,
>>> 
>>> Here is a patch to improve the instruction selection of vector loads on ARM.
>>> Thanks for your review.
>>> 
>>> ** Problem **
>>> In the ARM back-end, build_vector nodes are lowered to a target-specific build_vector that uses a floating-point type [1].
>>> This works well unless the inserted bitcasts survive until instruction selection. In that case, they incur moves between the integer unit and the floating-point unit, which may result in inefficient code.
>>> 
>>> In other words, this conversion may introduce artificial dependencies when the code leading to the build_vector cannot be carried out entirely with a floating-point type.
>>> 
>>> In particular, this happens when loads are not aligned.
>>> 
>>> In that case, the compiler generates general-purpose loads and builds the floating-point vector from them, instead of using the vector unit directly.
>>> 
>>> <rdar://problem/14170854>
>>> 
>>> [1] The rationale is that floating-point registers are aliases of vector registers.
>>> 
>>> ** Motivating Example **
>>> The attached motivating_example.ll demonstrates this (it is also part of a test case in the proposed patch).
>>> 
>>> To reproduce:
>>> llc -O3 -mtriple thumbv7-apple-ios3 motivating_example.ll -o -
>>> 	ldr	r0, [r1]
>>> 	ldr	r1, [r2]
>>> 	vmov	s1, r1
>>> 	vmov	s0, r0
>>> Here, each ldr/vmov sequence could have been replaced by a single vld1.32.
>>> 
>>> ** Proposed Solution **
>>> Use a vector-friendly sequence of code when the inserted bitcasts to floating point survive DAGCombine.
>>> 
>>> Thanks to Eli Friedman for the direction!
>>> 
>>> This is done by a target-specific DAGCombine that changes the target-specific build_vector into a sequence of insert_vector_elt nodes that gets rid of the bitcasts.
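
The rewrite described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code from the patch: Node, make and combineBuildVector are hypothetical names, the node kinds are plain strings rather than LLVM's SelectionDAG classes, all elements are assumed to be 32 bits wide, and operands that are not bitcasts are simply cast to i32 (an assumption made for illustration).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a DAG node. This is NOT LLVM's SDNode/SDValue API.
struct Node {
  std::string op;    // "load", "bitcast", "build_vector", "insert_vector_elt", "undef"
  std::string type;  // "i32", "f32", "v2f32", "v2i32", ...
  std::vector<std::shared_ptr<Node>> operands;
  int lane = -1;     // only meaningful for insert_vector_elt
};
using NodePtr = std::shared_ptr<Node>;

// Hypothetical helper that allocates a node.
NodePtr make(std::string op, std::string type,
             std::vector<NodePtr> operands = {}, int lane = -1) {
  auto n = std::make_shared<Node>();
  n->op = std::move(op);
  n->type = std::move(type);
  n->operands = std::move(operands);
  n->lane = lane;
  return n;
}

// Hypothetical combine: if a scalar bitcast survived on any operand of the
// floating-point build_vector, rebuild the vector in the integer domain as a
// chain of insert_vector_elt nodes, then bitcast the whole vector once.
NodePtr combineBuildVector(const NodePtr &bv) {
  if (bv->op != "build_vector")
    return bv;
  bool sawBitcast = false;
  for (const NodePtr &elt : bv->operands)
    sawBitcast = sawBitcast || elt->op == "bitcast";
  if (!sawBitcast)
    return bv;  // no bitcast survived; keep the floating-point form

  // Assumption: every element is 32 bits wide.
  const std::string intVecTy =
      "v" + std::to_string(bv->operands.size()) + "i32";
  NodePtr vec = make("undef", intVecTy);
  for (std::size_t i = 0; i < bv->operands.size(); ++i) {
    NodePtr elt = bv->operands[i];
    if (elt->op == "bitcast") {
      assert(elt->operands.size() == 1 && "bitcast takes one operand");
      elt = elt->operands[0];  // drop the scalar bitcast, use the integer value
    } else {
      elt = make("bitcast", "i32", {elt});  // assumption: cast other operands
    }
    vec = make("insert_vector_elt", intVecTy, {vec, elt},
               static_cast<int>(i));
  }
  // A single whole-vector bitcast restores the original type.
  return make("bitcast", bv->type, {vec});
}
```

In this toy model, a v2f32 build_vector whose two operands are bitcasts of i32 loads becomes a chain of two insert_vector_elt nodes fed directly by the loads, followed by one bitcast of the whole vector. The per-element bitcasts, which are the ones that would cost a move between register files, no longer appear.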
>>> 
>>> Thanks for your review.
>>> 
>>> Cheers,
>>> 
>>> -Quentin
>>> <ARMISelLowering.patch> <motivating_example.ll>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
