[llvm-commits] [llvm] r77557 - /llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp

Thu Jul 30 07:56:32 PDT 2009

On Jul 29, 2009, at 5:56 PM, Eli Friedman wrote:

> On Wed, Jul 29, 2009 at 5:41 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>> On Wed, Jul 29, 2009 at 5:31 PM, Bob Wilson <bob.wilson at apple.com> wrote:
>>> Author: bwilson
>>> Date: Wed Jul 29 19:31:25 2009
>>> New Revision: 77557
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=77557&view=rev
>>> Log:
>>> Lower a 128-bit BUILD_VECTOR with 2 elements to a pair of  
>>> INSERT_VECTOR_ELTs.
>>
>> If this is somehow more efficient than the default lowering, perhaps
>> you should tweak the ARM lowering for VECTOR_SHUFFLE rather than
>> special-casing BUILD_VECTOR.
>
> Hmm, I didn't realize the default lowering explodes at the moment with
> "LLVM ERROR: Cannot yet select: 0xa226180: v2f64 = vector_shuffle".  I
> think the point still stands, though.

I definitely need to do more work on lowering shuffles, but I don't  
think I understand how to do what you are suggesting.

The default lowering is:

scalar_to_vector -> v2f64
scalar_to_vector -> v2f64
vector_shuffle 0,1

This requires two quad registers to hold the scalar_to_vector results.
I could make the shuffle for this case a simple register-to-register
copy of the subregs, and that would be a good thing to do if it doesn't
already work.  But short of pattern-matching the entire sequence above,
I don't see how to do as well as this BUILD_VECTOR lowering does.  The
f64 elements are held in double registers, and the insert_vector_elt
operations are just subreg accesses, so if the allocator puts the f64
values in the right place, you save a quad register compared to the
default.
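
To make the register argument concrete, here is a rough toy model in plain
C++.  It is not LLVM code: Quad, Lowered, lowerViaShuffle and
lowerViaInserts are made-up names, the register counts are written in by
hand to mirror the reasoning above rather than measured, and 0.0 stands in
for lanes that would be undefined in the DAG.

```cpp
#include <array>
#include <cassert>

// Toy model only: a 128-bit quad register viewed as two 64-bit double
// subregisters. None of these names are LLVM APIs.
using Quad = std::array<double, 2>;

struct Lowered {
  Quad value;        // the resulting v2f64
  int quadRegsLive;  // quad registers this model needs at the same time
};

// Default expansion: two scalar_to_vector nodes feeding a vector_shuffle.
Lowered lowerViaShuffle(double a, double b) {
  Quad qa = {a, 0.0};  // scalar_to_vector; lane 1 is undefined in the DAG
  Quad qb = {b, 0.0};  // the second scalar_to_vector needs its own quad
  Quad result = {qa[0], qb[0]};  // shuffle: take lane 0 of each input
  return {result, 2};  // qa and qb are both live going into the shuffle
}

// Lowering from the commit: a pair of insert_vector_elt operations.
Lowered lowerViaInserts(double a, double b) {
  Quad q = {0.0, 0.0};  // stands in for an undefined starting vector
  q[0] = a;  // insert_vector_elt at index 0: a write to the low subreg
  q[1] = b;  // insert_vector_elt at index 1: a write to the high subreg
  return {q, 1};  // only one quad register is ever involved
}
```

Both functions produce the same two-lane value; the only difference in
this model is how many quad registers are live at once, which is the
saving described above.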

By the way, the X86 lowering for BUILD_VECTOR does the same thing for  
some vector types.  See LowerBuildVectorv16i8 and  
LowerBuildVectorv8i16 in X86ISelLowering.cpp.