[llvm-commits] [llvm] r77557 - /llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp

Thu Jul 30 13:23:19 PDT 2009

On Jul 30, 2009, at 12:33 PM, Eli Friedman wrote:

> On Thu, Jul 30, 2009 at 7:56 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>> The default lowering is:
>>
>> scalar_to_vector -> v2f64
>> scalar_to_vector -> v2f64
>> vector_shuffle 0,1
>>
>> This requires 2 quad registers to hold the scalar_to_vector results.
>> I could get the shuffle for this case to be a simple
>> register-to-register copy of the subregs, and that would be a good
>> thing to do, if it doesn't already work.  But, aside from
>> pattern-matching the entire sequence above, I don't see how to do as
>> well as what this BUILD_VECTOR lowering does.  The f64 elements are
>> stored in double registers.  The insert_vector_elt operations are
>> just subreg accesses, and if the allocator puts the f64 values in the
>> right place, you save a quad register, compared to the default.
>
> Consider how you'd actually want to lower such a vector_shuffle in the
> general case: it would be an extract from one vector followed by an
> insert into the other, which compiles down to exactly the same thing,
> as far as I can tell.  It's not really a significant issue, though,
> just something I thought of when reading the commit.

OK.  I agree that it would be good to handle that case when lowering
the shuffle.  I will make a note to handle that.  The custom
BUILD_VECTOR lowering is still useful because it saves on registers --
basically it avoids the scalar_to_vector operations.
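
To make the register argument concrete, here is a rough sketch of the two
lowerings.  This is a toy model and not LLVM code: QuadReg, Lowered,
insertElt, lowerViaShuffle and lowerViaInsert are all hypothetical names,
and the register counts are simply what the reasoning above assumes (two
quads live for the scalar_to_vector results, one quad when the f64 values
go straight into the double subregs).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical toy model, not LLVM code: one 128-bit quad register viewed
// as two 64-bit double subregisters.
struct QuadReg {
  std::array<double, 2> d{};
};

struct Lowered {
  QuadReg value;     // the resulting v2f64
  int quadRegsLive;  // quad registers assumed to be needed at the same time
};

// Models insert_vector_elt as a plain write to one double subregister.
void insertElt(QuadReg &q, std::size_t lane, double x) {
  assert(lane < 2 && "v2f64 has two lanes");
  q.d[lane] = x;
}

// Default lowering: two scalar_to_vector nodes, then a shuffle.
Lowered lowerViaShuffle(double a, double b) {
  QuadReg va, vb;       // both scalar_to_vector results are live at once
  insertElt(va, 0, a);  // scalar_to_vector -> v2f64
  insertElt(vb, 0, b);  // scalar_to_vector -> v2f64
  QuadReg result;
  insertElt(result, 0, va.d[0]);  // the shuffle takes lane 0 of each source
  insertElt(result, 1, vb.d[0]);
  return {result, 2};
}

// Custom BUILD_VECTOR lowering: write each f64 into its subreg directly.
Lowered lowerViaInsert(double a, double b) {
  QuadReg result;           // the only quad involved
  insertElt(result, 0, a);  // insert_vector_elt == subreg write
  insertElt(result, 1, b);
  return {result, 1};
}
```

Both functions produce the same v2f64; the only difference in this model
is how many quad registers have to be live to get there.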

>
>> By the way, the X86 lowering for BUILD_VECTOR does the same thing for
>> some vector types.  See LowerBuildVectorv16i8 and
>> LowerBuildVectorv8i16 in X86ISelLowering.cpp.
>
> The reason it does that is that the alternative is going through
> memory; legalization only does the vector_shuffle thing when there are
> only two source values.

Yeah, I remembered that later.  Thanks for your comments.