[llvm-commits] [llvm] r77557 - /llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp

Thu Jul 30 12:33:07 PDT 2009

On Thu, Jul 30, 2009 at 7:56 AM, Bob Wilson <bob.wilson at apple.com> wrote:
> The default lowering is:
>
> scalar_to_vector -> v2f64
> scalar_to_vector -> v2f64
> vector_shuffle 0,1
>
> This requires 2 quad registers to hold the scalar_to_vector results.
> I could get the shuffle for this case to be a simple register-to-
> register copy of the subregs, and that would be a good thing to do, if
> it doesn't already work.  But, aside from pattern-matching the entire
> sequence above, I don't see how to do as well as what this
> BUILD_VECTOR lowering does.  The f64 elements are stored in double
> registers.  The insert_vector_elt operations are just subreg accesses,
> and if the allocator puts the f64 values in the right place, you save
> a quad register, compared to the default.

Consider how you'd actually want to lower such a vector_shuffle in the
general case: it would be an extract from one vector followed by an
insert into the other, which compiles down to exactly the same thing,
as far as I can tell.  It's not really a significant issue, though,
just something I thought of when reading the commit.
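
To make the comparison concrete, here is a rough lane-level model in
plain C++ rather than SelectionDAG code. It only tracks which value ends
up in which lane and says nothing about register allocation. All of the
names (V2F64, scalarToVector, buildViaShuffle, and so on) are made up
for illustration. The mask {0, 2} is my assumption about what the
"vector_shuffle 0,1" shorthand above means: lane 0 of each input.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a v2f64 value: lane contents only.
using V2F64 = std::array<double, 2>;

// scalar_to_vector: lane 0 holds the scalar; lane 1 is undefined in the
// real node, modelled here as 0.0 so the example is deterministic.
V2F64 scalarToVector(double s) { return {s, 0.0}; }

// vector_shuffle of two v2f64 inputs: mask values 0-1 pick from lhs,
// 2-3 pick from rhs.
V2F64 vectorShuffle(const V2F64 &lhs, const V2F64 &rhs,
                    std::array<int, 2> mask) {
  V2F64 out{};
  for (std::size_t i = 0; i < 2; ++i) {
    assert(mask[i] >= 0 && mask[i] < 4);
    std::size_t idx = static_cast<std::size_t>(mask[i]);
    out[i] = idx < 2 ? lhs[idx] : rhs[idx - 2];
  }
  return out;
}

double extractElt(const V2F64 &v, std::size_t idx) { return v[idx]; }

V2F64 insertElt(V2F64 v, double s, std::size_t idx) {
  v[idx] = s;
  return v;
}

// Default expansion: two scalar_to_vector nodes plus a shuffle.
V2F64 buildViaShuffle(double a, double b) {
  return vectorShuffle(scalarToVector(a), scalarToVector(b), {0, 2});
}

// The same shuffle lowered as "extract from one vector, insert into the
// other".
V2F64 buildViaExtractInsert(double a, double b) {
  V2F64 lhs = scalarToVector(a);
  V2F64 rhs = scalarToVector(b);
  return insertElt(lhs, extractElt(rhs, 0), 1);
}

// The BUILD_VECTOR lowering under discussion: a chain of
// insert_vector_elt operations starting from an undefined vector.
V2F64 buildViaInserts(double a, double b) {
  V2F64 v{}; // models undef
  v = insertElt(v, a, 0);
  return insertElt(v, b, 1);
}
```

All three functions produce the same lanes, which is all I mean by
"compiles down to exactly the same thing"; whether the allocator really
saves the extra quad register is outside what this model can show.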

> By the way, the X86 lowering for BUILD_VECTOR does the same thing for
> some vector types.  See LowerBuildVectorv16i8 and
> LowerBuildVectorv8i16 in X86ISelLowering.cpp.

X86 does that because the alternative is going through memory: the
default legalization only produces the scalar_to_vector/vector_shuffle
sequence when there are only two distinct source values.
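
As a sketch of that rule (a simplified model, not the actual legalizer
code), the choice looks something like the following. The names
Expansion and chooseExpansion are hypothetical, operands are represented
by arbitrary integer ids, and treating a single distinct value the same
as two is an assumption on my part.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical labels for the two default BUILD_VECTOR expansions
// discussed in this thread.
enum class Expansion { Shuffle, ThroughMemory };

// Pick an expansion from the ids of the BUILD_VECTOR operands: the
// shuffle form is only used when no more than two distinct values
// appear; anything else falls back to going through memory.
Expansion chooseExpansion(const std::vector<int> &operandIds) {
  assert(!operandIds.empty());
  std::set<int> distinct(operandIds.begin(), operandIds.end());
  return distinct.size() <= 2 ? Expansion::Shuffle
                              : Expansion::ThroughMemory;
}
```

A v2f64 BUILD_VECTOR always has at most two distinct operands, so it
takes the shuffle path; a v16i8 or v8i16 with many distinct operands
does not, which is why X86 has custom lowering for those.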

-Eli