[llvm-commits] [llvm] r77557 - /llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
Bob Wilson
bob.wilson at apple.com
Thu Jul 30 07:56:32 PDT 2009
On Jul 29, 2009, at 5:56 PM, Eli Friedman wrote:
> On Wed, Jul 29, 2009 at 5:41 PM, Eli
> Friedman<eli.friedman at gmail.com> wrote:
>> On Wed, Jul 29, 2009 at 5:31 PM, Bob Wilson<bob.wilson at apple.com>
>> wrote:
>>> Author: bwilson
>>> Date: Wed Jul 29 19:31:25 2009
>>> New Revision: 77557
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=77557&view=rev
>>> Log:
>>> Lower a 128-bit BUILD_VECTOR with 2 elements to a pair of
>>> INSERT_VECTOR_ELTs.
>>
>> If this is somehow more efficient than the default lowering, perhaps
>> you should tweak the ARM lowering for VECTOR_SHUFFLE rather than
>> special-casing BUILD_VECTOR.
>
> Hmm, I didn't realize the default lowering explodes at the moment with
> "LLVM ERROR: Cannot yet select: 0xa226180: v2f64 = vector_shuffle". I
> think the point still stands, though.
I definitely need to do more work on lowering shuffles, but I don't
think I understand how to do what you are suggesting.

The default lowering is:

  scalar_to_vector -> v2f64
  scalar_to_vector -> v2f64
  vector_shuffle 0,1

This requires 2 quad registers to hold the scalar_to_vector results.
I could get the shuffle for this case to be a simple register-to-
register copy of the subregs, and that would be a good thing to do, if
it doesn't already work. But, aside from pattern-matching the entire
sequence above, I don't see how to do as well as what this
BUILD_VECTOR lowering does. The f64 elements are stored in double
registers. The insert_vector_elt operations are just subreg accesses,
and if the allocator puts the f64 values in the right place, you save
a quad register, compared to the default.
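
To make the comparison concrete, here is a rough toy model in plain C++. It is not LLVM code: QuadReg, insertLane, scalarToVector, buildViaInserts and buildViaShuffle are all hypothetical names made up for illustration. It assumes only that a quad register aliases two double registers, and it does not model the real shuffle mask encoding or the register allocator.

```cpp
#include <array>
#include <cassert>

// Toy model (hypothetical, not LLVM code): a quad register viewed as
// its two double-register halves.
struct QuadReg {
    std::array<double, 2> d{}; // d[0] = low D subregister, d[1] = high D subregister
};

// Stands in for insert_vector_elt on v2f64: writing a lane touches
// only one D subregister of the quad register.
QuadReg insertLane(QuadReg q, double value, unsigned lane) {
    q.d.at(lane) = value;
    return q;
}

// Stands in for scalar_to_vector: the scalar lands in lane 0 and the
// other lane is unspecified (left as zero here).
QuadReg scalarToVector(double value) {
    QuadReg q;
    q.d[0] = value;
    return q;
}

// Sketch of the BUILD_VECTOR lowering: two lane inserts into a single
// quad register, so only one QuadReg is ever live.
QuadReg buildViaInserts(double a, double b) {
    QuadReg q; // stands in for an undefined v2f64
    q = insertLane(q, a, 0);
    q = insertLane(q, b, 1);
    return q;
}

// Sketch of the default lowering: two scalar_to_vector results must be
// live at the same time before they are combined. The combine step
// stands in for the vector_shuffle; the mask encoding is not modeled.
QuadReg buildViaShuffle(double a, double b) {
    QuadReg first = scalarToVector(a);
    QuadReg second = scalarToVector(b);
    QuadReg out;
    out.d[0] = first.d[0];
    out.d[1] = second.d[0];
    return out;
}
```

Both functions produce the same two lanes. The difference is that buildViaShuffle keeps two quad-sized temporaries alive at once, while buildViaInserts only ever writes the halves of one.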
By the way, the X86 lowering for BUILD_VECTOR does the same thing for
some vector types. See LowerBuildVectorv16i8 and
LowerBuildVectorv8i16 in X86ISelLowering.cpp.