[PATCH] Teach the DAGCombiner how to fold concat_vector nodes when the input is two BUILD_VECTOR nodes.
Bill Wendling
isanbard at gmail.com
Thu Feb 6 00:53:29 PST 2014
This LGTM.
-bw
On Feb 4, 2014, at 11:48 AM, Robert Lougher <rob.lougher at gmail.com> wrote:
> ping.
>
> On 28 January 2014 17:40, Robert Lougher <rob.lougher at gmail.com> wrote:
>> Hi,
>>
>> This patch teaches the DAGCombiner how to fold a concat_vectors node
>> when its inputs are two BUILD_VECTOR nodes, e.g.:
>>
>> (concat_vectors (BUILD_VECTOR a1, a2, a3, a4), (BUILD_VECTOR b1, b2, b3, b4))
>> ->
>> (BUILD_VECTOR a1, a2, a3, a4, b1, b2, b3, b4)
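The fold described above can be sketched on a toy node type. This is only an illustration under stated assumptions: `Node`, `Kind`, `scalar`, `buildVector`, `concatVectors` and `foldConcatOfBuildVectors` are hypothetical names, not LLVM's SelectionDAG API, and the sketch ignores the type and legality checks a real DAG combine would need. It also accepts any number of BUILD_VECTOR operands, whereas the patch description only mentions two.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a SelectionDAG node (hypothetical, not LLVM's SDNode).
enum class Kind { Scalar, BuildVector, ConcatVectors };

struct Node {
    Kind kind;
    std::string name;       // meaningful only for Scalar nodes
    std::vector<Node> ops;  // operands of BuildVector / ConcatVectors
};

Node scalar(const std::string &n) { return Node{Kind::Scalar, n, {}}; }

Node buildVector(std::vector<Node> elts) {
    return Node{Kind::BuildVector, "", std::move(elts)};
}

Node concatVectors(std::vector<Node> parts) {
    return Node{Kind::ConcatVectors, "", std::move(parts)};
}

// If n is a ConcatVectors whose operands are all BuildVector nodes,
// rewrite n in place as one wide BuildVector and return true.
// Otherwise leave n untouched and return false.
bool foldConcatOfBuildVectors(Node &n) {
    if (n.kind != Kind::ConcatVectors || n.ops.empty())
        return false;

    std::vector<Node> elts;
    for (const Node &op : n.ops) {
        if (op.kind != Kind::BuildVector)
            return false;  // bail out before changing anything
        // Append this operand's elements, preserving their order.
        elts.insert(elts.end(), op.ops.begin(), op.ops.end());
    }

    n = buildVector(std::move(elts));
    return true;
}
```

Collecting the elements into a separate vector before overwriting `n` keeps the rewrite all-or-nothing: a concat with any non-BUILD_VECTOR operand is left exactly as it was.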
>>
>> This can be seen with the following IR:
>>
>> define <8 x float> @memory4(float* %p) {
>> %1 = load float* %p, align 4
>> %2 = insertelement <4 x float> undef, float %1, i32 0
>> %3 = insertelement <4 x float> %2, float %1, i32 1
>> %4 = insertelement <4 x float> %3, float %1, i32 2
>> %5 = insertelement <4 x float> %4, float %1, i32 3
>> %6 = shufflevector <4 x float> %5, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
>> ret <8 x float> %6
>> }
>>
>> The shufflevector is turned into a concat_vectors node, and the
>> insertelements into a BUILD_VECTOR.
>>
>> On AVX, without the patch this generates two instructions:
>>
>> vbroadcastss (%rdi), %xmm0
>> vinsertf128 $1, %xmm0, %ymm0, %ymm0
>>
>> With the patch:
>>
>> vbroadcastss (%rdi), %ymm0
>>
>> Without the patch, LowerVectorBroadcast (X86ISelLowering.cpp) does not
>> recognize the whole sequence as a 256-bit vbroadcast because of the
>> concat_vectors node.
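One plausible reading of why the folded form helps: once the concat_vectors is gone, the root is a single 8-element BUILD_VECTOR whose operands are all the same loaded value, which is the kind of splat pattern a broadcast lowering can match at full width. The check below is a minimal sketch of that idea only. `isSplat` is a hypothetical helper over element names, not the logic in LowerVectorBroadcast.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: true if the vector is non-empty and every element
// names the same scalar, i.e. the build vector is a splat of one value.
bool isSplat(const std::vector<std::string> &elts) {
    return !elts.empty() &&
           std::all_of(elts.begin(), elts.end(),
                       [&](const std::string &e) { return e == elts.front(); });
}
```

In the IR example, all eight lanes of the folded BUILD_VECTOR would carry `%1`, so a check of this shape succeeds on the 256-bit vector rather than only on each 128-bit half.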
>>
>> I have added tests to avx-vbroadcast.ll and avx2-vbroadcast.ll that
>> check both single and double vbroadcasts. If the patch looks OK,
>> please commit it for me.
>>
>> Thanks,
>> Rob.
>>
>> --
>> Robert Lougher
>> SN Systems - Sony Computer Entertainment Group
> <patch.diff>