[PATCH] Teach the DAGCombiner how to fold concat_vector nodes when the input is two BUILD_VECTOR nodes.

Thu Feb 6 00:53:29 PST 2014

This LGTM.

-bw

On Feb 4, 2014, at 11:48 AM, Robert Lougher <rob.lougher at gmail.com> wrote:

> ping.
> 
> On 28 January 2014 17:40, Robert Lougher <rob.lougher at gmail.com> wrote:
>> Hi,
>> 
>> This patch teaches the DAGCombiner how to fold concat_vectors nodes
>> when both inputs are BUILD_VECTOR nodes, e.g.:
>> 
>> (concat_vectors (BUILD_VECTOR a1, a2, a3, a4), (BUILD_VECTOR b1, b2, b3, b4))
>> ->
>> (BUILD_VECTOR a1, a2, a3, a4, b1, b2, b3, b4)
>> 
>> This can be seen with the following IR:
>> 
>> define <8 x float> @memory4(float* %p) {
>>  %1 = load float* %p, align 4
>>  %2 = insertelement <4 x float> undef, float %1, i32 0
>>  %3 = insertelement <4 x float> %2, float %1, i32 1
>>  %4 = insertelement <4 x float> %3, float %1, i32 2
>>  %5 = insertelement <4 x float> %4, float %1, i32 3
>>  %6 = shufflevector <4 x float> %5, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
>>  ret <8 x float> %6
>> }
>> 
>> The shufflevector is turned into a concat_vectors node, and the
>> insertelements into a BUILD_VECTOR.
>> 
>> On AVX, without the patch this generates two instructions:
>> 
>> vbroadcastss (%rdi), %xmm0
>> vinsertf128 $1, %xmm0, %ymm0, %ymm0
>> 
>> With the patch:
>> 
>> vbroadcastss (%rdi), %ymm0
>> 
>> This is because the whole sequence was not recognized as a 256-bit
>> vbroadcast by LowerVectorBroadcast (X86ISelLowering.cpp) due to the
>> concat_vectors.
>> 
>> I have added tests to avx-vbroadcast.ll and avx2-vbroadcast.ll that
>> check both single and double vbroadcasts. If the patch looks OK,
>> please commit it for me.
>> 
>> Thanks,
>> Rob.
>> 
>> --
>> Robert Lougher
>> SN Systems - Sony Computer Entertainment Group
> <patch.diff>
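
The fold described in the quoted message can be sketched on a toy data
structure. This is only an illustration of the transformation, not the code
from the attached patch: Node, Opcode, and foldConcatOfBuildVectors are
hypothetical names invented here, and none of them are LLVM APIs. Scalar
operands are modelled as plain strings.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for a SelectionDAG node (assumption: this is NOT LLVM's SDNode).
enum class Opcode { BuildVector, ConcatVectors, Other };

struct Node {
  Opcode Op;
  std::vector<std::string> Elts;  // scalar operands, used by BuildVector
  std::vector<const Node *> Ops;  // vector operands, used by ConcatVectors
};

// Hypothetical helper mirroring the fold in the message:
//   (concat_vectors (BUILD_VECTOR a...), (BUILD_VECTOR b...))
//     -> (BUILD_VECTOR a..., b...)
// Returns std::nullopt when N is not a concat of exactly two BUILD_VECTORs.
std::optional<Node> foldConcatOfBuildVectors(const Node &N) {
  if (N.Op != Opcode::ConcatVectors || N.Ops.size() != 2)
    return std::nullopt;

  const Node *LHS = N.Ops[0];
  const Node *RHS = N.Ops[1];
  if (LHS->Op != Opcode::BuildVector || RHS->Op != Opcode::BuildVector)
    return std::nullopt;

  // Start from the left operand's scalars, then append the right operand's.
  Node Folded{Opcode::BuildVector, LHS->Elts, {}};
  Folded.Elts.insert(Folded.Elts.end(), RHS->Elts.begin(), RHS->Elts.end());
  return Folded;
}
```

Applied to two four-element BUILD_VECTOR nodes, the helper yields a single
eight-element BUILD_VECTOR; any other input shape is left alone. The sketch
ignores element types, undef operands, and legality checks that a real
combine would have to consider.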