[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
Quentin Colombet
qcolombet at apple.com
Thu Oct 2 13:41:47 PDT 2014
Hi Chandler,
I’ve filed a few PRs regarding the latest regressions I found.
Here are the links if you want the details.
http://llvm.org/bugs/show_bug.cgi?id=21137
http://llvm.org/bugs/show_bug.cgi?id=21138
http://llvm.org/bugs/show_bug.cgi?id=21139
http://llvm.org/bugs/show_bug.cgi?id=21140
I've already reported the first one a while back.
This is just FYI, I do not expect you to handle all the work :).
Cheers,
-Quentin
> On Oct 1, 2014, at 11:24 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>
> Hi Chandler,
>
> I'm not sure how important this is, but I found a minor regression
> with the new shuffle lowering.
> Here is a reproducible test case:
>
> ;;
> define <4 x i32> @test(<4 x i32> %V) {
>   %1 = shufflevector <4 x i32> %V, <4 x i32> <i32 0, i32 0, i32 0, i32 0>,
>                      <4 x i32> <i32 0, i32 1, i32 4, i32 5>
>   ret <4 x i32> %1
> }
> ;;
>
> $ llc -mcpu=corei7-avx -o -
>
> vmovq %xmm0, %xmm0
> retq
>
> $ llc -mcpu=corei7-avx -x86-experimental-vector-shuffle-lowering -o -
> vpxor %xmm1, %xmm1, %xmm1
> vpunpcklqdq %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0],xmm1[0]
> retq
>
> If we know that the upper 64 bits of the result are zero, we can try
> to emit a simpler vmovq instead of a vpxor+vpunpcklqdq pair.
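
To make the equivalence above concrete, here is a minimal Python sketch that models the shuffle and the two instruction sequences on lists of four 32-bit lanes. The helper names (`shufflevector`, `vmovq`, `vpunpcklqdq`) are illustrative models written for this note, not LLVM APIs, and the lane values are arbitrary.

```python
def shufflevector(a, b, mask):
    """Model LLVM's shufflevector: indices below len(a) pick from a, the rest from b."""
    n = len(a)
    return [a[i] if i < n else b[i - n] for i in mask]


def vmovq(src):
    """Model the register form of vmovq on 4 x i32 lanes:
    keep the low 64 bits (lanes 0 and 1) and zero the upper 64 bits."""
    return src[:2] + [0, 0]


def vpunpcklqdq(a, b):
    """Model vpunpcklqdq on 4 x i32 lanes: result = a[low qword], b[low qword]."""
    return a[:2] + b[:2]


v = [11, 22, 33, 44]   # arbitrary sample lanes standing in for %V
zero = [0, 0, 0, 0]    # what vpxor %xmm1, %xmm1, %xmm1 produces

shuffled = shufflevector(v, zero, [0, 1, 4, 5])
via_unpack = vpunpcklqdq(v, zero)
via_movq = vmovq(v)
```

Under this model all three results have the same lanes, which is why the single vmovq can stand in for the two-instruction sequence when the second operand is known to be zero.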
>
> As I said, this is a minor issue.
> I just wanted to post this finding so that we don't forget about it.
>
> Cheers,
> Andrea
>
> On Wed, Oct 1, 2014 at 9:23 AM, Andrea Di Biagio
> <andrea.dibiagio at gmail.com> wrote:
>> On Wed, Oct 1, 2014 at 1:52 AM, Chandler Carruth <chandlerc at google.com> wrote:
>>> This has been added in r218724.
>> Thanks Chandler!
>>
>>> Based on the feedback here and from Quentin, I'm going to email the list
>>> shortly with a heads-up, and then flip the default over to the new shuffle
>>> lowering.
>>
>> Nice.
>> Again, thanks for working on this!
>>
>> -Andrea
>>
>>>
>>> On Mon, Sep 29, 2014 at 10:48 PM, Chandler Carruth <chandlerc at google.com>
>>> wrote:
>>>>
>>>> Wow. Somehow, I forgot about vbroadcast and vpbroadcast. =[ Sorry about
>>>> that. I'll fix those.
>>>>
>>>> On Fri, Sep 26, 2014 at 3:39 AM, Andrea Di Biagio
>>>> <andrea.dibiagio at gmail.com> wrote:
>>>>>
>>>>> Hi Chandler,
>>>>>
>>>>> Here is another test.
>>>>>
>>>>> When looking at the AVX codegen, I noticed that, when using the new
>>>>> shuffle lowering, we no longer emit a single vbroadcastss in the case
>>>>> where the shuffle performs a splat of a scalar float loaded from
>>>>> memory.
>>>>>
>>>>> For example:
>>>>> (with -mcpu=corei7-avx -x86-experimental-vector-shuffle-lowering)
>>>>> vmovss (%rdi), %xmm0
>>>>> vpermilps $0, %xmm0, %xmm0 # xmm0 = xmm0[0,0,0,0]
>>>>>
>>>>> Instead of:
>>>>> (with -mcpu=corei7-avx)
>>>>> vbroadcastss (%rdi), %xmm0
>>>>>
>>>>> I have attached a small reproducible for it.
>>>>>
>>>>> Basically, the old shuffle lowering logic calls function
>>>>> 'NormalizeVectorShuffle' to handle shuffles that perform a splat
>>>>> operation.
>>>>> On AVX, function 'NormalizeVectorShuffle' tries to lower a splat where
>>>>> the splat value comes from a load into an X86ISD::VBROADCAST DAG node.
>>>>> Later on, during instruction selection, we emit a single avx_broadcast
>>>>> for the load+splat sequence (basically, we end up folding the load in
>>>>> the operand of the vbroadcastss).
>>>>>
>>>>> What happens is that the new shuffle lowering doesn't emit a
>>>>> vbroadcast node in this case and eventually we end up selecting the
>>>>> sequence of vmovss+vpermilps.
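
As a rough illustration of the difference described above, the following Python sketch models both lowerings of "splat a float loaded from memory" and records which instructions each one would use. It is a toy model under stated assumptions: `old_lowering` and `new_lowering` are hypothetical names invented here, memory is a plain dict, and none of this reflects the actual structure of the LLVM lowering code.

```python
def old_lowering(memory, addr):
    """Toy model of the old path: the load is folded into one broadcast,
    so a single instruction fills all four lanes from memory."""
    value = memory[addr]
    return ["vbroadcastss"], [value] * 4


def new_lowering(memory, addr):
    """Toy model of the new path as reported: a scalar load into lane 0
    (the memory form of vmovss zeroes the other lanes), then a permute
    with immediate 0, which selects lane 0 for every result lane."""
    reg = [memory[addr], 0.0, 0.0, 0.0]
    permute_mask = [0, 0, 0, 0]
    return ["vmovss", "vpermilps"], [reg[i] for i in permute_mask]


memory = {0x1000: 1.5}  # hypothetical address holding the scalar float

old_insts, old_result = old_lowering(memory, 0x1000)
new_insts, new_result = new_lowering(memory, 0x1000)
```

Both paths produce the same vector in this model; the regression is only that the new path needs two instructions where the old one needed one.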
>>>>>
>>>>> I hope this helps.
>>>>> Andrea
>>>>>
>>>>> On Tue, Sep 23, 2014 at 10:53 PM, Chandler Carruth <chandlerc at google.com>
>>>>> wrote:
>>>>>>
>>>>>> On Tue, Sep 23, 2014 at 2:35 PM, Simon Pilgrim <llvm-dev at redking.me.uk>
>>>>>> wrote:
>>>>>>>
>>>>>>> If you don’t want to spend time on this, I’d be happy to create a
>>>>>>> candidate patch for review? I’ve been unclear if you were taking
>>>>>>> patches for your shuffle work prior to it becoming the default.
>>>>>>
>>>>>>
>>>>>> While I'm happy to work on it, I'm even more happy to have patches. =D
>>>>>>
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>
>>>>
>>>>
>>>
> <test.ll>