[llvm] r248250 - [X86][SSE] Match zero/any extension shuffles that don't start from the first element

Jeroen Ketema via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 24 08:02:46 PDT 2015


Hi Simon,

I also see failures when I enable AVX2, but these may not be related to
your changes. I hadn't run with AVX2 enabled for a long time. It might
be worth looking into these failures once the issue below is fixed.

Jeroen

On 24/09/15 09:44, Simon Pilgrim wrote:
> Looking at this now - was this targeting AVX2 or something earlier?
>
>> On 23 Sep 2015, at 17:29, Jeroen Ketema <jeroen at codeplay.com> wrote:
>>
>>
>> Hi Simon,
>>
>> This change breaks a number of shuffles for me. For example,
>>
>> shufflevector <16 x i16> %0, <16 x i16> undef, <16 x i32> <i32 10, i32 3, i32 13, i32 3, i32 8, i32 14, i32 13, i32 14, i32 5, i32 7, i32 12, i32 2, i32 13, i32 6, i32 3, i32 6>
>>
>> Now shuffles to
>>
>> 10, 3, 13, 3, 8, 14, 13, 14, 5, 7, 10, 2, 11, 6, 3, 6
>>
>> so elements 10 and 12 (counting from 0) are wrong.
>>
>> Best,
>>
>> Jeroen
>>
>> On 22/09/15 09:16, llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) wrote:
>>> Author: rksimon
>>> Date: Tue Sep 22 03:16:08 2015
>>> New Revision: 248250
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=248250&view=rev
>>> Log:
>>> [X86][SSE] Match zero/any extension shuffles that don't start from the first element
>>>
>>> This patch generalizes the lowering of shuffles as zero extensions to allow extensions that don't start from the first element. It now recognises extensions starting anywhere in the lower 128 bits or at the start of any higher 128-bit lane.
>>>
>>> The motivation was to reduce the number of high cost pshufb calls, but it also improves the SSE2 case.
>>>
>>> Differential Revision: http://reviews.llvm.org/D12561
>>>
>>> Modified:
>>>      llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
>>>      llvm/trunk/test/CodeGen/X86/machine-cp.ll
>>>      llvm/trunk/test/CodeGen/X86/vec_cast2.ll
>>>      llvm/trunk/test/CodeGen/X86/vec_int_to_fp.ll
>>>      llvm/trunk/test/CodeGen/X86/vector-sext.ll
>>>      llvm/trunk/test/CodeGen/X86/vector-shuffle-256-v16.ll
>>>      llvm/trunk/test/CodeGen/X86/vector-shuffle-256-v32.ll
>>>      llvm/trunk/test/CodeGen/X86/vector-shuffle-sse4a.ll
>>>      llvm/trunk/test/CodeGen/X86/vector-zext.ll
>>>
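
The mismatch reported in the quoted message can be checked mechanically. The following is a minimal Python sketch, not part of the patch or of LLVM: it emulates a single-input shufflevector as result[i] = source[mask[i]] and compares that against the observed result. The helper names (apply_shuffle, mismatched_lanes) are made up for illustration, and the observed values are copied from the report above.

```python
def apply_shuffle(source, mask):
    """Emulate a single-input shufflevector: result[i] = source[mask[i]]."""
    return [source[m] for m in mask]


def mismatched_lanes(expected, observed):
    """Return the 0-based result positions where the two vectors differ."""
    return [i for i, (e, o) in enumerate(zip(expected, observed)) if e != o]


# Shuffle mask from the reported shufflevector instruction.
MASK = [10, 3, 13, 3, 8, 14, 13, 14, 5, 7, 12, 2, 13, 6, 3, 6]

# Source element indices actually observed after the change (from the report).
OBSERVED = [10, 3, 13, 3, 8, 14, 13, 14, 5, 7, 10, 2, 11, 6, 3, 6]

# Use an identity input so every element's value equals its source index.
source = list(range(16))
expected = apply_shuffle(source, MASK)
bad = mismatched_lanes(expected, OBSERVED)

for i in bad:
    print(f"position {i}: expected source element {expected[i]}, "
          f"observed {OBSERVED[i]}")
```

With an identity input the expected result is the mask itself, so the check reduces to an element-wise comparison of the mask with the observed indices.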


