[llvm] r248250 - [X86][SSE] Match zero/any extension shuffles that don't start from the first element

Jeroen Ketema via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 24 01:59:00 PDT 2015



Something earlier: plain 64-bit x86 with no additional features enabled.

Jeroen

On 24/09/15 09:44, Simon Pilgrim wrote:
> Looking at this now - was this targeting AVX2 or something earlier?
>
>> On 23 Sep 2015, at 17:29, Jeroen Ketema <jeroen at codeplay.com> wrote:
>>
>>
>> Hi Simon,
>>
>> This change breaks a number of shuffles for me. For example,
>>
>> shufflevector <16 x i16> %0, <16 x i16> undef, <16 x i32> <i32 10, i32 3, i32 13, i32 3, i32 8, i32 14, i32 13, i32 14, i32 5, i32 7, i32 12, i32 2, i32 13, i32 6, i32 3, i32 6>
>>
>> Now shuffles to
>>
>> 10, 3, 13, 3, 8, 14, 13, 14, 5, 7, 10, 2, 11, 6, 3, 6
>>
>> so elements 10 and 12 (counting from zero) are wrong.
>>
>> Best,
>>
>> Jeroen
>>
>> On 22/09/15 09:16, llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) wrote:
>>> Author: rksimon
>>> Date: Tue Sep 22 03:16:08 2015
>>> New Revision: 248250
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=248250&view=rev
>>> Log:
>>> [X86][SSE] Match zero/any extension shuffles that don't start from the first element
>>>
>>> This patch generalizes the lowering of shuffles as zero extensions to allow extensions that don't start from the first element. It now recognises extensions starting anywhere in the lower 128 bits or at the start of any higher 128-bit lane.
>>>
>>> The motivation was to reduce the number of high-cost pshufb calls, but it also improves the SSE2 case.
>>>
>>> Differential Revision: http://reviews.llvm.org/D12561
>>>
>>> Modified:
>>>      llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
>>>      llvm/trunk/test/CodeGen/X86/machine-cp.ll
>>>      llvm/trunk/test/CodeGen/X86/vec_cast2.ll
>>>      llvm/trunk/test/CodeGen/X86/vec_int_to_fp.ll
>>>      llvm/trunk/test/CodeGen/X86/vector-sext.ll
>>>      llvm/trunk/test/CodeGen/X86/vector-shuffle-256-v16.ll
>>>      llvm/trunk/test/CodeGen/X86/vector-shuffle-256-v32.ll
>>>      llvm/trunk/test/CodeGen/X86/vector-shuffle-sse4a.ll
>>>      llvm/trunk/test/CodeGen/X86/vector-zext.ll
>>>
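
For anyone wanting to check the reported mismatch mechanically, here is a minimal sketch in Python. It is not part of the patch or of LLVM. The two lists are copied from the report above; the helper names (mismatched_lanes, apply_shuffle) are made up for illustration. It assumes a single-input shuffle, which holds here because the second operand is undef and every mask index is below 16.

```python
# Mask requested by the shufflevector in the report.
expected = [10, 3, 13, 3, 8, 14, 13, 14, 5, 7, 12, 2, 13, 6, 3, 6]
# Element order reported after r248250.
observed = [10, 3, 13, 3, 8, 14, 13, 14, 5, 7, 10, 2, 11, 6, 3, 6]


def mismatched_lanes(want, got):
    """Return (lane, wanted_index, got_index) for every lane that differs.

    Lanes are numbered from zero, matching LLVM's shuffle mask convention.
    """
    if len(want) != len(got):
        raise ValueError("masks must have the same number of lanes")
    return [(lane, w, g) for lane, (w, g) in enumerate(zip(want, got)) if w != g]


def apply_shuffle(src, mask):
    """Apply a single-input shuffle mask to a list of source elements."""
    return [src[i] for i in mask]


if __name__ == "__main__":
    for lane, w, g in mismatched_lanes(expected, observed):
        print(f"lane {lane}: wanted source element {w}, got {g}")
```

Feeding a source vector with distinct values per element (for example list(range(100, 116))) through apply_shuffle with each mask makes the two wrong lanes easy to spot in actual output.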


