[PATCH] X86: fold SSE2/AVX2 logical shift by immediate amount into zero vector when possible
Nadav Rotem
nrotem at apple.com
Wed Jul 10 13:15:33 PDT 2013
The patch LGTM. I have a few comments:
This is a NOP:
+define <8 x i16> @test_srlw_1(<8 x i16> %InVec) {
+entry:
+ %shl = lshr <8 x i16> %InVec, <i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0>
+ ret <8 x i16> %shl
+}
+
+; CHECK: test_srlw_1:
+; CHECK: psrlw $0, %xmm0
+; CHECK-NEXT: ret
+
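To make the NOP point concrete, here is a minimal scalar sketch of what `psrlw` with an immediate count does to each 16-bit lane. It is only a model of the instruction semantics, not code from the patch; the helper name `psrlw_imm` and the sample values are made up for illustration.

```python
def psrlw_imm(lanes, count):
    """Hypothetical scalar model of SSE2 `psrlw $count, %xmm` on 16-bit lanes."""
    if count > 15:
        # Out-of-range counts zero every lane (the case the patch folds).
        return [0] * len(lanes)
    # In-range counts are an ordinary unsigned shift of each 16-bit lane.
    return [(lane & 0xFFFF) >> count for lane in lanes]


vec = [0x0001, 0x8000, 0xFFFF, 0x1234, 0, 7, 0x00F0, 0xABCD]
print(psrlw_imm(vec, 0) == vec)  # a count of 0 leaves the vector unchanged
print(psrlw_imm(vec, 16))        # a count of 16 clears every lane
```

With a count of 0 the input comes back unchanged, so the `psrlw $0` in the CHECK line does no work.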
I think that this is also a missed optimization: the shift amount (32) is greater than 31, the largest in-range amount for i32 elements.
+define <4 x i32> @test_srad_3(<4 x i32> %InVec) {
+entry:
+ %shl = ashr <4 x i32> %InVec, <i32 32, i32 32, i32 32, i32 32>
+ ret <4 x i32> %shl
+}
+
+; CHECK: test_srad_3:
+; CHECK: psrad $32, %xmm0
+; CHECK-NEXT: ret
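
For reference, a small scalar sketch of how `psrad` treats an out-of-range immediate. Unlike the logical shifts, the arithmetic shift fills each lane with its sign bit when the count exceeds 31, so on the hardware a count of 32 behaves like a count of 31. This models the instruction only and says nothing about what the combine should emit; the helper name `psrad_imm` is hypothetical.

```python
def psrad_imm(lanes, count):
    """Hypothetical scalar model of SSE2 `psrad $count, %xmm` on signed 32-bit lanes."""
    # Counts above 31 fill each lane with its sign bit, which is the same
    # result as shifting by 31.
    effective = min(count, 31)
    # Python's >> on ints is an arithmetic (sign-preserving) shift.
    return [lane >> effective for lane in lanes]


vec = [-5, 7, -2**31, 2**31 - 1]
print(psrad_imm(vec, 32))                       # negative lanes -> -1, others -> 0
print(psrad_imm(vec, 32) == psrad_imm(vec, 31))
```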
Nadav
On Jul 10, 2013, at 1:00 PM, Eric Christopher <echristo at gmail.com> wrote:
> Nadav might be someone good to review this.
>
> -eric
>
> On Wed, Jul 10, 2013 at 6:46 AM, <Andrea_DiBiagio at sn.scee.net> wrote:
>> Ping.
>>
>> (See attached file: patch.diff)
>> Andrea DiBiagio/SN R&D/BS/UK/SCEE wrote on 01/07/2013 12:01:44:
>>
>>> Friendly ping.
>>>
>>>> From: Andrea DiBiagio/SN R&D/BS/UK/SCEE
>>>> Hi all,
>>>>
>>>> I'd like to contribute a patch that teaches the x86 backend how to
>>>> combine SSE2/AVX2 packed logical shifts by immediate amount into
>>>> vectors of all 0s.
>>>>
>>>> SSE2/AVX2 logical shifts by an immediate amount that is greater than or
>>>> equal to the vector element size always return a vector of all 0s.
>>>>
>>>> Example:
>>>> pslld $35, %xmm0 # SSE2 packed doubleword logical shift left.
>>>> # %xmm0 is a vector of packed int (MVT::v4i32).
>>>>
>>>> The shift in this example returns a vector of all zeros in %xmm0, so
>>>> it could easily be rewritten as, for example:
>>>> xorps %xmm0, %xmm0
>>>>
>>>> This patch adds a new target combine rule in X86ISelLowering.cpp so
>>>> that vector shifts are simplified into zero vectors when possible.
>>>>
>>>> I added two new tests to verify that vector shifts are correctly folded
>>>> into vectors of all 0s when the immediate amount equals or exceeds
>>>> the vector element size.
>>>>
>>>> Thanks,
>>>> Andrea Di Biagio
>>>> SN Systems - Sony Computer Entertainment
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits