[PATCH] X86: fold SSE2/AVX2 logical shift by immediate amount into zero vector when possible

Nadav Rotem nrotem at apple.com
Wed Jul 10 13:15:33 PDT 2013


The patch LGTM.  I have a few comments:

This is a NOP (a shift by zero leaves the vector unchanged), so the psrlw could be dropped:

+define <8 x i16> @test_srlw_1(<8 x i16> %InVec) {
+entry:
+  %shl = lshr <8 x i16> %InVec, <i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0>
+  ret <8 x i16> %shl
+}
+
+; CHECK: test_srlw_1:
+; CHECK: psrlw   $0, %xmm0
+; CHECK-NEXT: ret
+

I think that this is also a missed optimization.  The shift amount (32) is greater than 31, the largest in-range amount for an i32 element.

+define <4 x i32> @test_srad_3(<4 x i32> %InVec) {
+entry:
+  %shl = ashr <4 x i32> %InVec, <i32 32, i32 32, i32 32, i32 32>
+  ret <4 x i32> %shl
+}
+
+; CHECK: test_srad_3:
+; CHECK: psrad   $32, %xmm0
+; CHECK-NEXT: ret
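
For reference, a minimal scalar sketch in C++ of the per-lane behaviour behind both comments. It is not part of the patch, and the function names are hypothetical. It assumes the documented x86 semantics for 32-bit lanes: a logical shift by a count of 32 or more yields 0, an arithmetic right shift by a count above 31 fills the lane with its sign bit (the same result as a count of 31), and a count of 0 returns the input unchanged.

```cpp
#include <cassert>
#include <cstdint>

// Width of one doubleword lane (assumption: v4i32 elements, as in test_srad_3).
constexpr unsigned kLaneBits = 32;

// Hypothetical model of one lane of PSLLD with an immediate count.
// Counts >= 32 produce an all-zero lane.
constexpr uint32_t model_pslld_lane(uint32_t lane, unsigned imm) {
    return imm >= kLaneBits ? 0u : lane << imm;
}

// Hypothetical model of one lane of PSRLD with an immediate count.
// Counts >= 32 produce an all-zero lane; a count of 0 is the identity.
constexpr uint32_t model_psrld_lane(uint32_t lane, unsigned imm) {
    return imm >= kLaneBits ? 0u : lane >> imm;
}

// Hypothetical model of one lane of PSRAD with an immediate count.
// Counts above 31 behave like a count of 31: the lane is filled with
// copies of its sign bit, so the result is NOT a zero vector in general.
constexpr int32_t model_psrad_lane(int32_t lane, unsigned imm) {
    const unsigned count = imm > kLaneBits - 1 ? kLaneBits - 1 : imm;
    // Avoid right-shifting a negative value directly: shift the
    // complement (which is non-negative) and complement the result.
    return lane < 0 ? ~(~lane >> count) : lane >> count;
}

// Small self-check of the three cases discussed in this thread.
inline void shift_models_self_check() {
    assert(model_pslld_lane(0xFFFFFFFFu, 35) == 0u);           // pslld $35
    assert(model_psrld_lane(0x1234u, 0) == 0x1234u);           // shift by 0
    assert(model_psrad_lane(-8, 32) == model_psrad_lane(-8, 31)); // psrad $32
}
```

Under these assumptions, the shift-by-zero case can simply return its operand, while the psrad $32 case cannot be folded to a zero vector the way the logical shifts can; at the machine level it is only equivalent to psrad $31.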


Nadav

On Jul 10, 2013, at 1:00 PM, Eric Christopher <echristo at gmail.com> wrote:

> Nadav might be someone good to review this.
> 
> -eric
> 
> On Wed, Jul 10, 2013 at 6:46 AM,  <Andrea_DiBiagio at sn.scee.net> wrote:
>> Ping.
>> 
>> (See attached file: patch.diff)
>> Andrea DiBiagio/SN R&D/BS/UK/SCEE wrote on 01/07/2013 12:01:44:
>> 
>>> Friendly ping.
>>> 
>>>> From: Andrea DiBiagio/SN R&D/BS/UK/SCEE
>>>> Hi all,
>>>> 
>>>> I'd like to contribute a patch that teaches the x86 backend how to
>>>> combine SSE2/AVX2 packed logical shifts by immediate amount into
>>>> vectors of all 0s.
>>>> 
>>>> An SSE2/AVX2 logical shift by an immediate amount that is greater than
>>>> or equal to the vector element size always returns a vector of all 0s.
>>>> 
>>>> Example:
>>>> pslld $35, %xmm0   # SSE2 packed doubleword logical shift left.
>>>>                    # %xmm0 is a vector of packed int (MVT::v4i32).
>>>> 
>>>> The shift in this example returns a vector of all zeros in %xmm0, and
>>>> it could therefore easily be rewritten as, for example:
>>>> xorps %xmm0, %xmm0
>>>> 
>>>> This patch adds a new target combine rule in X86ISelLowering.cpp to
>>>> make sure that we simplify vector shifts into zero vectors when possible.
>>>> 
>>>> I added two new tests to verify that vector shifts are correctly folded
>>>> into vectors of all 0s when the immediate amount equals or exceeds
>>>> the vector element size.
>>>> 
>>>> Thanks,
>>>> Andrea Di Biagio
>>>> SN Systems - Sony Computer Entertainment
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
