[PATCH] Teach the DAGCombiner how to fold a OR of two shufflevector into a single shufflevector node
Nadav Rotem
nrotem at apple.com
Thu Mar 6 11:25:50 PST 2014
On Mar 6, 2014, at 10:55 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
> Hi Hal,
> thanks for your review!
>
> Nadav, are you happy with this too?
Yes, it looks good to me. Thanks Andrea.
>
> On Thu, Mar 6, 2014 at 5:50 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> ----- Original Message -----
>>> From: "Andrea Di Biagio" <andrea.dibiagio at gmail.com>
>>> To: "Tim Northover" <t.p.northover at gmail.com>
>>> Cc: "llvm-commits at cs.uiuc.edu for LLVM" <llvm-commits at cs.uiuc.edu>
>>> Sent: Thursday, March 6, 2014 8:25:22 AM
>>> Subject: Re: [PATCH] Teach the DAGCombiner how to fold a OR of two shufflevector into a single shufflevector node
>>>
>>> Hi Nadav, Hal and Tim,
>>>
>>> I modified my original patch adding checks agains
>>> 'TLI.isShuffleMaskLegal' as suggested by Hal.
>>> This new patch now avoids folding an OR of two shuffles if the
>>> resulting shuffle would be illegal for the target.
>>>
>>> The rules for this new combine are:
>>> // fold (or (shuf A, V_0, MaskA), (shuf B, V_0, MaskB)) -> (shuf
>>> A, B, Mask1)
>>> // fold (or (shuf A, V_0, MaskA), (shuf B, V_0, MaskB)) -> (shuf
>>> B, A, Mask2)
>>>
>>> Basically:
>>> we can take advantage of the fact that OR is commutative and compute
>>> two different shuffle masks: Mask1, Mask2.
>>> If the dag node in input can be folded, then we check if Mask1 is a
>>> legal shuffle mask; in case it folds according to the first rule.
>>> Otherwise we checks if Mask2 is legal to see if we can fold according
>>> to the second rule.
>>> If both Masks are illegal then we conservatively don't fold the OR
>>> into a shuffle.
>>
>> LGTM. Thanks!
>>
>> -Hal
>>
>>>
>>> ///
>>> @Nadav
>>> In my experiments on the x86 (Corei7 with and without AVX support) I
>>> noticed that this new patch correctly folds all the interesting cases
>>> except this one:
>>>
>>> define <4 x i32> @foo(<4 x i32> %a, <4 x i32> %b) {
>>> %shuf1 = shufflevector <4 x i32> %a, <4 x i32> zeroinitializer, <4
>>> x
>>> i32><i32 0, i32 4, i32 4, i32 4>
>>> %shuf2 = shufflevector <4 x i32> %b, <4 x i32> zeroinitializer, <4
>>> x
>>> i32><i32 4, i32 0, i32 3, i32 2>
>>> %or = or <4 x i32> %shuf1, %shuf2
>>> ret <4 x i32> %or
>>> }
>>>
>>> The OR in function @foo is not folded because that would result in an
>>> illegal shuffle.
>>> However, that particular case could be still optimized into a
>>> sequence
>>> of two legal shuffle operations (getting rid of the OR entirely).
>>> Ideally we could generate the following assembly:
>>> movlhps %xmm1, %xmm0
>>> shufps $-71 %xmm1, %xmm0
>>>
>>> The transformation would be valid in general if the target can emit
>>> (V)MOVLHPS instructions and if each shuffle in input to the OR has
>>> only one Use.
>>> If ok for you, once this patch is accepted I will send another patch
>>> to improve that particular case on the x86 only (adding a target
>>> specific combine).
>>> ///
>>>
>>> Please let me know if this new patch is reasonable for you.
>>>
>>> Thanks in advance!
>>> Andrea
>>>
>>>
>>> On Wed, Mar 5, 2014 at 9:37 PM, Tim Northover
>>> <t.p.northover at gmail.com> wrote:
>>>>> It is okay to generate new shuffles from a collection of
>>>>> insert/extracts
>>>>> because the new shuffle instructions can't be worse then a
>>>>> collection
>>>>> of inserts and extracts.
>>>>
>>>> Thanks Nadav; it sounds like I should remove the TODO and comment
>>>> on
>>>> the scope of that function then as part of whatever I do.
>>>>
>>>> Tim.
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list