[PATCH][DAGCombiner] Teach how to combine a pair of shuffles into a single shuffle if the resulting mask is legal.

Mon Jul 14 09:54:15 PDT 2014

On Mon, Jul 14, 2014 at 5:19 PM, Nadav Rotem <nrotem at apple.com> wrote:
> Andrea, Michael,
>
> It may be a good idea to run llvm-stress and check if there are new failures in the X86 backend.  At some point the X86 backend was really good about not crashing with random inputs.
>

Good idea.
Just to be on the safe side, today I have run some extra tests and I
haven't observed any new regression related to this change.
I'll try to use llvm-stress to generate several test cases (I plan to
use those tests to validate future patches as well).

Cheers,
Andrea

> Thanks,
> Nadav
>
>> On Jul 13, 2014, at 2:23 PM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>>
>> Hi Michael.
>>
>> Sorry for the problems caused by my commit..
>> I have just committed revision 212915 that I should fix the problems you found.
>> Could you please verify that it works for you as well?
>>
>> As you said, the crash was caused by the lack of a check for legal types.
>> I wrongly assumed that 'isShuffleMaskLegal' was able to accept any
>> vector value types (even illegal types).. But I was wrong.
>> To fix the problem I explicitly added a check to verify if the shuffle
>> value type is legal before calling 'isShuffleMaskLegal'
>>
>> Thanks for the reproducible and for the detailed analysis of the
>> problem found. I have added your reproducible as test case.
>>
>> -Andrea
>>
>>
>> On Sun, Jul 13, 2014 at 9:15 PM, Andrea Di Biagio
>> <andrea.dibiagio at gmail.com> wrote:
>>> Hi Michael,
>>>
>>> Thanks for reporting it.
>>> I am looking at it right now.
>>>
>>>
>>> On Sun, Jul 13, 2014 at 1:32 PM, Kuperstein, Michael M
>>> <michael.m.kuperstein at intel.com> wrote:
>>>> Hi Andrea,
>>>>
>>>> This patch breaks the X86 codegen in some cases, reproducer is attached.
>>>>
>>>> It seems that X86 assumes that isShuffleMaskLegal() is called only after type legalization.
>>>> Calling it with illegal vector types (e.g. v4i8) causes it to crash and burn, because isSHUFPMask() doesn't check that VT.getSizeInBits() > 128, causing a division by 0.
>>>> I could just add a check to isSHUFPMask(), but I see this pattern repeated in the other functions called from isShuffleMaskLegal() as well.
>>>>
>>>> Nadav, is calling isShuffleMaskLegal() from the DAG Combiner before type legalization expected to work?
>>>> If it is, should it just return false when it encounters types it doesn't expect? Right now it returns false for VT.getSizeInBits() == 64, I guess we can change that to <= 64 instead.
>>>>
>>>> Michael
>>>>
>>>> -----Original Message-----
>>>> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Andrea Di Biagio
>>>> Sent: Tuesday, July 08, 2014 18:31
>>>> To: Nadav Rotem
>>>> Cc: llvm-commits at cs.uiuc.edu for LLVM
>>>> Subject: Re: [PATCH][DAGCombiner] Teach how to combine a pair of shuffles into a single shuffle if the resulting mask is legal.
>>>>
>>>> Thanks for the review Nadav!
>>>>
>>>> Committed revision 212539.
>>>>
>>>>
>>>> On Tue, Jul 8, 2014 at 4:09 PM, Nadav Rotem <nrotem at apple.com> wrote:
>>>>>
>>>>> Thanks for working on it Andrea. This change makes sense to me. I am happy to see that it helps so many cases. LGTM.
>>>>>
>>>>>> On Jul 8, 2014, at 07:51, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> this patch teaches the DAGCombiner how to fold shuffles according to rule:
>>>>>>
>>>>>> (shuffle (shuffle (X, Undef, M0)), Undef, M1) -> (shuffle X, Undef,
>>>>>> M2)
>>>>>>
>>>>>> We do this only if M2 is legal according to method 'isShuffleMaskLegal'.
>>>>>>
>>>>>> DAGCombiner already implemented some rules to simplify a sequence of
>>>>>> shuffles into the vector operand according to rule (shuffle (shuffle
>>>>>> (X, Undef, M0), Undef, M1) -> X
>>>>>>
>>>>>> This patch extends that logic to also cover the case where the two
>>>>>> shuffles in input don't perform an identity operation. To avoid
>>>>>> introducing illegal shuffles that are then expanded during
>>>>>> Legalization into a sub-optimal sequence of instructions, the new
>>>>>> rule would only trigger if if the mask of the resulting shufflevector
>>>>>> is legal (this is done calling method 'TLI.isShuffleMaskLegal').
>>>>>>
>>>>>> In my experiments, this patch improves how shuffles are selected on
>>>>>> x86 (see for example my new tests swizzle-2.ll and swizzle-avx2.ll).
>>>>>> However, in most cases - related to 128-bit vector swizzle oeprations
>>>>>> - the backend already knows how to efficiently combine PSHUFD dag
>>>>>> nodes thanks to the new target combine rules added at revisions
>>>>>> 211889 and 211890.
>>>>>>
>>>>>> In general, this patch has clearly the advantage of being target
>>>>>> independent, since it works on target independent shuffle dag nodes.
>>>>>> That means, potentially all targets (not only x86) can take advantage
>>>>>> of this new rule.
>>>>>> Second, most shuffle pairs can be safely combined before we run the
>>>>>> legalizer on vector operations. This allows us to combine/simplify
>>>>>> some dag nodes earlier in the process and not only immediately before
>>>>>> instruction selection. That said, this patch is _not_ meant to be a
>>>>>> replacement for the target specific combine rules already available
>>>>>> on x86; the backend might still potentially introduce new shuffles
>>>>>> during legalization stage. Another reason why on x86 we still need
>>>>>> target specific combine rules is because this new rule is very simple
>>>>>> and avoids to aggressively optimize shuffles; on the other hand,
>>>>>> rules added at 211890 tends to be more aggressive since they know how
>>>>>> to see through bitcast operations when folding shuffles.
>>>>>>
>>>>>> I only tested this patch on x86. I had a very quick look at the
>>>>>> codegen for test swizzle-2.ll when passing an ARM triple (I tested
>>>>>> only armv7-linux).
>>>>>> I saw that the code generated is overall better with this patch (i.e.
>>>>>> smaller - with less shuffles).
>>>>>>
>>>>>> Please let me know if ok to submit.
>>>>>>
>>>>>> Thanks,
>>>>>> Andrea
>>>>>> <patch-dagcombiner.diff>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>> ---------------------------------------------------------------------
>>>> Intel Israel (74) Limited
>>>>
>>>> This e-mail and any attachments may contain confidential material for
>>>> the sole use of the intended recipient(s). Any review or distribution
>>>> by others is strictly prohibited. If you are not the intended
>>>> recipient, please contact the sender and delete all copies.
>