[PATCH] SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.

Nadav Rotem nrotem at apple.com
Tue Oct 29 22:26:47 PDT 2013


Juergen, the patch LGTM.  Thanks!

On Oct 29, 2013, at 7:03 PM, Juergen Ributzka <juergen at apple.com> wrote:

> Hi Nadav,
> 
> I had to revert r191130 a while back due to an issue with the test suite on ARM. I modified the patch to fix the problem. The vector mask is now sign- or zero-extended (depending on the target) to the desired value type. Usually the extension is folded into the comparison that creates the vector mask, and the result is the same as with the previous patch. In the case where we are not able to do the folding, the min/max pattern wouldn't apply anyway and we just use the old code path.
> 
> Cheers,
> Juergen
> 
> <0001-SelectionDAG.patch>
> 
> On Sep 20, 2013, at 10:03 PM, Juergen Ributzka <juergen at apple.com> wrote:
> 
>> Committed in r191130 and r191131
>> 
>> On Sep 20, 2013, at 9:45 PM, Nadav Rotem <nrotem at apple.com> wrote:
>> 
>>> LGTM. 
>>> 
>>> 
>>> On Sep 20, 2013, at 6:48 PM, Juergen Ributzka <juergen at apple.com> wrote:
>>> 
>>>> Hi @ll,
>>>> 
>>>> the problem these two patches solve is related to the formation of vector min/max instructions on X86 for the following example code:
>>>> 
>>>> define <16 x i16> @split16(<16 x i16> %a, <16 x i16> %b, <16 x i8> %__mask) {
>>>> %1 = icmp ult <16 x i16> %a, %b
>>>> %2 = select <16 x i1> %1, <16 x i16> %a, <16 x i16> %b
>>>> ret <16 x i16> %2
>>>> }
>>>> 
>>>> The Type Legalizer recognizes that VSELECT needs to be split, because the type is too wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result, VSELECT is split and SETCC is unrolled into scalar comparisons (depending on the SSE feature level this gets expanded from as few as 17 instructions to over 100).
>>>> 
>>>> The first patch fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask for the given target. This mask usually has the same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC.
>>>> 
>>>> This allows the subsequent X86 DAG Combine code to successfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.
>>>> 
>>>> The second patch addresses a special X86 AVX case. It emulates AVX 256-bit MIN/MAX support by splitting the vector. AVX only supports MIN/MAX on 128-bit vectors, and this patch enables vector splitting for this special case in the X86 DAG Combiner.
>>>> 
>>>> Thanks
>>>> 
>>>> Cheers,
>>>> Juergen
>>>> 
>>>> 
>>>> <minmax1.diff><minmax2.diff>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> 
>> 
> 
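The splitting described in the original message can be modelled lane by lane. The following is only a rough sketch in plain Python, not LLVM code: the helper names (setcc_ult, vselect, split, select_whole, select_split) are made up for illustration, and the lane values are arbitrary. It shows that once both the compare and the select are split into matching halves, each half is still a compare feeding a select of the same operands, which is the unsigned-min shape the MIN/MAX combine looks for.

```python
# Lane-wise model of the quoted split16 example; plain Python, not LLVM code.
# All helper names here are hypothetical and exist only for this sketch.
LANE_MASK = 0xFFFF  # each lane is an i16, compared as unsigned (icmp ult)


def setcc_ult(a, b):
    """Model of SETCC with ult: one boolean per lane."""
    return [(x & LANE_MASK) < (y & LANE_MASK) for x, y in zip(a, b)]


def vselect(mask, a, b):
    """Model of VSELECT: take the lane from a where the mask is true, else from b."""
    return [x if m else y for m, x, y in zip(mask, a, b)]


def split(v):
    """Split a vector into its low and high halves."""
    half = len(v) // 2
    return v[:half], v[half:]


def select_whole(a, b):
    """Compare and select on the full 16-lane vectors."""
    return vselect(setcc_ult(a, b), a, b)


def select_split(a, b):
    """Split both the compare and the select, then concatenate the halves."""
    a_lo, a_hi = split(a)
    b_lo, b_hi = split(b)
    lo = vselect(setcc_ult(a_lo, b_lo), a_lo, b_lo)
    hi = vselect(setcc_ult(a_hi, b_hi), a_hi, b_hi)
    return lo + hi


a = [0, 1, 0xFFFF, 7, 100, 200, 300, 400, 5, 6, 7, 8, 0x8000, 0x7FFF, 9, 10]
b = [1, 0, 0x0001, 7, 400, 300, 200, 100, 8, 7, 6, 5, 0x7FFF, 0x8000, 10, 9]

# Splitting both operations does not change the result.
assert select_split(a, b) == select_whole(a, b)
# Each lane is the unsigned minimum of the two inputs.
assert select_whole(a, b) == [min(x, y) for x, y in zip(a, b)]
```

The model says nothing about instruction counts; it only illustrates why keeping SETCC and VSELECT at the same width preserves the pattern in each half.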

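The sign/zero extension of the vector mask mentioned in the revised patch can be illustrated the same way. This is a minimal sketch under assumptions: extend_mask and bitwise_select are hypothetical helpers, and the bitwise blend is used here only to show why the extended value of a true lane matters; it is not taken from the patch itself.

```python
# Model of widening an i1 mask to the value type; plain Python, not LLVM code.
# extend_mask and bitwise_select are hypothetical names for this sketch.


def extend_mask(mask, bits, signed):
    """Widen each boolean lane to `bits` bits.

    Sign extension maps true to all ones; zero extension maps true to 1.
    """
    true_value = (1 << bits) - 1 if signed else 1
    return [true_value if m else 0 for m in mask]


def bitwise_select(mask, a, b, bits):
    """Blend a and b with and/or logic; correct only for all-ones/zero masks."""
    all_ones = (1 << bits) - 1
    return [(m & x) | ((~m & all_ones) & y) for m, x, y in zip(mask, a, b)]


mask = [True, False, True, False]
a = [10, 20, 30, 40]
b = [1, 2, 3, 4]

sext = extend_mask(mask, 16, signed=True)
zext = extend_mask(mask, 16, signed=False)

assert sext == [0xFFFF, 0, 0xFFFF, 0]
assert zext == [1, 0, 1, 0]
# With a sign-extended mask the bitwise blend matches a lane-wise select.
assert bitwise_select(sext, a, b, 16) == [10, 2, 30, 4]
```

Which of the two extensions is appropriate depends on how the target represents vector booleans, as the message notes.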


