[llvm-commits] [PATCH] Fix PR11334

Wed Jul 25 23:42:51 PDT 2012

Hi Michael, CC'ing Evan and Chris to see if they have any suggestions for
how best to deal with this issue.

> Just an off-topic issue. I had an experimental patch to relax the constraint
> on FP_EXTEND. Plus other fixes, it works without breaking any tests for all
> targets.

This kind of problem pops up all over the place.  In many x86 vector
operations it is the number of bits that is constant, not the number of vector
elements, and thus you have instructions that do fpextend from <4 x float> to
<2 x double> (same number of bits) and so on.  However, the generic LLVM SDag
nodes assume that the number of vector elements is constant, e.g. you have
fpextend from <4 x float> to <4 x double> or <2 x float> to <2 x double>
but not <4 x float> to <2 x double>.
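
To make the mismatch concrete, here is a minimal scalar sketch in C++ of the
"same number of bits" shape, modelled on CVTPS2PD (the instruction named in the
patch description quoted below).  The type aliases and function names are
hypothetical, chosen only for illustration; this is not LLVM code.

```cpp
#include <array>
#include <cassert>

// Both aliases model one 128-bit register (names are hypothetical).
using V4F32 = std::array<float, 4>;
using V2F64 = std::array<double, 2>;

static_assert(4 * sizeof(float) == 2 * sizeof(double),
              "both vector types are assumed to occupy the same number of bits");

// Scalar model of a CVTPS2PD-style conversion: only the low two float lanes
// are widened, so the element count halves while the bit width is unchanged.
V2F64 cvt_low_v4f32_to_v2f64(const V4F32& v) {
    return V2F64{{static_cast<double>(v[0]), static_cast<double>(v[1])}};
}

// Usage example: lanes 2 and 3 of the input do not appear in the result.
void widen_low_lanes_example() {
    const V4F32 in = {{1.5f, -2.0f, 3.0f, 4.0f}};
    const V2F64 out = cvt_low_v4f32_to_v2f64(in);
    assert(out[0] == 1.5);
    assert(out[1] == -2.0);
}
```

A generic fpextend node, by contrast, would take two lanes to two lanes or four
lanes to four lanes, which is why this operation has no direct node.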

Another example is that x86 has an instruction that turns <4 x i32> into
<2 x i64> by extracting the lower two i32, doing a (un)signed extension to
i64, and sticking them in the result.  This can be represented in the LLVM
IR as:
   %lo  = shufflevector <4 x i32> %x, <4 x i32> undef, <2 x i32> <i32 0, i32 1>
   %res = sext <2 x i32> %lo to <2 x i64>
The code generators then fail miserably to produce decent code.  What
happens is that the type legalizer sees that <2 x i32> is illegal and promotes
it to <2 x i64>.  This is good.  However, it then has to represent the shuffle,
which now has a <2 x i64> result, and there is no good way to do that, so some
horrible code is generated instead [*].

For handling integer cases it would be enough to generalize the SDag shuffle
vector node.  For example, suppose this was allowed
   v2i64 = vector_shuffle v4i32, v4i32<0,1>
(currently the result type has to be v2i32) with the meaning that the extra
32 bits of each v2i64 element are undefined.  Then the type legalizer could
form
   v2i64 = vector_shuffle v4i32, v4i32<0,1>
   v2i64 = sign-extend-in-reg v2i64 from v2i32
and the X86 backend could easily pattern match this into the X86 instruction.
What I'm saying is that rather than relaxing conditions on all of the cast
operations (sign extend, zero extend etc) to allow different numbers of result
and operand elements, it suffices to generalize vector shuffle.
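
As a sanity check on this idea, here is a small scalar sketch in C++.  It
models the proposed widening shuffle as "low 32 bits defined, high 32 bits
arbitrary" and shows that a sign-extend-in-reg afterwards gives the same
result as shuffle-then-sext, whatever the high bits were.  All names are
hypothetical and the `junk` parameter merely stands in for the undefined bits.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4I32 = std::array<std::int32_t, 4>;
using V2U64 = std::array<std::uint64_t, 2>;  // raw 64-bit lanes
using V2I64 = std::array<std::int64_t, 2>;

// Reference semantics: take lanes 0 and 1, then sign extend each to 64 bits.
V2I64 shuffle_then_sext(const V4I32& v) {
    return V2I64{{static_cast<std::int64_t>(v[0]),
                  static_cast<std::int64_t>(v[1])}};
}

// Model of the proposed "v2i64 = vector_shuffle v4i32, v4i32<0,1>": the low
// 32 bits of each lane hold the selected element, the high 32 bits are
// undefined.  `junk` is an assumption used to simulate those undefined bits.
V2U64 widening_shuffle(const V4I32& v, std::uint32_t junk) {
    V2U64 r{};
    for (std::size_t i = 0; i < 2; ++i) {
        r[i] = (static_cast<std::uint64_t>(junk) << 32) |
               static_cast<std::uint32_t>(v[i]);
    }
    return r;
}

// Model of sign-extend-in-reg from i32: ignore the high half and replicate
// bit 31 of the low half into it.
V2I64 sign_extend_inreg_from_i32(const V2U64& v) {
    V2I64 r{};
    for (std::size_t i = 0; i < 2; ++i) {
        const std::uint64_t low = v[i] & 0xFFFFFFFFu;
        r[i] = static_cast<std::int64_t>(low ^ 0x80000000u) -
               static_cast<std::int64_t>(0x80000000u);
    }
    return r;
}

// Usage example: the result does not depend on the undefined high bits.
void legalized_form_example() {
    const V4I32 in = {{-1, 5, 100, 200}};
    const V2I64 expected = shuffle_then_sext(in);
    const std::uint32_t junk_values[] = {0u, 0xFFFFFFFFu, 0xDEADBEEFu};
    for (std::uint32_t junk : junk_values) {
        assert(sign_extend_inreg_from_i32(widening_shuffle(in, junk)) == expected);
    }
}
```

The same argument would go through with a zero-extend-in-reg (a mask of the
high bits) in place of the sign extension.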

However this scheme doesn't work so well for floating point vectors, since it's
not that clear to me what
   v2f64 = vector_shuffle v4f32, v4f32<0,1>
would mean exactly.
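
One possible reading, sketched below in C++ purely as an assumption, is the
same bitwise one as in the integer case: the 32 bits of the float sit in the
low half of a 64-bit lane and the rest is undefined.  Under that reading the
lane is not a meaningful double, and getting the value back needs a real
conversion rather than a cheap in-register fix-up.  The function names are
hypothetical and the sketch assumes IEEE 754 float and double.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 &&
                  std::numeric_limits<double>::is_iec559,
              "sketch assumes IEEE 754 floating point");
static_assert(sizeof(float) == 4 && sizeof(double) == 8,
              "sketch assumes 32-bit float and 64-bit double");

// Assumed bitwise reading of the widening shuffle: float bits in the low
// half of the lane, `junk` standing in for the undefined high half.
std::uint64_t lane_bits(float f, std::uint32_t junk) {
    std::uint32_t lo = 0;
    std::memcpy(&lo, &f, sizeof lo);
    return (static_cast<std::uint64_t>(junk) << 32) | lo;
}

// Reinterpret the raw lane as a double, as a v2f64 consumer would.
double reinterpret_as_f64(std::uint64_t bits) {
    double d = 0.0;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Recovering the value: read the low half as a float, then convert.
double recover_from_low_half(std::uint64_t bits) {
    const std::uint32_t lo = static_cast<std::uint32_t>(bits);
    float f = 0.0f;
    std::memcpy(&f, &lo, sizeof f);
    return static_cast<double>(f);
}

// Usage example: the raw lane is not 1.0, the converted low half is.
void float_lane_example() {
    const std::uint64_t lane = lane_bits(1.0f, 0u);
    assert(reinterpret_as_f64(lane) != 1.0);
    assert(recover_from_low_half(lane) == 1.0);
}
```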

So I'm not too sure what the best plan is in general.

Ciao, Duncan.

[*] This is the main reason why running the output of the GCC vectorizer through
llc sometimes produces poor code.  The GCC vectorizer only produces operations
that it knows can be represented well by the target processor, so we should
never end up scalarizing, but we do.

>
> However, I need more comments and suggestions from the community before
> pushing in that direction.
>
> As a short-term solution, this patch only adds a target-specific DAG
> optimization.
>
> Yours
> - Michael
>
> On Wed, 2012-07-25 at 12:58 -0700, Rotem, Nadav wrote:
>> Hi Michael,
>>
>> In your patch you are counting on the type-legalizer to scalarize the FPEXT operation, only to gather it again. I think that the pre-type-legalization DAGCombine code would be short and simple.  Why not implement a DAGCombine optimization which works on vector FPEXT ISDs?  I understand that it will be more difficult to handle types such as <3 x float>, but are these really important?
>>
>> Thanks,
>> Nadav
>>
>> -----Original Message-----
>> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Michael Liao
>> Sent: Wednesday, July 25, 2012 00:28
>> To: llvm-commits at cs.uiuc.edu
>> Subject: [llvm-commits] [PATCH] Fix PR11334
>>
>> Hi
>>
>> Please review the attached patch fixing PR11334. With this patch, the test case in PR11334 generates the expected instruction, CVTPS2PD, instead of a series of CVTSS2SD. An enhanced test case is included as well.
>>
>> Yours
>> - Michael