[llvm-commits] [PATCH] Fix PR11334

Michael Liao michael.liao at intel.com
Thu Jul 26 10:27:21 PDT 2012


On Thu, 2012-07-26 at 08:42 +0200, Duncan Sands wrote:
> Hi Michael, CC'ing Evan and Chris to see if they have any suggestions for
> how best to deal with this issue.
> 
[snipped]
> However this scheme doesn't work so well for floating point vectors, since it's
> not that clear to me what
>    v2f64 = vector_shuffle v4f32, v4f32<0,1>
> would mean exactly.

That looks OK. From the perspective of making the backend generate
proper and efficient code, I have no preference as to which one is
relaxed. But the rationale for relaxing FP_EXTEND is that we could
leverage the existing optimizations in DAG combining and the other
target-independent code generation passes without changing the
semantics of an ISD opcode or adding new ones. (That is also one of the
reasons why PR11334 is fixed by recovering FP_EXTEND instead of
generating a new node at a very early stage.)

Relaxing FP_EXTEND changes that assumption a little, but my personal
feeling is that it should be OK. Semantically, the node still means
what it means today. When the input and output vectors have mismatching
element counts:
* if the input has more elements than the output, only the low part of
the input is extended.
* if the output has more elements than the input (most unlikely for
FP_EXTEND), the high part of the output vector is undefined.
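
To make the two cases above concrete, here is a minimal sketch in plain
C++ (C++17). It is not LLVM code: relaxedFpExtend is a hypothetical
helper made up for illustration, and std::nullopt stands in for an
undefined output lane.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model of the relaxed FP_EXTEND semantics described above.
// This is an illustration only; it does not use or mirror any LLVM API.
// An empty optional models an undefined lane in the output vector.
std::vector<std::optional<double>>
relaxedFpExtend(const std::vector<float> &in, std::size_t outElts) {
  // Every output lane starts out undefined.
  std::vector<std::optional<double>> out(outElts);

  // Extend only the lanes that exist on both sides, starting from lane 0.
  // Extra input lanes are ignored; extra output lanes stay undefined.
  for (std::size_t i = 0; i < outElts && i < in.size(); ++i)
    out[i] = static_cast<double>(in[i]);

  return out;
}
```

With a four-element input and outElts == 2 (the v4f32 to v2f64 case),
only lanes 0 and 1 of the input are extended. With a one-element input
and outElts == 2, lane 1 of the result has no value.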

Unless an optimization really cares about the number of vector elements
(assertions are exceptions), most of them just optimize based on the
ISD opcode.

Yours
- Michael

> 
> So I'm not too sure what the best plan is in general.
> 
> Ciao, Duncan.
> 
> [*] This is the main reason why running the output of the GCC vectorizer through
> llc sometimes produces poor code.  The GCC vectorizer only produces operations
> that it knows can be represented well by the target processor, so we should
> never end up scalarizing but we do.
> 
> >
> > However, I need more comments and suggestions from community before
> > pushing direction that way.
> >
> > As a short-term solution, this patch only adds a target-specific DAG
> > optimization.
> >
> > Yours
> > - Michael
> >
> > On Wed, 2012-07-25 at 12:58 -0700, Rotem, Nadav wrote:
> >> Hi Michael,
> >>
> >> In your patch you are counting on the type-legalizer to scalarize the FPEXT operation, only to gather it again. I think that the pre-type-legalization DAGCombine code would be short and simple. Why not implement a DAGCombine optimization which works on vector FPEXT ISDs? I understand that it will be more difficult to handle types such as <3 x float>, but are these really important?
> >>
> >> Thanks,
> >> Nadav
> >>
> >> -----Original Message-----
> >> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Michael Liao
> >> Sent: Wednesday, July 25, 2012 00:28
> >> To: llvm-commits at cs.uiuc.edu
> >> Subject: [llvm-commits] [PATCH] Fix PR11334
> >>
> >> Hi
> >>
> >> Please review the attached patch fixing PR11334. With this patch, the test case in PR11334 generates the expected instruction, CVTPS2PD, instead of a series of CVTSS2SD instructions. An enhanced test case is included as well.
> >>
> >> Yours
> >> - Michael
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >
> 




