[PATCH] [x86] eliminate unnecessary shuffling/moves with unary scalar math ops (PR21507)

Wed May 6 09:17:44 PDT 2015

In http://reviews.llvm.org/D9504#166800, @spatel wrote:

> That said, if this is a common enough occurrence, then what I'd hope to do is just add more defm lines instead of duplicating the multiclass of patterns to match SDNodes rather than Intrinsics, eg:
>
>   defm : scalar_unary_math_patterns<fsqrt, "SQRTSD", X86Movsd, v2f64, UseSSE2>;
>
> ...but I'm not sure how to do that in tablegen. Any suggestions?

I think I have a workaround for that: make the param a string and then cast it. After a little more thought, though, that's not enough. If we do want to optimize the scalar op case, we need a whole different set of patterns to match (as we do for the binops earlier in this file). I.e., we're looking for something like this:

      0x7fcbf987f8c0: f64 = extract_vector_elt 0x7fcbf987f530, 0x7fcbf987f790 [ORD=2]
    0x7fcbf987f9f0: f64 = fsqrt 0x7fcbf987f8c0 [ORD=3]
  0x7fcbf987fb20: v2f64 = insert_vector_elt 0x7fcbf987f530, 0x7fcbf987f9f0, 0x7fcbf987f790 [ORD=4]
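
To spell out what those three nodes compute, here is a rough model in plain Python. It is not LLVM code, just a sketch of the semantics; `sqrt_lane`, `vec`, and `idx` are made-up names for illustration, and a Python list stands in for the v2f64 value.

```python
import math


def sqrt_lane(vec, idx=0):
    """Model of extract_vector_elt -> fsqrt -> insert_vector_elt.

    The extract and the insert use the same source vector and the same
    index, as in the DAG dump above.
    """
    elt = vec[idx]          # f64 = extract_vector_elt vec, idx
    root = math.sqrt(elt)   # f64 = fsqrt elt
    out = list(vec)         # copy: the source vector itself is not modified
    out[idx] = root         # v2f64 = insert_vector_elt vec, root, idx
    return out


if __name__ == "__main__":
    print(sqrt_lane([4.0, 9.0]))
```

The point of the model is that only one lane changes and every other lane passes through from the same source vector, which is presumably the property a dedicated pattern would need to check before selecting a single scalar sqrt instruction.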

http://reviews.llvm.org/D9504
