[PATCH] [x86] eliminate unnecessary shuffling/moves with unary scalar math ops (PR21507)
Sanjay Patel
spatel at rotateright.com
Wed May 6 09:17:44 PDT 2015
In http://reviews.llvm.org/D9504#166800, @spatel wrote:
> That said, if this is a common enough occurrence, then what I'd hope to do is just add more defm lines instead of duplicating the multiclass of patterns to match SDNodes rather than Intrinsics, eg:
>
> defm : scalar_unary_math_patterns<fsqrt, "SQRTSD", X86Movsd, v2f64, UseSSE2>;
>
> ...but I'm not sure how to do that in tablegen. Any suggestions?
I think I have a workaround: make the param a string and then cast it. After a little more thought, though, that's not enough. If we do want to optimize the scalar op case, we need a whole different set of patterns to match (as we do for the binops earlier in this file). I.e., we're looking for something like this:
0x7fcbf987f8c0: f64 = extract_vector_elt 0x7fcbf987f530, 0x7fcbf987f790 [ORD=2]
0x7fcbf987f9f0: f64 = fsqrt 0x7fcbf987f8c0 [ORD=3]
0x7fcbf987fb20: v2f64 = insert_vector_elt 0x7fcbf987f530, 0x7fcbf987f9f0, 0x7fcbf987f790 [ORD=4]
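
For illustration only, here is a rough sketch of what one such pattern might look like, modeled on the shape of the existing scalar binop patterns. It is not code from D9504. The multiclass name is hypothetical, and it assumes that the insert into element 0 reaches instruction selection as an X86Movsd of a scalar_to_vector (as it does for the binops) and that the "r_Int" instruction form takes the destination vector and a source vector as operands.

```tablegen
// Hypothetical sketch, not part of the patch under review.
// Assumes it lives in X86InstrSSE.td, where VR128, UseSSE2, X86Movsd and
// the SQRTSD instruction definitions are already in scope.
multiclass scalar_unary_math_f64_patterns_sketch<SDNode Op, string OpcPrefix> {
  let Predicates = [UseSSE2] in {
    // Matches insert(x, Op(extract(x, 0)), 0), under the assumption that the
    // insert into lane 0 has been lowered to X86Movsd + scalar_to_vector.
    def : Pat<(v2f64 (X86Movsd (v2f64 VR128:$dst), (v2f64 (scalar_to_vector
                (Op (f64 (vector_extract (v2f64 VR128:$dst), (iPTR 0)))))))),
              // Operand list of the r_Int form is an assumption.
              (!cast<Instruction>(OpcPrefix#r_Int) v2f64:$dst, v2f64:$dst)>;
  }
}

// The string parameter is cast to an instruction record inside the multiclass.
defm : scalar_unary_math_f64_patterns_sketch<fsqrt, "SQRTSD">;
```

Note that this takes an SDNode rather than an Intrinsic, which is why it cannot simply reuse the intrinsic-matching multiclass with one more defm line.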
http://reviews.llvm.org/D9504