[PATCH] [X86, AVX] instcombine common cases of vperm2* intrinsics into shuffles

Sanjay Patel spatel at rotateright.com
Fri Mar 20 10:23:12 PDT 2015

Hi andreadb, RKSimon, craig.topper, chandlerc,

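The vperm2* intrinsics are just shuffles unless one of the zero mask bits in the immediate is set. In a few special cases, they're not even shuffles: the result is simply one of the source operands or an all-zeros vector.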
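To illustrate the mapping described above, here is a minimal sketch of how the immediate could be decoded into a shufflevector-style mask. This is not the code from the attached patch; the helper name decodeVPerm2Mask and its signature are hypothetical. It assumes the usual vperm2f128/vperm2i128 control byte layout: bits [1:0] select the 128-bit half written to the low half of the result, bits [5:4] select the half written to the high half, and bits 3 and 7 zero the low and high halves respectively.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the patch): decode a vperm2* immediate into
// a two-operand shuffle mask. Indices 0..numElts-1 refer to the first source,
// numElts..2*numElts-1 to the second, as in an IR shufflevector mask.
// Returns false when a zero mask bit is set, since that case cannot be
// expressed as a single shuffle of the two sources.
bool decodeVPerm2Mask(uint8_t imm, unsigned numElts, std::vector<int> &mask) {
  if (imm & 0x88) // bit 3 zeroes the low half, bit 7 zeroes the high half
    return false;

  unsigned half = numElts / 2;
  unsigned loSel = imm & 0x3;        // source half for the low result half
  unsigned hiSel = (imm >> 4) & 0x3; // source half for the high result half

  mask.assign(numElts, 0);
  for (unsigned i = 0; i < half; ++i) {
    // Selector s picks the half that starts at element s * half of the
    // concatenated sources.
    mask[i] = static_cast<int>(loSel * half + i);
    mask[half + i] = static_cast<int>(hiSel * half + i);
  }
  return true;
}
```

With four elements per vector, an immediate of 0x10 decodes to the identity mask <0, 1, 2, 3>, which is one of the "not even a shuffle" cases: the result is just the first source.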

Optimizing intrinsics in InstCombine is better than handling this in the front-end for at least two reasons:
1. Optimizing custom-written SSE intrinsic code at -O0 makes vector coders really angry (and so I have some regrets about some patches from last week).
2. Doing mask conversion logic in header files is hard to write and subsequently read.

Unfortunately, we use a magic number (generally assumed to be -1) to specify undef values in shufflevector masks in IR, and that magic has apparently led to lax coding where we just check <0 for undef. If we had a proper enum for shufflevector mask special values, we could do what the x86 backend does and easily transform the zero mask bit cases here too. Fixing that could be a follow-on patch. Otherwise, we'll try to deal with matching a 2-shuffle sequence in the x86 backend. But again, that's a separate patch (see the TODO comment in this one).
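The problem with the <0 check can be shown with a small sketch. The enum and function names below are hypothetical and exist only for illustration; they are not the IR's or the backend's actual definitions. The point is that once a second negative sentinel (say, "zero this element") exists, any code that treats every negative mask value as undef silently misreads it.

```cpp
#include <vector>

// Hypothetical sentinel values for shuffle mask elements (illustration only).
enum ShuffleSentinel : int { MaskUndef = -1, MaskZero = -2 };

// The lax check: any negative value is assumed to mean undef.
bool isUndefLax(int m) { return m < 0; }

// The strict check: only the dedicated sentinel means undef.
bool isUndefStrict(int m) { return m == MaskUndef; }

// Count the mask elements that a given predicate classifies as undef.
unsigned countUndef(const std::vector<int> &mask, bool (*isUndef)(int)) {
  unsigned n = 0;
  for (int m : mask)
    if (isUndef(m))
      ++n;
  return n;
}
```

For a mask such as <0, MaskZero, MaskUndef, 3>, the lax predicate reports two undef elements while the strict one reports one, so a zeroed lane would be treated as "don't care".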



-------------- next part --------------
A non-text attachment was scrubbed...
Name: D8486.22354.patch
Type: text/x-patch
Size: 13661 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150320/4bd1dd22/attachment.bin>
