Shufflevector & InstCombine

Mon Aug 5 10:53:26 PDT 2013

Hi James, 

We don’t generate new shuffles because we don’t have a good cost model for shuffles.  The last time we discussed it was in this thread:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130429/173217.html

Also, InstCombine should canonicalize, not optimize.  

> Now, there are many clever things InstCombine could do with shufflevector. The two I have in my queue at the moment are:
>   1) Where there's an insertelement into vector B that comes direct from an extractelement of vector A, and vector A's length is less than vector B's, create a shuffle to extend A then another shuffle to perform the equivalent of extract/insertelement.

This won’t work for x86 because it has vector registers of different sizes (512, 256 and 128 bits).  If this is profitable, it should be done per-target in SelectionDAG, where the target information is available. 
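
For reference, the rewrite proposed in item 1 can be modeled with plain lists standing in for vectors. This is only a sketch of the mask arithmetic, not LLVM code: the helper names (widen_mask, insert_lane_mask, apply_shuffle) are made up for illustration, and -1 is used here as a stand-in for an undef mask element.

```python
UNDEF = -1  # assumption: stand-in for an undef shuffle-mask element


def widen_mask(src_len, dst_len):
    """Mask that extends a src_len-wide vector to dst_len lanes (extra lanes undef)."""
    return list(range(src_len)) + [UNDEF] * (dst_len - src_len)


def insert_lane_mask(dst_len, src_lane, dst_lane):
    """Mask for shuffle(B, widenedA): keep B, but take lane dst_lane from widenedA[src_lane]."""
    mask = list(range(dst_len))          # lanes 0..dst_len-1 select from B
    mask[dst_lane] = dst_len + src_lane  # indices >= dst_len select from the second operand
    return mask


def apply_shuffle(lhs, rhs, mask):
    """Toy model of shufflevector: index into the concatenation of both operands."""
    both = list(lhs) + list(rhs)
    return [None if idx == UNDEF else both[idx] for idx in mask]


# A has 2 lanes, B has 4 lanes; we want B with lane 2 replaced by A[1].
a = [10, 11]
b = [0, 1, 2, 3]
wide_a = apply_shuffle(a, [None] * len(a), widen_mask(len(a), len(b)))
result = apply_shuffle(b, wide_a, insert_lane_mask(len(b), src_lane=1, dst_lane=2))
```

The two shuffles together compute the same value as the extractelement/insertelement pair. Whether they are cheaper depends on the target, which is the cost-model concern above.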

>   2) Where two shuffles' masks could combine to make a monotonically increasing sequence, perform the combination.
> 

This is okay, assuming that:

1.  The shuffles have no additional users.
2.  The new shuffle is a NOP, and can be deleted. 
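
The two conditions can be sketched as a check on the composed mask. This is a toy model with hypothetical names (compose_masks, can_fold_to_nop), not InstCombine code. It assumes the outer shuffle's second operand is undef, uses -1 as a stand-in for an undef mask element, and takes the inner shuffle's use count as a plain integer.

```python
UNDEF = -1  # assumption: stand-in for an undef shuffle-mask element


def compose_masks(inner_mask, outer_mask):
    """Mask of the single shuffle equivalent to outer(inner(a, b), undef)."""
    composed = []
    for idx in outer_mask:
        if idx == UNDEF or idx >= len(inner_mask):
            composed.append(UNDEF)            # lane reads undef
        else:
            composed.append(inner_mask[idx])  # look through the inner shuffle
    return composed


def is_identity_of_first_operand(mask, operand_len):
    """True if the mask returns the first operand unchanged (undef lanes allowed)."""
    if len(mask) != operand_len:
        return False
    return all(idx == UNDEF or idx == lane for lane, idx in enumerate(mask))


def can_fold_to_nop(inner_mask, outer_mask, operand_len, inner_use_count):
    """Condition 1: the inner shuffle has no other users. Condition 2: the result is a NOP."""
    if inner_use_count != 1:
        return False
    composed = compose_masks(inner_mask, outer_mask)
    return is_identity_of_first_operand(composed, operand_len)


# Two lane reversals cancel out, so the pair can be replaced by the original vector.
print(can_fold_to_nop([3, 2, 1, 0], [3, 2, 1, 0], operand_len=4, inner_use_count=1))
```

If the inner shuffle has another user, or the composed mask is anything other than an identity, the sketch declines to fold, since that would create a new shuffle of unknown cost.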

> Both of the above have caveats that can't be said in one sentence, but they're basically rewriting common front-end patterns to make shuffles that correspond to vector extension (VEXT instructions in ARM) or concatenation of subvectors.
> 
> Now, I think these would both be of use to any architecture that has decent shufflevector support, and InstCombine seems like the right place for it. But if InstCombine is supposed to be conservative, where should these optimizations go?
> 

DAGCombine. 

Thanks,
Nadav
