[LLVMdev] AVX Shuffles & PatLeaf Help Needed

Thu Dec 17 15:16:39 PST 2009

On Dec 17, 2009, at 3:10 PM, David Greene wrote:

> I'm working on debugging AVX shuffles and I ran into an interesting
> problem.
> 
> The current isSHUFPMask predicate in X86ISelLowering needs to be
> generalized to operate on 128-bit or 256-bit masks.  There are
> probably lots of other things to change too (LowerVECTOR_SHUFFLE_4wide,
> etc.) but I'll worry about that later.
> 
> The generalized rule is:
> 
> 1. For the low 64 bits of the result vector, the source can be from
>   the low 128 bits of vector 1.
> 
> 2. For the next 64 bits, the source can be from the low 128 bits of
>   vector 2.
> 
> 3. For the 3rd 64 bits, the source is the high 128 bits of vector 1.
> 
> 4. For the high 64 bits, the source is the high 128 bits of vector 2.
> 
> For 128-bit vectors, steps 3 and 4 are ignored since there are no high
> 128 bits.
> 
> Determining the answer boils down to knowing how big a vector element
> is.  Then we can map operand values to ranges within 64-bit and 128-bit
> chunks and determine the proper index ranges to look for.  For example,
> for 64-bit elements, result element zero must come from index 0 or 1.
> For 32-bit elements, result element zero must come from index 0-3.
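
The rule quoted above can be sketched as a standalone predicate. This is only an illustration, not the isSHUFPMask code in X86ISelLowering: the name isGeneralizedSHUFPMask and its signature are hypothetical. It assumes shufflevector-style mask indices (0..N-1 select from vector 1, N..2N-1 from vector 2, negative means undef). It checks only the source ranges listed in steps 1-4, not whether the mask can be encoded in a single immediate.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper illustrating the generalized rule; not LLVM's API.
// Mask holds one entry per result element. Values 0..N-1 pick an element
// of vector 1, N..2N-1 pick an element of vector 2, negative means undef.
// ElemBits is the element width in bits (32 or 64).
bool isGeneralizedSHUFPMask(const std::vector<int> &Mask, unsigned ElemBits) {
  const unsigned NumElems = static_cast<unsigned>(Mask.size());
  const unsigned VecBits = NumElems * ElemBits;
  if ((ElemBits != 32 && ElemBits != 64) ||
      (VecBits != 128 && VecBits != 256))
    return false;

  const unsigned PerLane = 128 / ElemBits; // elements per 128-bit chunk
  const unsigned PerHalf = 64 / ElemBits;  // elements per 64-bit chunk

  for (unsigned i = 0; i != NumElems; ++i) {
    const int M = Mask[i];
    if (M < 0)
      continue; // undef element matches anything

    // Which 128-bit chunk of the result this element lives in.
    const unsigned Lane = i / PerLane;
    // The low 64 bits of each chunk read vector 1, the high 64 bits vector 2.
    const bool FromV2 = (i % PerLane) >= PerHalf;
    // First legal index: the same 128-bit chunk of the selected source.
    const unsigned Lo = (FromV2 ? NumElems : 0) + Lane * PerLane;

    const unsigned Idx = static_cast<unsigned>(M);
    if (Idx < Lo || Idx >= Lo + PerLane)
      return false;
  }
  return true;
}
```

With this convention, a 4 x 32-bit mask of {0, 1, 4, 5} passes, while {4, 5, 0, 1} fails because result element zero would have to come from vector 2.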

David, this is probably the wrong approach, given the accreted awfulness of the X86 shuffle lowering code, which Eli and I have hacked on to improve somewhat.  The correct approach is probably a rewrite based around what AltiVec does: canonicalize to byte ops and write each pattern once, rather than having to look for 6 different variants of the same pattern.

Nate