[LLVMdev] RFC: AVX Pattern Specification [LONG]

Chris Lattner clattner at apple.com
Fri May 1 15:50:05 PDT 2009


On May 1, 2009, at 2:47 PM, David Greene wrote:
> On Friday 01 May 2009 13:46, Chris Lattner wrote:
>> Right, a lot of these problems can be solved by some nice refactoring
>> stuff.  I'm also hoping that some of the complexity in defining
>> shuffle matching code can be helped by making the definition of the
>> shuffle patterns more declarative within the td file.  It would be
>> really nice to say that "this shuffle does a 1,0,3,2 shuffle and has
>> cost 42" and have tblgen generate all the required matching code.
>
> That would be nice.  Any ideas how this would work?

Nate is currently working on refactoring a bunch of shuffle related  
logic, which includes changing the X86 backend to canonicalize  
shuffles more like the ppc/altivec backend does.  Once that is done, I  
think it would make sense for tblgen to generate some C++ code that  
looks like this:

// MatchVectorShuffle - Matches a shuffle node against the available
// instructions, returning the lowest cost one as well as the actual cost of it.
unsigned MatchVectorShuffle(ShuffleVectorSDNode *N) {
   unsigned LowestCost = ~0U;

   if (N can be matched by movddup) {
     // can be either constant, or callback into subtarget info
     unsigned movddupcost = ...
     if (LowestCost > movddupcost) {
       LowestCost = movddupcost;
       operands = [whatever]
       opcode = X86::MOVDDUP;
     }
   }

   if (N can be matched by movhlps) {
     unsigned movhlpscost = ...
     if (LowestCost > movhlpscost) {
       LowestCost = movhlpscost;
       operands = [whatever]
       opcode = X86::MOVHLPS;
     }
   }
   ...
   return LowestCost;
}
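
As a rough standalone sketch of that selection loop (this is not tblgen output, and Candidate, Match and matchVectorShuffle are made-up names; a plain std::vector<int> mask stands in for the shuffle node), the generated code could boil down to walking a table of predicate/cost pairs and keeping the cheapest hit:

```cpp
#include <cassert>
#include <climits>
#include <functional>
#include <string>
#include <vector>

// Assumption: a shuffle is reduced to its mask; -1 would mean an undef lane.
using Mask = std::vector<int>;

// Hypothetical table entry, one per instruction that can implement a shuffle.
struct Candidate {
  std::string Opcode;                          // e.g. "MOVDDUP"
  std::function<bool(const Mask &)> Matches;   // "N can be matched by ..."
  unsigned Cost;                               // fixed cost for this sketch
};

// Result of a match; Cost stays UINT_MAX when nothing matched.
struct Match {
  unsigned Cost = UINT_MAX;
  std::string Opcode;
};

// Try every candidate and keep the one with the lowest cost.
inline Match matchVectorShuffle(const Mask &M,
                                const std::vector<Candidate> &Table) {
  Match Best;
  for (const Candidate &C : Table) {
    if (C.Matches(M) && C.Cost < Best.Cost) {
      Best.Cost = C.Cost;
      Best.Opcode = C.Opcode;
    }
  }
  return Best;
}
```

A caller would check for Cost == UINT_MAX to detect that no instruction matched and fall back to generic lowering.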

The advantage of doing this is that it moves the current heuristics  
for match ordering (which is a poor way to model costs) into a  
declarative form in the .td file.  This is particularly important  
because different chips have different costs!
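
One way to picture the "constant or callback" cost is the sketch below. Everything in it is an assumption for illustration: the Chip enum and ShuffleCost struct are invented, and a real backend would query its subtarget object rather than an enum.

```cpp
#include <cassert>
#include <functional>

// Hypothetical chip identifiers, standing in for subtarget info.
enum class Chip { Generic, ChipA, ChipB };

// A cost that is either a fixed constant or a per-chip callback.
struct ShuffleCost {
  unsigned Constant;                       // used when Callback is empty
  std::function<unsigned(Chip)> Callback;  // optional per-chip override

  // Prefer the callback when one was supplied, else use the constant.
  unsigned get(Chip C) const { return Callback ? Callback(C) : Constant; }
};
```

With this shape the .td file could carry the constant for the common case, and only the instructions whose cost really varies between chips would need a callback.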

This generated function could then be called by the actual isel pass  
itself as well as from DAGCombine.  We'd like dagcombine to be able to  
merge two shuffles into one, but it should only do this when the cost  
of the resultant shuffle is less than the two original ones (a simple  
greedy algorithm).
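
A minimal sketch of that greedy check, restricted to single-input shuffles (the RHS is assumed undef) and using invented names (composeMasks, shouldMerge, CostFn), might look like this. The cost function is passed in and is assumed to return UINT_MAX for a mask that no instruction matches.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <functional>
#include <vector>

using Mask = std::vector<int>;  // -1 marks an undef lane
using CostFn = std::function<unsigned(const Mask &)>;

// Compose two single-input shuffles: Outer is applied to the result of
// Inner, so lane i of the merged shuffle reads Inner[Outer[i]].
inline Mask composeMasks(const Mask &Inner, const Mask &Outer) {
  assert(Inner.size() == Outer.size());
  Mask Merged(Outer.size(), -1);
  for (std::size_t i = 0; i < Outer.size(); ++i) {
    if (Outer[i] < 0)
      continue;  // undef lane stays undef
    assert(static_cast<std::size_t>(Outer[i]) < Inner.size());
    Merged[i] = Inner[Outer[i]];
  }
  return Merged;
}

// Greedy rule: merge only if the merged shuffle is matchable and strictly
// cheaper than the two original shuffles together.
inline bool shouldMerge(const Mask &Inner, const Mask &Outer,
                        const CostFn &Cost, Mask &Merged) {
  Merged = composeMasks(Inner, Outer);
  unsigned MergedCost = Cost(Merged);
  if (MergedCost == UINT_MAX)
    return false;
  // Sum in a wider type so an unmatched original cannot overflow.
  unsigned long long Separate =
      static_cast<unsigned long long>(Cost(Inner)) + Cost(Outer);
  return MergedCost < Separate;
}
```

For example, applying a 1,0,3,2 shuffle twice composes to the identity mask 0,1,2,3, which any sensible cost function would rate cheaper than two real shuffles.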

This is vague and hand wavy, but hopefully the idea comes across.  We  
have this in the .td files right now:

;; we already have this
def MOVDDUPrr  : S3DI<0x12, MRMSrcReg, (outs VR128:$dst), (ins VR128:$src),
                       "movddup\t{$src, $dst|$dst, $src}",
                       [(set VR128:$dst,
                             (v2f64 (movddup VR128:$src, (undef))))]>;

def movddup : PatFrag<(ops node:$lhs, node:$rhs),
                       (vector_shuffle node:$lhs, node:$rhs), [{
   return X86::isMOVDDUPMask(cast<ShuffleVectorSDNode>(N));
}]>;


The goal is to replace the pattern fragment and the C++ code for  
X86::isMOVDDUPMask with something like:

def movddup : PatFrag<(ops node:$lhs, node:$rhs),
                       (vector_shuffle node:$lhs, node:$rhs,
                                       0, 1, 0, 1, Cost<42>)>;

Alternatively, the cost could be put on the instructions etc., whatever
makes the most sense.  Incidentally, I'm not sure why movddup is
currently defined to take a LHS/RHS: the RHS should always be undef so
it should be coded into the movddup def.

Another possible syntax would be to add a special kind of shuffle node  
to give more natural and clean syntax.  This is probably the better  
solution:

def movddup : Shuffle4<VR128, undef, 0, 1, 0, 1>, Cost<42>;
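
To give a feel for what tblgen might emit for such a definition, here is a hedged C++ sketch of a mask predicate parameterized by the four indices. The name matchesShuffle4UndefRHS is invented, the mask is a plain std::array rather than a real shuffle node, and rejecting lanes that read from the RHS (indices 4 and up) is a conservative assumption on my part.

```cpp
#include <array>
#include <cassert>

// Checks a 4-lane shuffle mask against the declared indices E0..E3.
// Assumptions: -1 marks an undef lane (matches anything), and indices
// >= 4 would read from the RHS, which this pattern requires to be undef.
template <int E0, int E1, int E2, int E3>
bool matchesShuffle4UndefRHS(const std::array<int, 4> &Mask) {
  const int Want[4] = {E0, E1, E2, E3};
  for (int i = 0; i < 4; ++i) {
    int M = Mask[i];
    if (M < 0)
      continue;       // undef lane: any value is acceptable
    if (M >= 4)
      return false;   // lane reads from the RHS: reject conservatively
    if (M != Want[i])
      return false;   // lane reads the wrong LHS element
  }
  return true;
}
```

The movddup definition above would then correspond to matchesShuffle4UndefRHS<0, 1, 0, 1>, replacing a hand-written predicate such as X86::isMOVDDUPMask.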

>> While I agree that we want to refactor this, I really don't think that
>> we should autogenerate .td files from perl.  This has a number of
>> significant logistical problems.  What is it that perl gives you that
>> we can't enhance tblgen to do directly?
>
> Well, mainly it's because we don't have whatever tblgen enhancements we
> need.  I'll have to think on this some and see if I can come up with some
> tblgen features that could help.
>
> I was writing a lot of these base classes by hand at first, but there are
> a lot of them (they tend to be very small) and writing them is very
> mechanical.  So we probably can enhance tblgen somehow.  I'm just not sure
> what that looks like right now.

Ok.

> Your point is well taken.  Let me think on this a bit.

Thanks Dave! I really appreciate you working in this area,

-Chris



