[LLVMdev] Long-Term ISel Design

Mon Apr 11 13:16:33 PDT 2011

Chris Lattner <clattner at apple.com> writes:

> To me at least, the right solution to this problem is to make one
> X86ISD node for each legal shuffle.  This has several desirable
> properties:
>
> 1. Legalize is now responsible for eliminating ALL vector_shuffle
>    nodes, and it is really obvious what shuffles it is generating when
>    reading the dag dumps.
> 2. Other code that generates shuffles can just generate their exact
>    node.
> 3. The duplicated isel code is gone, because there is a simple mapping
>    between X86ISD nodes and machine instrs.
> 4. We push fewer shuffle masks through the compiler, which is a
>    marginal speedup.

It also has some not-so-desirable properties:

1. We need special isel DAG nodes for each shuffle.  That's an extra
   maintenance cost for something that should be automatable.  More
   shuffle instructions are coming, I'm sure.  I'm worried this won't
   scale.

2. It makes shuffle nodes special relative to other kinds of nodes.  Why
   not have special DAG operators for _every_ kind of machine
   instruction?  I know some have them and that's always been a source
   of confusion for me.

> 2. We *really really* want a way to express shuffle masks in .td
> files, and tblgen should generate the Legalize code.  The "def
> X86Punpcklbw" should include a (per cpu) cost for the shuffle as well
> as the mask that it matches, or a predicate to run if the mask
> isn't constant.

I agree this is highly desirable.  What's the difference between
generated legalize code and a generated instruction selection matcher?

Can we somehow work this so we don't need special target machine DAG
nodes?  If we're generating legalize code to lower to special target
machine DAG nodes and then matching really simple patterns using those
nodes, it seems to me just as easy to generate code to lower to machine
instructions directly.

Using target-specific DAG nodes as an intermediate step is fine, but I
don't think we want to stop there.
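
To make the "lower directly" idea concrete, here is a rough sketch of
the kind of table I imagine tblgen emitting: each entry carries a mask,
a cost, and the instruction to pick.  Everything in it is an assumption
for illustration.  TargetOp, ShuffleEntry, shuffleTable, maskMatches and
selectShuffle are made-up names, not anything in LLVM, and the costs are
placeholders rather than real per-CPU numbers.  It only handles
constant four-lane masks.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical names for illustration; these are not LLVM's real types.
enum class TargetOp { UNPCKLPS, UNPCKHPS, MOVLHPS };

struct ShuffleEntry {
  TargetOp Op;            // instruction to emit on a match
  std::vector<int> Mask;  // result lane i reads input lane Mask[i]
  int Cost;               // placeholder cost; lower is better
};

// Four-lane shuffles: lanes 0-3 are the first operand, 4-7 the second.
inline const std::vector<ShuffleEntry> &shuffleTable() {
  static const std::vector<ShuffleEntry> Table = {
      {TargetOp::UNPCKLPS, {0, 4, 1, 5}, 2},
      {TargetOp::UNPCKHPS, {2, 6, 3, 7}, 2},
      {TargetOp::MOVLHPS, {0, 1, 4, 5}, 1},
  };
  return Table;
}

// A mask lane of -1 means "undef" and matches any pattern lane.
inline bool maskMatches(const std::vector<int> &Mask,
                        const std::vector<int> &Pattern) {
  if (Mask.size() != Pattern.size())
    return false;
  for (std::size_t I = 0; I < Mask.size(); ++I)
    if (Mask[I] != -1 && Mask[I] != Pattern[I])
      return false;
  return true;
}

// Return the cheapest matching instruction, or nothing if no entry matches.
inline std::optional<TargetOp> selectShuffle(const std::vector<int> &Mask) {
  const ShuffleEntry *Best = nullptr;
  for (const ShuffleEntry &E : shuffleTable())
    if (maskMatches(Mask, E.Mask) && (!Best || E.Cost < Best->Cost))
      Best = &E;
  if (!Best)
    return std::nullopt;
  return Best->Op;
}
```

Nothing in the lookup cares whether the result is a target DAG node or a
machine opcode, which is why the intermediate node looks optional to me.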

>>> I think that this is just because the current code is in a half
>>> converted state.  Bruno can say more, but the ultimate goal is to make
>>> ISD::SHUFFLE completely illegal on X86, so you'd never have this sort
>>> of thing.
>> 
>> Erm...how would that work exactly?  Manual matching and lowering for all
>> shuffles?  That sounds like a huge step backward.
>
> I'm not sure how it is a step backward.  Legalize already does
> matching to "know and ignore" a shuffle that will match a legal
> instruction.  It isn't hard to change that code to "know and
> transform" it to the instruction.  The net result of this is that the
> duplicate code in isel goes away.

If we can support masks in TableGen and auto-generate this it is
goodness.
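
For what it's worth, here is how I read the "know and ignore" versus
"know and transform" distinction, as a small sketch.  The names
(NodeKind, isUnpackLoMask, isShuffleLegal, legalizeShuffle) are
hypothetical and not taken from the X86 backend.  The predicate assumes
a single-lane unpack-low pattern <0, N, 1, N+1, ...> over N elements.
Both functions share that one predicate; only what they do with the
answer differs.

```cpp
#include <vector>

// Hypothetical node kinds for illustration only.
enum class NodeKind { GenericShuffle, TargetUnpackLo };

// True if Mask interleaves the low halves of the two inputs:
// <0, N, 1, N+1, ...> for an N-element vector.  -1 lanes are undef.
inline bool isUnpackLoMask(const std::vector<int> &Mask) {
  const int N = static_cast<int>(Mask.size());
  if (N == 0 || N % 2 != 0)
    return false;
  for (int I = 0; I < N; ++I) {
    const int Expected = I / 2 + (I % 2 ? N : 0);
    if (Mask[I] != -1 && Mask[I] != Expected)
      return false;
  }
  return true;
}

// "Know and ignore": report that the shuffle is legal and leave the
// generic node alone, so isel has to match the mask a second time.
inline bool isShuffleLegal(const std::vector<int> &Mask) {
  return isUnpackLoMask(Mask);
}

// "Know and transform": the same check, but the result is the exact
// target node, so isel only needs a one-to-one mapping afterwards.
inline NodeKind legalizeShuffle(const std::vector<int> &Mask) {
  return isUnpackLoMask(Mask) ? NodeKind::TargetUnpackLo
                              : NodeKind::GenericShuffle;
}
```

If tblgen can generate isUnpackLoMask-style predicates from a mask
description, either function body is trivial to generate as well.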

> Yes, I certainly agree that tblgen should generate the shuffle
> matching code, I just think that the generated code should be executed
> in the LegalizeOp(shuffle vector) code.

I guess as long as it's automatic I don't particularly care where it
gets done.  :)

Is this plan written up anywhere?  I was really confused by the changes
in x86 isel and it would have been nice to have something to look at.

                                 -Dave