[LLVMdev] Long-Term ISel Design

Fri Mar 18 13:19:20 PDT 2011

On Mar 17, 2011, at 9:32 AM, David A. Greene wrote:
> Chris Lattner <clattner at apple.com> writes:
>>> 1. We have special target-specific operators for certain shuffles in X86,
>>>  such as X86unpckl.
> 
>> It also eliminates a lot of fragility.  Before doing this, X86
>> legalize would have to be very careful to specifically form shuffles
>> that it knew isel would turn into (e.g.) unpck operations.  Now
>> instead of forming specific carefully constructed shuffle masks (not
>> making sure other code doesn't violate them) it can just directly form
>> the X86ISD node.
> 
> Right.  What I've presented would reverse this.  Rather than making
> Legalize have to know about what table-driven isel can and cannot do,
> have table-driven isel run first, see what it can do and then leave
> the rest for manual selection.
> 
> We would still keep the existing pre-table-driven-isel passes so we'd
> still have a chance to do some cleanup before the main table-driven
> isel.
> 
> Obviously a lot of details have to be worked out.

I'm not seeing how this is useful for shuffles.  Since tblgen doesn't generate table based matching for *any* shuffles, all of the matching code would end up as C++ code in X86ISelDagToDag, which would give us all of the problems we had before by moving to X86ISD nodes.

>>> 2. Sometimes DAGs are legal in some contexts but not others and it is a
>>>  pain to deal with.  A good example is VBROADCAST, where a <0,0,0,0>
>>>  shuffle is natively supported if the source vector is in memory.
>>>  Otherwise it's not legal and manual lowering is required.  In this
>>>  case the legality check is doing the DAG match by hand, replicating
>>>  what TableGen-produced code already does.
> 
>> Yes, this isn't good.  Instead, the shuffle should be legalized to
>> something that takes a pointer (memory operand).  That means that X86
>> isel would form the *fully legal* X86ISD node, and nothing would be
>> able to break it and it could never fail to match.
> 
> Well, it dopesn't _have_to_ form an X86ISD node.  I don't do that now.
> But it's fragile in the sense that no one else should mess with that
> piece of the DAG.

I don't consider that an acceptable approach.  There is no way to prevent something from CSE'ing a load away and "breaking" the dag, or moving an add between the nodes etc.  You're violating a design principle of selection dags.

> But the real point is that in forming the X86ISD node currently, I'm
> doing exaclty what the tblgen-generated code already does.  If the
> shuffle doesn't take a memory operand, then I have to lower it to
> something else.  Where I do that (before or after table-driven isel)
> doesn't matter.  I do the same work either way.  But by doing it after I
> avoid writing duplicate DAG matching code in the case where the operand
> is in memory.

I don't agree, and I don't see why this is as bad as you're saying.  The code creating these nodes is already target specific.  You seem to be objecting to this because it is easier to write .td files (which turn into generated isel code) than it is to write legalize code in C++.  If that is the problem you've identified, then why not make it possible to write legalize code in .td files?

>>> These two examples are related: we're duplicating functionality manually
>>> that's already available automatically.
>> 
>> Not sure what you mean by this.
> 
> I mean that in legalize/lowering we're massaging the DAG to get it into
> a state where tabel-driven isel can match it.  There is a lot of code
> like this:
> 
> if (shuffle_is_MOVL)
>  do_nothing_and_return
> 
> It's duplicating exactly the checks that the table-driven isel does
> later.  In the VBROADCASTSS/D case, it's doing an entire DAG match
> to check whether it's implementable with VBROADCASTSS/D.
> 
> Why not just let table-driven isel run first and take care of these
> checks just once?  If something doesn't match, we then know it needs
> manual lowering and selection.

I think that this is just because the current code is in a half converted state.  Bruno can say more, but the ultimate goal is to make ISD::SHUFFLE completely illegal on X86, so you'd never have this sort of thing.

> 
>>>                 legalize
>>>                    |
>>>                    V
>>>       manual lowering (X86ISelLowering)
>>>                    |
>>>                    V
>>>        manual isel (X86ISelDAGToDAG)
>>>                    |
>>>                    V
>>>   table-driven isel (.td files/X86GenDAGISel)
>>>                    |
>>>                    V
>>>     manual isel (some to-be-design piece)
>> 
> 
>> I'm not sure what you mean here.  Are you suggesting that these be
>> completely separate passes over the dag?  Why do "manual isel" and
>> "table driven isel" as separate passes?  If they are interlaced, then
>> how is this different than what we already have?
> 
> No, not as separate passes.  Right now we have code like this in
> X86ISelDAGToDAG:
> 
> X86DAGToDAGISel::Select(SDNode *Node) {
>  ...
>  switch (Opcode) {
>    ...do a bunch of manual selection...
>  }
>  // If we get here we didn't select manually.
>  SelectCode();  // Select via table-driven isel, abort if no match.
> }
> 
> What I'm proposing is that we do this:
> 
> X86DAGToDAGISel::Select(SDNode *Node) {
>  ...
>  switch (Opcode) {
>    ...do a bunch of manual selection, less than before...
>  }
> 
>  // If we get here we didn't select manually.
> 
>  result = SelectCode();  // Select via table-driven isel.
> 
>  if (result_is_good()) {
>    return;
>  }
> 
>  switch (Opcode) {
>    ...do a bunch of manual selection, some that used to be above and in
>       legalize/lowering...
>  }
> 
>  cannot_select_abort();
> }

Ok, much better than separate passes, thanks for the clarification!

-Chris