[LLVMdev] Multi-Instruction Patterns

Wed Sep 24 13:06:01 PDT 2008

On Wednesday 24 September 2008 12:20, Evan Cheng wrote:

> > I'm not seeing how this is "conceptually correct."  It's a vector extract,
> > not a subregister.  It's just that we want to reuse the same register.
>
> It is though. Sub-register is a machine specific concept. It means
> vector_extract can be modeled as subreg_extract on this machine.
> Nothing is wrong with that.

I didn't mean to imply anything was "wrong."  It just strikes me as kind of 
strange, in a mind-warping kind of way.  :)

> > Perhaps the answer is to add vector extract support to the coalescer, in
> > the same way you added subregister support.  I don't understand the
> > nitty gritty of that, though.
>
> I don't think that's a good idea. Conceptually vector_extract is very
> different from a move.

Ok, I agree.

> >> That leaves us with #1. I have added support to coalesce cross-class
> >> copies (see CrossClassJoin in SimpleRegisterCoalescing.cpp).
> >
> > Yep.
>
> As Dan pointed out, #2 is also a workable solution.

Yes, I like Dan's proposal.

> > What about getting tblgen support for the pattern in the .td file?  That
> > would be another way to tackle this and would open up a whole bunch of
> > other opportunities.  Instcombine could be entirely expressed as a set of
> > tblgen patterns, for example, which is desirable from a maintenance
> > perspective as well as for new development.  It's much easier to write
> > patterns than to go through all of the manual examination that currently
> > exists in instcombine.
>
> I don't think that would work. We still have to model the value as
> being produced by an instruction. So either we have to select it to a
> target specific instruction or a target independent instruction (i.e.
> extract_subreg).

That's right.  The pattern doesn't know whether the rest of the vector is going
to be used elsewhere, so we need dataflow information, and that implies it needs
to be done in the coalescer (or some other transformation pass).

> After thinking about this some more, I think #2 is a better solution.
> Adding XMM0_32 etc. teaches codegen that only the lower 32 bits of the
> registers are used. Perhaps this can open up additional optimization
> opportunities. On the other hand, adding these registers means
> introducing more aliasing, which has compile-time implications.

What about XMM0_64?  What about things like AVX, which applies the GPR aliasing
scheme to vectors?  I think this is the right way to go, but we need to do
things in a comprehensive way so we can expand as needed.
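
To make the aliasing concern concrete, here is a rough Python sketch (not LLVM
code).  It assumes each vector register forms a simple nesting chain
XMMn_32 < XMMn_64 < XMMn < YMMn.  XMM0_32 and XMM0_64 are the names floated
above; YMMn stands in for the AVX-width register, and build_super_map and
aliases are hypothetical helpers made up purely for illustration.

```python
def build_super_map(num_regs):
    """Map each register name to its immediate super-register (None at the top).

    Assumption: every width starts at bit 0 of the same physical register,
    so the registers in one chain are strictly nested.
    """
    super_of = {}
    for i in range(num_regs):
        chain = [f"XMM{i}_32", f"XMM{i}_64", f"XMM{i}", f"YMM{i}"]
        for sub, sup in zip(chain, chain[1:]):
            super_of[sub] = sup
        super_of[chain[-1]] = None
    return super_of


def aliases(reg, super_of):
    """Return every other register that overlaps `reg`."""
    result = set()
    # Walk up: every super-register contains `reg`.
    cur = super_of[reg]
    while cur is not None:
        result.add(cur)
        cur = super_of[cur]
    # Walk down: every register that has `reg` somewhere above it.
    for other in super_of:
        cur = super_of[other]
        while cur is not None:
            if cur == reg:
                result.add(other)
                break
            cur = super_of[cur]
    return result


regs = build_super_map(2)
for name in sorted(regs):
    print(name, "aliases", sorted(aliases(name, regs)))
```

In this model every width added to a chain gives each register already in that
chain one more alias to track, which is one way to picture the compile-time
cost, and it suggests generating these relationships from a single description
rather than listing them by hand.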

                                                   -Dave