[LLVMdev] Multi-Instruction Patterns
Evan Cheng
evan.cheng at apple.com
Wed Sep 24 10:20:48 PDT 2008
On Sep 24, 2008, at 9:15 AM, David Greene wrote:
>
>> 1. Treat these instructions as cross register class copies. The src
>> and dst classes are different (VR128 and FR32) but "compatible".
>
> This seems reasonable.
>
>> 2. Model it as extract_subreg which coalescer can eliminate.
>>
>> #2 is conceptually correct. The problem is 128 bit XMM0 is the same
>> register as 32 bit (or 64 bit) XMM0. So it's not possible to define
>> the super-register / sub-register relationship.
>
> I'm not seeing how this is "conceptually correct." It's a vector extract,
> not a subregister. It's just that we want to reuse the same register.
It is, though. Sub-register is a machine-specific concept. It means
vector_extract can be modeled as extract_subreg on this machine.
Nothing is wrong with that.
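
To make the extract_subreg idea concrete, here is a minimal toy sketch in C++. It is not LLVM's API: the types, the sub-register index kSubLow32, and the function coalesceExtracts are all hypothetical names chosen for illustration. It only shows the shape of the argument: once the extract is expressed as a sub-register read, a coalescer can delete the copy and rewrite later reads to refer to the source register's sub-register directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sub-register index: "low 32 bits of a 128-bit register".
constexpr int kSubLow32 = 1;

struct Operand {
    int vreg;    // virtual register number
    int subIdx;  // 0 = whole register, otherwise a sub-register index
};

enum class Opcode { ExtractSubreg, Use };

struct Instr {
    Opcode op;
    Operand dst;  // ignored for Use
    Operand src;
};

// Toy coalescer: delete every `dst = extract_subreg src:idx` and rewrite
// later reads of dst into reads of src:idx. Returns the number of copies
// eliminated. Assumes each register is defined once and read only after
// its definition; sub-index composition is not modeled.
std::size_t coalesceExtracts(std::vector<Instr>& code) {
    std::size_t removed = 0;
    for (std::size_t i = 0; i < code.size();) {
        if (code[i].op != Opcode::ExtractSubreg) {
            ++i;
            continue;
        }
        const Operand from = code[i].dst;
        const Operand to = code[i].src;
        assert(from.subIdx == 0 && "toy model: dst must be a whole register");
        code.erase(code.begin() + static_cast<std::ptrdiff_t>(i));
        // The instruction that followed the copy now sits at index i.
        for (std::size_t j = i; j < code.size(); ++j) {
            if (code[j].src.vreg == from.vreg) code[j].src = to;
        }
        ++removed;
    }
    return removed;
}
```

In this model, a copy of the low 32 bits of register 1 into register 2 followed by a read of register 2 collapses into a single read of register 1 with index kSubLow32.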
>
>
> Perhaps the answer is to add vector extract support to the coalescer, in
> the same way you added subregister support. I don't understand the
> nitty-gritty of that, though.
I don't think that's a good idea. Conceptually, vector_extract is very
different from a move.
>
>
>> That leaves us with #1. I have added support to coalesce cross-class
>> copies (see CrossClassJoin in SimpleRegisterCoalescing.cpp).
>
> Yep.
As Dan pointed out, #2 is also a workable solution.
>
>> Unfortunately, it breaks a few tests and I haven't had the time to
>> look into them. If that's done, we just need to add the concept of
>> "compatible register classes" and mark MOVPS2SSrr as a copy and *it
>> should just work*.
>
> What about getting tblgen support for the pattern in the .td file? That
> would be another way to tackle this and would open up a whole bunch of
> other opportunities. Instcombine could be entirely expressed as a set of
> tblgen patterns, for example, which is desirable from a maintenance
> perspective as well as for new development. It's much easier to write
> patterns than to go through all of the manual examination that currently
> exists in instcombine.
I don't think that would work. We still have to model the value as
being produced by an instruction. So either we have to select it to a
target-specific instruction or a target-independent instruction (i.e.
extract_subreg).

After thinking about this some more, I think #2 is a better solution.
Adding XMM0_32 etc. teaches codegen that only the lower 32 bits of the
registers are used. Perhaps this can open up additional optimization
opportunities. On the other hand, adding these registers means
introducing more aliasing, which has compile-time implications.
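
The aliasing cost can be illustrated with a small toy model in C++. This is a sketch under stated assumptions, not the X86 register description: PhysReg, makeRegs, rootOf, overlaps and countAliasPairs are hypothetical, and XMM1_32 is simply extrapolated from the XMM0_32 name above. Each sub-register added overlaps its containing register, so anything that walks alias sets has more pairs to consider.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct PhysReg {
    std::string name;
    int bits;      // register width in bits
    int superReg;  // index of the containing register, -1 if none
};

// Two 128-bit registers, optionally with hypothetical 32-bit sub-registers.
std::vector<PhysReg> makeRegs(bool withSubRegs) {
    std::vector<PhysReg> regs = {{"XMM0", 128, -1}, {"XMM1", 128, -1}};
    if (withSubRegs) {
        regs.push_back({"XMM0_32", 32, 0});
        regs.push_back({"XMM1_32", 32, 1});
    }
    return regs;
}

// Follow superReg links up to the outermost containing register.
int rootOf(const std::vector<PhysReg>& regs, int r) {
    assert(r >= 0 && static_cast<std::size_t>(r) < regs.size());
    while (regs[static_cast<std::size_t>(r)].superReg != -1) {
        r = regs[static_cast<std::size_t>(r)].superReg;
    }
    return r;
}

// In this toy model two registers overlap when they share the same
// outermost register (every sub-register here is the low part of its
// parent, so siblings always overlap too).
bool overlaps(const std::vector<PhysReg>& regs, int a, int b) {
    return rootOf(regs, a) == rootOf(regs, b);
}

// Count distinct overlapping pairs, a rough proxy for the extra aliasing
// bookkeeping that the added registers introduce.
std::size_t countAliasPairs(const std::vector<PhysReg>& regs) {
    std::size_t pairs = 0;
    const int n = static_cast<int>(regs.size());
    for (int a = 0; a < n; ++a) {
        for (int b = a + 1; b < n; ++b) {
            if (overlaps(regs, a, b)) ++pairs;
        }
    }
    return pairs;
}
```

With only XMM0 and XMM1 there are no overlapping pairs; adding one 32-bit view per register introduces one pair each, and further views (a 64-bit one, say) would add more per register.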
Evan
>
>
> -Dave
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev