[LLVMdev] Overlapping register classes
Evan Cheng
echeng at apple.com
Mon Mar 16 22:31:11 PDT 2009
On Mar 16, 2009, at 11:31 AM, Jakob Stoklund Olesen wrote:
> Dan Gohman <gohman at apple.com> writes:
>
>> On Mar 15, 2009, at 2:02 PM, Jakob Stoklund Olesen wrote:
>>> Am I misusing register classes, or is this simply functionality that
>>> has not been written yet? The existing backends seem to have only one
>>> register class per machine value type.
>>
>> The x86 backend has an example of a partial solution. The GR32
>> register class has a subset, GR32_, which is the registers in GR32
>> that support 8-bit subregs. Instructions that reference 8-bit subregs
>> are emitted with a copy (MOV32to32_) to put the value in a virtual
>> register of the needed class. This copy may then be optimized away
>> by subsequent passes.
>
> I missed this before (thanks, Eli). I tried adding the explicit move
> patterns, and at least it compiles correctly now:
>
> i1_ls:
> R0.H = HI(i1_l); R0.L = LO(i1_l);
> P0 = R0;
> R0.H = HI(i1_s); R0.L = LO(i1_s);
> R1 = B[P0] (Z);
> R2 = 1 (X);
> P0 = R0;
> R0 = R1 & R2;
> B[P0] = R0;
> RTS;
>
> The moves (P0 = R0) did not get optimized away by the register
> allocator. RALinScan::attemptTrivialCoalescing almost succeeded; it got
> as far as testing if the source register R0 is contained in the
> destination regclass (P). It isn't, so the move stayed in.
>
> The problem is that the source register is allocated before coalescing
> is attempted. The destination regclass does not backpropagate and
> so doesn't influence the allocation class.

The coalescer can coalesce cross-register-class copies, although that
support is not quite finished yet. Try -join-cross-class-copies.

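To make the ordering problem described above concrete, here is a minimal
C++ sketch. It is not LLVM code: register classes are modelled as plain
bitmasks, the register numbering (D regs in bits 0-7, P regs in bits 8-13)
is made up, and allocate() and copyIsTrivial() are hypothetical helpers.
It also assumes the source vreg's class contains both D and P registers.
Without a hint the allocator picks a D register and the copy survives;
with the copy's destination class passed back as a hint, the copy becomes
trivial.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: bit i set means physical register i is in the class.
using RegMask = std::uint32_t;

constexpr RegMask kDRegs = 0x00FFu; // assumed numbering: D regs are 0..7
constexpr RegMask kPRegs = 0x3F00u; // assumed numbering: P regs are 8..13

constexpr bool contains(RegMask rc, unsigned physReg) {
  return ((rc >> physReg) & 1u) != 0;
}

// Pick the lowest-numbered register in `allowed`, preferring registers that
// are also in `hint`. Returns -1 if `allowed` is empty.
int allocate(RegMask allowed, RegMask hint) {
  const RegMask preferred = allowed & hint;
  const RegMask pool = preferred != 0 ? preferred : allowed;
  for (unsigned r = 0; r < 32; ++r)
    if (contains(pool, r))
      return static_cast<int>(r);
  return -1;
}

// The test the trivial coalescer is described as performing: the copy can be
// dropped only if the already-allocated source register is legal in the
// destination class.
bool copyIsTrivial(int srcPhysReg, RegMask dstRC) {
  assert(srcPhysReg >= 0 && srcPhysReg < 32);
  return contains(dstRC, static_cast<unsigned>(srcPhysReg));
}
```

Calling allocate(kDRegs | kPRegs, 0) models allocating the source before
the copy is considered; allocate(kDRegs | kPRegs, kPRegs) models letting the
destination class influence the choice.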
>
> PBQP doesn't even attempt to remove a move unless source and
> destination regclasses are identical.
>
>> Right now the x86 target code has to explicitly spell out where
>> such copies are needed. It isn't a lot of trouble because there are
>> a small number of situations where copies are needed. From your
>> description, it sounds like this would be much more significant on
>> blackfin. Handling this automatically seems possible, though this
>> is functionality that has not been written yet.
>
> Yes, inserting explicit patterns everywhere would make a complete mess
> of my InstrInfo.td. All arithmetic requires D-regs, and all
> load/stores require P-regs.
>
> It would be fairly simple to insert move instructions in the selection
> DAG after instruction selection is complete. I could do this in my
> InstructionSelect() as a first fix, but I think I would have to do
> something more clever eventually.
>
> I think a few tricks when creating vregs would go a long way:
>
> 1. If the def regclass is a subset of the operand regclass, there is no
>    problem. ScheduleDAGSDNodes::AddOperand should simply allow this
>    case.
>
> 2. If there is a regclass contained in the def regclass and all the
>    operand regclasses, change the vreg regclass to the intersection.
>    This could be a bad idea if there are many uses with different
>    regclasses.
>
> 3. If def and operand regclasses are disjoint, a move is necessary. It
>    should be possible to produce an abstract vreg-vreg copy instruction
>    that changes the regclass. The copy instruction would eventually
>    become a copyRegToReg() call after registers are allocated.

Sure. These tricks can be added on demand. Patches welcome.

Evan

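For reference, the three cases in the list above could be sketched roughly
as follows. This is a standalone C++ illustration, not LLVM code: register
classes are modelled as bitmasks, the names RegClassMask, VRegAction and
reconcile() are invented for the example, and the register numbering is an
assumption. A real target would also need the intersection to exist as a
named register class, which this sketch does not check.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: bit i set means physical register i is in the class.
using RegClassMask = std::uint32_t;

constexpr RegClassMask kDRegs = 0x00FFu; // assumed numbering: D regs are 0..7
constexpr RegClassMask kPRegs = 0x3F00u; // assumed numbering: P regs are 8..13

enum class VRegAction { UseAsIs, NarrowToIntersection, InsertCopy };

struct VRegDecision {
  VRegAction action;
  RegClassMask regClass; // class the defining vreg ends up in
};

// Decide what to do with a vreg given the class of its def and the classes
// required by its uses.
VRegDecision reconcile(RegClassMask defRC,
                       const std::vector<RegClassMask> &useRCs) {
  assert(defRC != 0 && "def class must contain at least one register");
  RegClassMask common = defRC;
  for (RegClassMask useRC : useRCs)
    common &= useRC;

  // Case 1: every register of the def class is legal for every use.
  if (common == defRC)
    return {VRegAction::UseAsIs, defRC};
  // Case 2: some registers satisfy the def and all uses; narrow the vreg.
  if (common != 0)
    return {VRegAction::NarrowToIntersection, common};
  // Case 3: no register satisfies the def and all uses, so at least one use
  // needs a class-changing copy. Simplification: this does not say which.
  return {VRegAction::InsertCopy, defRC};
}
```

With these made-up masks, a D-class def feeding a use that accepts D or P
is case 1, a D-or-P def feeding a D-only use is case 2, and a D-class def
feeding a P-only use is case 3.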
>
>> Also, the register allocator and associated passes don't yet know
>> how to handle register classes like this. For example, many
>> architectures like this have an add instruction that can add two
>> address registers, and one that can add two data registers, but
>> not one that can directly add an address register and a data
>> register. In this case, if one operand of an add is in a known class,
>> it may be desirable to allocate the other operand in the same
>> class (in simple cases). In LLVM, this is functionality that is not
>> yet written.
>
> I have the exact same problem on blackfin. I can add D=D+D or P=P+P,
> but no combinations. The same goes for post-modify store: I can have
> base+offset as P+P or I+M, where I and M are further register classes
> I didn't tell you about.
>
> One way of handling this would be to mark an instruction with a list of
> alternative instructions. The alternatives are functionally identical,
> but with different operand and result regclasses. Ideally the register
> allocator would choose the best alternative. However, this is a rather
> big change in the problem definition for the register allocator. I am
> going to ignore this issue for now and live with a few redundant
> register copies.
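
The "list of alternatives" idea could look something like the C++ sketch
below. It is only an illustration under stated assumptions: the opcode
names, the RC enum, and the helpers copiesNeeded() and pickAlternative()
are all invented, and the choice is a simple greedy one made after operand
classes are already known. That is much weaker than having the register
allocator choose, which is what the paragraph above actually asks for.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical register classes, named after the ones mentioned in the mail.
enum class RC { D, P, I, M };

// One functionally identical variant of an instruction.
struct Alternative {
  std::string opcode;       // made-up opcode name
  RC result;                // class of the result register
  std::vector<RC> operands; // class required for each operand
};

// Count the register copies needed to use `alt` when the operands currently
// live in classes `have` and the result is wanted in class `wantResult`.
std::size_t copiesNeeded(const Alternative &alt, const std::vector<RC> &have,
                         RC wantResult) {
  assert(alt.operands.size() == have.size());
  std::size_t copies = (alt.result != wantResult) ? 1 : 0;
  for (std::size_t i = 0; i < have.size(); ++i)
    if (alt.operands[i] != have[i])
      ++copies;
  return copies;
}

// Pick the alternative with the fewest copies; the first one wins ties.
Alternative pickAlternative(const std::vector<Alternative> &alts,
                            const std::vector<RC> &have, RC wantResult) {
  assert(!alts.empty());
  const Alternative *best = &alts.front();
  for (const Alternative &alt : alts)
    if (copiesNeeded(alt, have, wantResult) <
        copiesNeeded(*best, have, wantResult))
      best = &alt;
  return *best;
}

// Made-up alternatives for an add that exists as D=D+D and P=P+P.
std::vector<Alternative> hypotheticalAdds() {
  return {{"ADD_DDD", RC::D, {RC::D, RC::D}},
          {"ADD_PPP", RC::P, {RC::P, RC::P}}};
}
```

For two operands already in P registers with a P result wanted,
pickAlternative() selects the P=P+P form with no copies; for mixed operands
it falls back to whichever form needs fewer copies.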