[LLVMdev] Overlapping register classes

Evan Cheng echeng at apple.com
Mon Mar 16 22:31:11 PDT 2009


On Mar 16, 2009, at 11:31 AM, Jakob Stoklund Olesen wrote:

> Dan Gohman <gohman at apple.com> writes:
>
>> On Mar 15, 2009, at 2:02 PM, Jakob Stoklund Olesen wrote:
>>> Am I misusing register classes, or is this simply functionality that
>>> has not been written yet? The existing backends seem to have only one
>>> register class per machine value type.
>>
>> The x86 backend has an example of a partial solution.  The GR32
>> register class has a subset, GR32_, which is the registers in GR32
>> that support 8-bit subregs.  Instructions that reference 8-bit subregs
>> are emitted with a copy (MOV32to32_) to put the value in a virtual
>> register of the needed class.  This copy may then be optimized away
>> by subsequent passes.
>
> I missed this before (thanks, Eli).  I tried adding the explicit move
> patterns, and at least it compiles correctly now:
>
> i1_ls:
> 	R0.H = HI(i1_l); R0.L = LO(i1_l);
> 	P0 = R0;
> 	R0.H = HI(i1_s); R0.L = LO(i1_s);
> 	R1 = B[P0] (Z);
> 	R2 = 1 (X);
> 	P0 = R0;
> 	R0 = R1 & R2;
> 	B[P0] = R0;
> 	RTS;
>
> The moves (P0 = R0) did not get optimized away by the register
> allocator.  RALinScan::attemptTrivialCoalescing almost succeeded; it got
> as far as testing if the source register R0 is contained in the
> destination regclass (P).  It isn't, so the move stayed in.
>
> The problem is that the source register is allocated before coalescing
> is attempted.  The destination regclass does not backpropagate and
> so doesn't influence the allocation class.

The coalescer has the capability to coalesce cross register class
copies, but that support is not quite finished. Try
-join-cross-class-copies.
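
For illustration, the containment test that kept the move in place can be
sketched roughly as below. This is not the RALinScan code; a register class
is modelled as a plain std::set, and the register numbers and the name
canTriviallyCoalesce are made up for the example.

```cpp
#include <cassert>
#include <set>

// Hypothetical model: a register class is a set of physical register numbers.
using RegClass = std::set<unsigned>;

// Illustrative register numbers only, not the real Blackfin encoding.
constexpr unsigned R0 = 0, R1 = 1, R2 = 2;
constexpr unsigned P0 = 10, P1 = 11, P2 = 12;

// A copy "dst = src" can be folded away after allocation only if the
// already allocated source register is itself a member of the class
// required by the destination.
bool canTriviallyCoalesce(unsigned srcPhysReg, const RegClass &dstClass) {
  assert(!dstClass.empty() && "a register class should contain registers");
  return dstClass.count(srcPhysReg) != 0;
}
```

With P = {P0, P1, P2}, canTriviallyCoalesce(R0, P) is false, which is the
"P0 = R0" case from the listing: R0 was picked before the copy was looked
at, so the test can only fail.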

>
> PBQP doesn't even attempt to remove a move unless source and
> destination regclasses are identical.
>
>> Right now the x86 target code has to explicitly spell out where
>> such copies are needed.  It isn't a lot of trouble because there are
>> a small number of situations where copies are needed.  From your
>> description, it sounds like this would be much more significant on
>> blackfin.  Handling this automatically seems possible, though this
>> is functionality that has not been written yet.
>
> Yes, inserting explicit patterns everywhere would make a complete mess
> of my InstrInfo.td.  All arithmetic requires D-regs, and all
> load/stores require P-regs.
>
> It would be fairly simple to insert move instructions in the selection
> DAG after instruction selection is complete.  I could do this in my
> InstructionSelect() as a first fix, but I think I would have to do
> something more clever eventually.
>
> I think a few tricks when creating vregs would go a long way:
>
> 1. If the def regclass is a subset of the operand regclass, there is no
>   problem.  ScheduleDAGSDNodes::AddOperand should simply allow this
>   case.
>
> 2. If there is a regclass contained in the def regclass and all the
>   operand regclasses, change the vreg regclass to the intersection.
>   This could be a bad idea if there are many uses with different
>   regclasses.
>
> 3. If def and operand regclasses are disjoint, a move is necessary.  It
>   should be possible to produce an abstract vreg-vreg copy instruction
>   that changes the regclass.  The copy instruction would eventually
>   become a copyRegToReg() call after registers are allocated.

Sure. These tricks can be added on demand. Patches welcome.
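
A rough sketch of the three cases, to make the decision explicit. It is not
LLVM code: register classes are modelled as std::set, only one use is
considered at a time, and the raw set intersection stands in for "a regclass
contained in both" (a real target would need that intersection to exist as a
declared class). The names Action, Decision and reconcile are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>

// Hypothetical model: a register class is a set of physical register numbers.
using RegClass = std::set<unsigned>;

enum class Action { UseDefClass, NarrowToIntersection, InsertCopy };

struct Decision {
  Action action;
  RegClass vregClass; // class to give the vreg; empty for InsertCopy
};

// Decide how to connect a def of class defClass to a use wanting useClass.
Decision reconcile(const RegClass &defClass, const RegClass &useClass) {
  assert(!defClass.empty() && !useClass.empty());

  // Case 1: the def class is a subset of the use class, nothing to do.
  if (std::includes(useClass.begin(), useClass.end(),
                    defClass.begin(), defClass.end()))
    return {Action::UseDefClass, defClass};

  // Case 2: the classes overlap, so restrict the vreg to the common part.
  RegClass common;
  std::set_intersection(defClass.begin(), defClass.end(),
                        useClass.begin(), useClass.end(),
                        std::inserter(common, common.begin()));
  if (!common.empty())
    return {Action::NarrowToIntersection, common};

  // Case 3: disjoint classes, a class-changing copy is required.
  return {Action::InsertCopy, {}};
}
```

For the x86 example, a GR32_ def feeding a GR32 use hits case 1, a GR32 def
feeding a GR32_ use hits case 2, and a D-reg def feeding a P-reg use on
blackfin hits case 3.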

Evan

>
>> Also, the register allocator and associated passes don't yet know
>> how to handle register classes like this.  For example, many
>> architectures like this have an add instruction that can add two
>> address registers, and one that can add two data registers, but
>> not one that can directly add an address register and a data
>> register.  In this case, if one operand of an add is in a known class,
>> it may be desirable to allocate the other operand in the same
>> class (in simple cases).  In LLVM, this is functionality that is not
>> yet written.
>
> I have the exact same problem on blackfin.  I can add D=D+D or P=P+P,
> but no combinations.  The same goes for post-modify store: I can have
> base+offset as P+P or I+M, where I and M are further register classes I
> didn't tell you about.
>
> One way of handling this would be to mark an instruction with a list of
> alternative instructions.  The alternatives are functionally identical,
> but with different operand and result regclasses.  Ideally the register
> allocator would choose the best alternative.  However, this is a rather
> big change in the problem definition for the register allocator.  I am
> going to ignore this issue for now and live with a few redundant
> register copies.
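
The "list of alternatives" idea can be pictured with a small sketch. It
assumes a purely local, greedy choice that counts class mismatches, which is
much simpler than letting the register allocator decide; the opcode names
and the Alternative, copiesNeeded and pickAlternative names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Only the two blackfin classes discussed above are modelled.
enum class RC { D, P };

// One functionally identical variant of an instruction.
struct Alternative {
  std::string name; // hypothetical opcode name
  RC lhs, rhs, result;
};

// Number of copies needed if the operands currently live in (lhs, rhs)
// and the result is wanted in class result.
unsigned copiesNeeded(const Alternative &alt, RC lhs, RC rhs, RC result) {
  return (alt.lhs != lhs) + (alt.rhs != rhs) + (alt.result != result);
}

// Index of the alternative needing the fewest copies; ties go to the
// earliest entry in the list.
std::size_t pickAlternative(const std::vector<Alternative> &alts,
                            RC lhs, RC rhs, RC result) {
  assert(!alts.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < alts.size(); ++i)
    if (copiesNeeded(alts[i], lhs, rhs, result) <
        copiesNeeded(alts[best], lhs, rhs, result))
      best = i;
  return best;
}
```

With the two adds listed as D=D+D and P=P+P, operands already in P-regs
select the second entry with no copies, while a mixed D+P add still costs
at least one copy whichever entry is chosen.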
>
