[LLVMdev] Overlapping register classes
Jakob Stoklund Olesen
stoklund at 2pi.dk
Mon Mar 16 23:58:35 PDT 2009
Evan Cheng <echeng at apple.com> writes:
> On Mar 16, 2009, at 11:31 AM, Jakob Stoklund Olesen wrote:
>> The problem is that the source register is allocated before coalescing
>> is attempted. The destination regclass does not backpropagate and
>> so doesn't influence the allocation class.
>
> The coalescer has the capability to coalesce cross register class
> copies. It's not quite done. Try -join-cross-class-copies.
That did the trick! Now my trivial example becomes:
i1_ls:
        P0.H = HI(i1_l); P0.L = LO(i1_l);
        P1.H = HI(i1_s); P1.L = LO(i1_s);
        R0 = B[P0] (Z);
        R1 = 1 (X);
        R0 = R0 & R1;
        B[P1] = R0;
        RTS;
The inserted copies are gone.
>> 1. If the def regclass is a subset of the operand regclass, there is
>> no problem. ScheduleDAGSDNodes::AddOperand should simply allow this
>> case.
>>
>> 2. If there is a regclass contained in the def regclass and all the
>> operand regclasses, change the vreg regclass to the intersection.
>> This could be a bad idea if there are many uses with different
>> regclasses.
>>
>> 3. If def and operand regclasses are disjoint, a move is necessary.
>> It should be possible to produce an abstract vreg-vreg copy
>> instruction that changes the regclass. The copy instruction would
>> eventually become a copyRegToReg() call after registers are
>> allocated.
>
> Sure. These tricks can be added by demand. Patches welcome.
I will look into it. It doesn't feel right to insert moves everywhere
and hope the coalescer will remove them again.
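
To make the three cases quoted above concrete, here is a rough sketch in
Python. It is a toy model, not LLVM code: a register class is just a set
of register names, classify_operand is a made-up helper, and the D/P
classes are only loosely modelled on the Blackfin data and pointer
registers. A real target would also need a named regclass that fits
inside the intersection for case 2.

```python
# Toy model: a register class is a frozenset of physical register names.
# Nothing here is an LLVM API; all names are hypothetical.

D = frozenset(f"R{i}" for i in range(8))   # assumed data registers
P = frozenset(f"P{i}" for i in range(6))   # assumed pointer registers
DP = D | P                                 # assumed combined class


def classify_operand(def_rc, use_rc):
    """Decide what to do with a vreg defined in def_rc and used as use_rc."""
    if def_rc <= use_rc:
        # Case 1: every register the def may get is legal for the use.
        return ("keep", def_rc)
    common = def_rc & use_rc
    if common:
        # Case 2: constrain the vreg to the intersection of the classes.
        return ("constrain", common)
    # Case 3: disjoint classes, so a cross-class copy is needed.
    return ("copy", use_rc)


print(classify_operand(D, DP)[0])   # keep
print(classify_operand(DP, D)[0])   # constrain
print(classify_operand(D, P)[0])    # copy
```

With several uses in different regclasses, case 2 would be applied once
per use, and the class can only shrink each time.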
I am not completely sure that option 2 would be a good idea. Changing
the vreg to a smaller regclass leaves fewer registers to choose from and
so increases the register pressure. For instance, in the X86 backend you
could change the pattern:
  def : Pat<(and GR32:$src1, 0xff),
            (MOVZX32rr8 (i8 (EXTRACT_SUBREG (MOV32to32_ GR32:$src1),
                                            x86_subreg_8bit)))>
into:
  def : Pat<(and GR32_:$src1, 0xff),
            (MOVZX32rr8 (i8 (EXTRACT_SUBREG GR32_:$src1,
                                            x86_subreg_8bit)))>
If option 2 above were implemented, the vreg representing $src1 would be
forced into GR32_. If it is live for a long time, that might not be a
good thing.
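
A back-of-the-envelope illustration of the pressure concern, again as a
toy Python model. The register sets are assumptions for the example:
GR32_ is taken to be the four registers with 8-bit subregisters in
32-bit mode, GR32 is shown without ESP, and forced_spills is a made-up
helper that only gives a lower bound.

```python
# Toy model of register pressure; the register sets are assumptions.
GR32 = frozenset({"EAX", "ECX", "EDX", "EBX", "ESI", "EDI", "EBP"})
GR32_ = frozenset({"EAX", "ECX", "EDX", "EBX"})  # assumed 8-bit-capable subset


def forced_spills(num_live, regclass):
    """Lower bound on spills when num_live vregs of one class are live at once."""
    return max(0, num_live - len(regclass))


print(forced_spills(6, GR32))    # 0
print(forced_spills(6, GR32_))   # 2
```

Six simultaneously live values fit in the larger class but not in the
smaller one, which is why a long-lived vreg forced into GR32_ could hurt.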
Is the register allocator able to insert moves in this case, as a kind
of cross-class spilling?