[LLVMdev] Register coalescing (Subregs and SuperRegs)
Pranav Bhandarkar
pranavb at codeaurora.org
Mon May 14 16:15:50 PDT 2012
Hi,
Consider this MachineInstr (MI) code from the Hexagon backend.
------------------------------------------------------------------
16B     %vreg0<def> = COPY %R0<kill>; IntRegs:%vreg0
32B     %vreg1<def> = LDriw %vreg0, 0; mem:LD4[%a] IntRegs:%vreg1,%vreg0
48B     %vreg2<def> = LDriw_indexed %vreg0<kill>, 4; mem:LD4[%add.ptr] IntRegs:%vreg2,%vreg0
64B     %vreg7<def> = COMBINE_rr %vreg2<kill>, %vreg1<kill>; DoubleRegs:%vreg7 IntRegs:%vreg2,%vreg1
80B     %D0<def> = COPY %vreg7<kill>; DoubleRegs:%vreg7
------------------------------------------------------------------
LDriw and LDriw_indexed load 32-bit words, so %vreg1 and %vreg2 are both
32-bit virtual registers. Hexagon can pair an even-numbered register with the
next odd-numbered register to form a 64-bit register.
For instance, physical registers R0 and R1 form the register pair R1:R0, and
R2 and R3 form R3:R2. The odd-numbered register holds the upper 32 bits and
the even-numbered register holds the lower 32 bits.
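To make the pairing rule concrete, here is a minimal C++ sketch. It is not LLVM code: the names PairSlot and pairSlotOf are made up for illustration, and numbering the pairs so that pair k covers R(2k+1):R(2k) is an assumption of the sketch.

```cpp
#include <cassert>

// Hypothetical model of the even/odd pairing described above.
// Assumption: pair k is made of R(2k) (low half) and R(2k+1) (high half).
struct PairSlot {
  unsigned pairIndex; // index of the 64-bit pair that contains the register
  bool isHigh;        // true if the register holds the upper 32 bits
};

// Map a 32-bit register number to its pair and to its position in the pair.
constexpr PairSlot pairSlotOf(unsigned reg) {
  return PairSlot{reg / 2, (reg % 2) == 1};
}

// R2 is the low half and R3 the high half of the same pair (R3:R2).
static_assert(pairSlotOf(2).pairIndex == pairSlotOf(3).pairIndex,
              "R2 and R3 belong to the same pair");
static_assert(!pairSlotOf(2).isHigh && pairSlotOf(3).isHigh,
              "even register is the low half, odd register the high half");
```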
Now consider the COMBINE_rr instruction:
------------------------------------------------------------------
%vreg7<def> = COMBINE_rr %vreg2<kill>, %vreg1<kill>; DoubleRegs:%vreg7 IntRegs:%vreg2,%vreg1
------------------------------------------------------------------
It creates a 64-bit vreg by making %vreg2 the high word and %vreg1 the
low word of the DoubleRegs register.
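At the value level, the effect of COMBINE_rr as described above can be sketched in C++ as follows. This only models the arithmetic, not the instruction itself; the helper names combineWords, highWord and lowWord are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Value-level model of COMBINE_rr: the first operand becomes the high
// 32 bits of the result and the second operand becomes the low 32 bits.
constexpr std::uint64_t combineWords(std::uint32_t high, std::uint32_t low) {
  return (static_cast<std::uint64_t>(high) << 32) | low;
}

// Inverse helpers: read back the two halves of a 64-bit value.
constexpr std::uint32_t highWord(std::uint64_t v) {
  return static_cast<std::uint32_t>(v >> 32);
}
constexpr std::uint32_t lowWord(std::uint64_t v) {
  return static_cast<std::uint32_t>(v);
}

// Combining two words and splitting the result returns the original words.
static_assert(highWord(combineWords(0xAAAAAAAAu, 0x55555555u)) == 0xAAAAAAAAu,
              "high word round-trips");
static_assert(lowWord(combineWords(0xAAAAAAAAu, 0x55555555u)) == 0x55555555u,
              "low word round-trips");
```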
The optimization opportunity here is that if %vreg2 and %vreg1 are allocated
the right registers (an odd register for %vreg2 and the even register just
below it for %vreg1), then the COMBINE_rr instruction becomes redundant. For
instance, if %vreg2 is allocated the physical register R3 and %vreg1 is
allocated R2, then %vreg7 can simply be the register pair R3:R2, i.e. %D1 in
the Hexagon backend. My question: is this possible in the current setup of
the register coalescer and the register allocator? Or is there some target
hook that would let me inform the register coalescer or the allocator?
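The condition under which the COMBINE_rr would become a no-op can be written down as a small C++ predicate. This is only a sketch of the check I have in mind, not an existing LLVM hook: combineIsRedundant is a made-up name, and it assumes pair k is formed from R(2k+1):R(2k).

```cpp
#include <cassert>

// Hypothetical check: after allocation, is "dst = combine(high, low)" a no-op?
// destPair is the index of the 64-bit pair assigned to the result;
// highReg and lowReg are the 32-bit registers assigned to the two operands.
// Assumption: pair k consists of R(2k) (low half) and R(2k+1) (high half).
constexpr bool combineIsRedundant(unsigned destPair, unsigned highReg,
                                  unsigned lowReg) {
  return lowReg == 2 * destPair && highReg == 2 * destPair + 1;
}

// %vreg2 in R3, %vreg1 in R2, result in the pair R3:R2: nothing left to do.
static_assert(combineIsRedundant(1, 3, 2), "R3:R2 already holds the result");
// Operands swapped, or result in another pair: the combine is still needed.
static_assert(!combineIsRedundant(1, 2, 3), "halves are in the wrong order");
static_assert(!combineIsRedundant(0, 3, 2), "result lives in a different pair");
```

In the example at the top, the result is finally copied into %D0, so the ideal assignment would also have to make that last COPY trivial.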
@Jakob: I noticed your commit last week regarding
TRI::getCommonSuperRegClass(). Can that have a role to play here?
FWIW, the relevant patterns for COMBINE_rr are shown below.
------------------------------------------------------------------
// Combine.
let isPredicable = 1, neverHasSideEffects = 1 in
def COMBINE_rr : ALU32_rr<(outs DoubleRegs:$dst),
                          (ins IntRegs:$src1, IntRegs:$src2),
                          "$dst = combine($src1, $src2)",
                          []>;

def: Pat<(i64 (or (i64 (shl (i64 DoubleRegs:$srcHigh),
                            (i32 32))),
                  (i64 DoubleRegs:$srcLow))),
         (i64 (COMBINE_rr (EXTRACT_SUBREG (i64 DoubleRegs:$srcHigh),
                                          subreg_loreg),
                          (EXTRACT_SUBREG (i64 DoubleRegs:$srcLow),
                                          subreg_loreg)))>;
------------------------------------------------------------------
Thanks,
Pranav
Qualcomm Innovation Center, (QuIC) is a member of the Code Aurora Forum.