[LLVMdev] Strong vs. default phi elimination and single-reg classes

Hal Finkel hfinkel at anl.gov
Thu Jun 7 19:31:52 PDT 2012


Hello again,

I am trying to implement an optimization pass for PowerPC such that
simple loops use the special "counter register" (CTR) to track the
induction variable. This is helpful because, in addition to reducing
register pressure, there is a combined decrement-compare-and-branch
instruction, BDNZ (there are also other related instructions).
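
For readers unfamiliar with the instruction, the following C++ sketch
models what a loop ending in a decrement-and-branch-if-nonzero
instruction does: the trip count is moved into the counter once, and
each iteration ends by decrementing it and branching back while it is
nonzero. This only illustrates the semantics and is not code from the
pass. The function name runCountedLoop and the local variable standing
in for a global are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: runs a loop body tripCount times the way a
// CTR-based loop would, and returns the accumulated value.
int runCountedLoop(std::uint64_t tripCount, int addend) {
    // A decrement-then-test loop always executes its body at least once.
    assert(tripCount >= 1 && "trip count must be at least 1");

    volatile int a = 0;             // stand-in for a volatile global
    std::uint64_t ctr = tripCount;  // corresponds to moving the count into CTR
    do {
        a = a + addend;             // loop body: load, add, store
    } while (--ctr != 0);           // decrement CTR, branch back if nonzero
    return a;
}
```

With a trip count of 2048 (the constant loaded by LI in the dumps
below) and an addend of 1, runCountedLoop(2048, 1) returns 2048. No
separate induction variable or compare is needed inside the loop, which
is where the register-pressure saving comes from.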

I started this process by converting the Hexagon Hardware-Loops pass to
work with analogous PPC instructions. This worked fairly well, but
left a bunch of unneeded unconditional branches. To fix this, I added
support for the new branch instructions to AnalyzeBranch (and to
InsertBranch and RemoveBranch). Unfortunately, this really broke
things: the branch instructions (which both use and define the count
register) were moved around in invalid ways, causing both compile-time
(live-out assertions) and run-time failures. Instead of trying to track
down these problems, I thought it would be better to use a register
class instead of the physical register. This register class has only
one register (the count register), and because of how the loops pass is
set up, spilling this register is never necessary.

Here's the problem: If I use strong phi elimination, then this works,
and the CTR register is allocated as it should be. When using the
default phi elimination, extra copies are introduced (which I don't
completely understand), and the register allocator tries to spill the
count register. As a simple example, with strong phi elimination I get:


BB#0: derived from LLVM BB %entry
            Live Ins: %X3
%vreg2<def> = COPY %X3<kill>; G8RC:%vreg2
%vreg4<def> = LI 2048; GPRC:%vreg4
%vreg3<def> = OR8To4 %vreg2<kill>, %vreg2; GPRC:%vreg3 G8RC:%vreg2
%vreg9<def> = COPY %vreg4<kill>; GPRC:%vreg9,%vreg4
%vreg10<def> = RLDICL %vreg9<kill>, 0, 32; GPRC:%vreg10,%vreg9
%vreg11<def> = MTCTR8r %vreg10<kill>; CTRRC8:%vreg11 GPRC:%vreg10
            Successors according to CFG: BB#1

112B    BB#1: derived from LLVM BB %for.body, ADDRESS TAKEN
            Predecessors according to CFG: BB#0 BB#1
%vreg12<def> = PHI %vreg13, <BB#1>, %vreg11, <BB#0>; CTRRC8:%vreg12,%vreg13,%vreg11
%vreg5<def> = LDtoc <ga:@a>, %X2; G8RC:%vreg5
%vreg6<def> = LWZ 0, %vreg5; mem:Volatile LD4[@a](tbaa=!"int") GPRC:%vreg6 G8RC:%vreg5
%vreg7<def> = ADD4 %vreg6<kill>, %vreg3; GPRC:%vreg7,%vreg6,%vreg3
STW %vreg7<kill>, 0, %vreg5<kill>; mem:Volatile ST4[@a](tbaa=!"int") GPRC:%vreg7 G8RC:%vreg5
%vreg13<def> = COPY %vreg12<kill>; CTRRC8:%vreg13,%vreg12
%vreg13<def> = BDNZ8 %vreg13, <BB#1>; CTRRC8:%vreg13
B <BB#2>
            Successors according to CFG: BB#2 BB#1

but with default phi elimination I get:

0B      BB#0: derived from LLVM BB %entry
            Live Ins: %X3
%vreg2<def> = COPY %X3<kill>; G8RC:%vreg2
%vreg4<def> = LI 2048; GPRC:%vreg4
%vreg3<def> = OR8To4 %vreg2<kill>, %vreg2; GPRC:%vreg3 G8RC:%vreg2
%vreg9<def> = COPY %vreg4<kill>; GPRC:%vreg9,%vreg4
%vreg10<def> = RLDICL %vreg9<kill>, 0, 32; GPRC:%vreg10,%vreg9
%vreg11<def> = MTCTR8r %vreg10<kill>; CTRRC8:%vreg11 GPRC:%vreg10
%vreg14<def> = COPY %vreg11<kill>; CTRRC8:%vreg14,%vreg11
            Successors according to CFG: BB#1

128B    BB#1: derived from LLVM BB %for.body, ADDRESS TAKEN
            Predecessors according to CFG: BB#0 BB#1
%vreg12<def> = COPY %vreg14<kill>; CTRRC8:%vreg12,%vreg14
%vreg5<def> = LDtoc <ga:@a>, %X2; G8RC:%vreg5
%vreg6<def> = LWZ 0, %vreg5; mem:Volatile LD4[@a](tbaa=!"int") GPRC:%vreg6 G8RC:%vreg5
%vreg7<def> = ADD4 %vreg6<kill>, %vreg3; GPRC:%vreg7,%vreg6,%vreg3
STW %vreg7<kill>, 0, %vreg5<kill>; mem:Volatile ST4[@a](tbaa=!"int") GPRC:%vreg7 G8RC:%vreg5
%vreg14<def> = COPY %vreg13<kill>; CTRRC8:%vreg14,%vreg13
%vreg13<def> = COPY %vreg12<kill>; CTRRC8:%vreg13,%vreg12
%vreg13<def> = BDNZ8 %vreg13, <BB#1>; CTRRC8:%vreg13
B <BB#2>
            Successors according to CFG: BB#2 BB#1

It is those two extra copies at the end that seem to be the problem.
The register allocator believes that it needs two count registers, one
for vreg14 and one for vreg13 (thus the need to spill).
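
To make the conflict concrete, here is a minimal sketch that runs a
textbook backward liveness analysis over just the CTRRC8 operands of
the default-phi-elimination dump and reports the largest number of
CTRRC8 virtual registers live at the same point. It is a simplified
model and not how LLVM computes live intervals. The types Inst and
Block, the function names, and the single-vreg comparison loop are all
hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// One instruction, reduced to its CTRRC8 virtual-register operands.
struct Inst {
    int def;                // vreg number defined, or -1 if none
    std::vector<int> uses;  // vreg numbers read
};

struct Block {
    std::vector<Inst> insts;
    std::vector<std::size_t> succs;  // successor block indices
};

// CTRRC8 operands transcribed from the default-phi-elimination dump.
std::vector<Block> defaultPhiElimDump() {
    Block bb0{{Inst{11, {}},      // %vreg11 = MTCTR8r %vreg10
               Inst{14, {11}}},   // %vreg14 = COPY %vreg11
              {1}};
    Block bb1{{Inst{12, {14}},    // %vreg12 = COPY %vreg14
               Inst{14, {13}},    // %vreg14 = COPY %vreg13
               Inst{13, {12}},    // %vreg13 = COPY %vreg12
               Inst{13, {13}}},   // %vreg13 = BDNZ8 %vreg13
              {1}};               // back edge only; BB#2 reads no CTRRC8 vreg
    return {bb0, bb1};
}

// Hypothetical comparison case (not taken from either dump): the same
// loop if every copy were merged into one virtual register.
std::vector<Block> hypotheticalSingleVreg() {
    Block bb0{{Inst{11, {}}}, {1}};    // counter loaded once
    Block bb1{{Inst{11, {11}}}, {1}};  // decrement-and-branch in place
    return {bb0, bb1};
}

// Iterates backward liveness to a fixed point and returns the maximum
// number of vregs simultaneously live at any program point.
std::size_t maxCtrPressure(const std::vector<Block>& cfg) {
    std::vector<std::set<int>> liveIn(cfg.size());
    std::size_t maxLive = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t b = cfg.size(); b-- > 0;) {
            std::set<int> live;  // starts as live-out of block b
            for (std::size_t s : cfg[b].succs) {
                assert(s < cfg.size());
                live.insert(liveIn[s].begin(), liveIn[s].end());
            }
            const auto& insts = cfg[b].insts;
            for (auto it = insts.rbegin(); it != insts.rend(); ++it) {
                maxLive = std::max(maxLive, live.size());  // live after *it
                if (it->def >= 0) live.erase(it->def);
                live.insert(it->uses.begin(), it->uses.end());
            }
            maxLive = std::max(maxLive, live.size());      // live at entry
            if (live != liveIn[b]) {
                liveIn[b] = live;
                changed = true;
            }
        }
    }
    return maxLive;
}
```

In this model maxCtrPressure(defaultPhiElimDump()) returns 2, because
%vreg14 and %vreg13 are both live at the end of BB#1, while
maxCtrPressure(hypotheticalSingleVreg()) returns 1. A class with a
single physical register can satisfy only the second case without
spilling.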

Maybe turning on strong phi elimination by default would be a good
thing, but is there another solution here? Does this problem indicate
that I'm setting up my phi nodes (or register constraints, etc.)
incorrectly?

Thanks in advance,
Hal

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory


