Another problem with "Recommit r265547, and r265610,r265639,r265657"
Mikael Holmén via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 29 02:11:32 PDT 2016
Hi Wei,
On 04/28/2016 08:24 PM, Wei Mi wrote:
>> Ok, I tried to make LiveVariable to run again after RegisterCoalescer but
>> hit a fatal error in LiveVariables since we're not on SSA anymore:
>>
>> if (!MRI->isSSA())
>> report_fatal_error("regalloc=... not currently supported with -O0");
>>
>> However, I managed to bring out the big hammer and "repair" LIS at the end
>> of RegisterCoalescer, and then the code compiles successfully!
>
> That is an interesting result!
It is! Unfortunately, I saw that it broke several of the basic tests
(I think my "repair" function has previously only been used later
in the compiler, and when run as early as the coalescer it sometimes
breaks something), but at least it suggests that we can potentially
solve this by doing something in RegisterCoalescer.
>
>> However, I'm still curious... do you say that RegisterCoalescer should work
>> and keep LiveVariables properly updated even without kill/dead-flags? Or is
>> already the input to the coalescer broken since kill flags are missing?
>>
>> Is it a bug that LiveVariables removes the kill-flags when it's run
>> somewhere prior to RegisterCoalescer or is this ok?
>
> Before Live Interval Analysis
> 624B %vreg7<def> = COPY %vreg29<kill>; aN32_rN_Pairs:%vreg7,%vreg29
>
> After Live Interval Analysis
> 624B %vreg7<def> = COPY %vreg29; aN32_rN_Pairs:%vreg7,%vreg29
>
> I managed to reproduce the same behavior using foo.red.ll locally so
> now I can see how the kill flag is removed by Live Interval Analysis.
>
> It is done deliberately by LiveInterval Analysis in
> LiveRangeCalc::extendToUses. Here is the related comment:
> // Clear all kill flags. They will be reinserted after register allocation
> // by LiveIntervalAnalysis::addKillFlags().
>
> This looks reasonable because after LiveInterval is setup, following
> passes mostly depends on LiveInterval instead of dead/kill flags.
Ok!
>
>>
>> So when kill and dead flags should be present and who should update them is
>> still quite unclear for me.
>
> Operand kill and dead flags are initially set by LiveVariables
> Analysis. Seems only TwoAddressInstructionPass and Register Scavenger
> requires that. For the other regalloc related passes including
> RegisterCoalescer, they get dead and kill information directly using
> LiveInterval, but they may update Operand kill and dead flags. I guess
> at two points kill and dead flags should be properly set -- before
> TwoAddressInstructionPass and after regalloc - for scavenger. Other
> points, it is not mandatory.
Ok, thanks.
>
>>
>> And if the register allocator has assumptions about how the input code
>> should look, shouldn't there be some checks that those assumptions are
>> fulfilled rather than ending up with invalid live ranges after regalloc that
>> the verifier shouts about?
>
> I agree with you. Ideally verifier should catch the non-meaningful
> live interval about %vreg29 early -- after RegisterCoalescing when we
> turn on -verify-coalescing. Existing verifier doesn't have it maybe
> because that means to recompute and verify if current LiveInterval
> needs to be updated, and it is costly.
>
>
> A still important question is: what is wrong with the LiveInterval
> update in RegisterCoalescer. Although I can reproduce to know how the
> kill flag is cleaned in LiveInterval Analysis, I haven't reproduced the
> LiveInterval update problem in RegisterCoalescer successfully yet.
Yes, I've dug into this some more and I'm beginning to figure out
what the problem is.
In the input code there is a lot of copying between several vregs in the
infinite loop. Some value is materialized, and then it's just copied
around and around between several vregs and is never actually used. This
is exposed by the coalescer and yes, it fails to update the live range
accordingly in one step.
After a few vregs have been coalesced we have:
a0h
[0B,16r:0)[480r,512r:1)[656r,656d:5)[752r,768r:2)[880r,896r:3)[896r,896d:4)
0 at 0B-phi 1 at 480r 2 at 752r 3 at 880r 4 at 896r 5 at 656r
%vreg0 [64r,144r:0)[176B,208r:0) 0 at 64r
%vreg1 [224r,336B:0)[720B,976B:0) 0 at 224r
%vreg2 [208r,336B:0)[720B,976B:0) 0 at 208r
%vreg3 [304r,336B:0)[720B,976B:0) 0 at 304r
%vreg4 [288r,336B:0)[720B,976B:0) 0 at 288r
%vreg6 [560r,576r:0) 0 at 560r
%vreg7 [624r,688r:0) 0 at 624r
%vreg8 [16r,32r:0) 0 at 16r
%vreg9 [32r,128B:0)[176B,240r:0) 0 at 32r
%vreg10 [48r,64r:0) 0 at 48r
%vreg11 [80r,96r:0) 0 at 80r
%vreg12 [368r,400r:0) 0 at 368r
%vreg13 [512r,528r:0) 0 at 512r
%vreg14 [544r,560r:0) 0 at 544r
%vreg16 [192r,224r:0) 0 at 192r
%vreg17 [256r,288r:0) 0 at 256r
%vreg18 [272r,304r:0) 0 at 272r
%vreg19 [736r,944r:0) 0 at 736r
%vreg20 [768r,800r:0) 0 at 768r
%vreg21 [784r,816r:0) 0 at 784r
%vreg22 [800r,848r:0) 0 at 800r
%vreg23 [816r,864r:0) 0 at 816r
%vreg25 [928r,944r:0) 0 at 928r
%vreg26 [240r,256r:0) 0 at 240r
%vreg27 [528r,544r:0) 0 at 528r
%vreg29
[144r,176B:3)[336B,448B:0)[576r,608B:1)[608B,624r:2)[688r,720B:4)
0 at 336B-phi 1 at 576r 2 at 608B-phi 3 at 144r 4 at 688r
RegMasks:
********** MACHINEINSTRS **********
# Machine code for function f3 (#0): Properties: <Post SSA, tracking
liveness, HasVRegs>
Function Live Ins: %a0h in %vreg8
0B BB#0: derived from LLVM BB %0
Live Ins: %a0h
16B %vreg8<def> = COPY %a0h; aNh_0_7:%vreg8
32B %vreg9<def> = COPY %vreg8; aNh_0_7:%vreg9,%vreg8
48B %vreg10<def> = mv32Imm_pseudo 32768, pred:0, pred:%noreg, pred:0,
%ac0<imp-use>, %ac1<imp-use>; aN32_0_7:%vreg10
64B %vreg0<def> = COPY %vreg10; aN32_0_7:%vreg0,%vreg10
80B %vreg11<def> = mv_0_ar16_noLo 0, pred:0, pred:%noreg, pred:0,
%ac0<imp-use>, %ac1<imp-use>; aNh_0_7:%vreg11
96B cmp_nimm16_a16 %vreg11, 0, pred:0, pred:%noreg, pred:0,
%CCReg<imp-def>; aNh_0_7:%vreg11
112B brr_cond <BB#1>, pred:2, pred:%CCReg<kill>, pred:0
Successors according to CFG: BB#1 BB#6
128B BB#6:
Predecessors according to CFG: BB#0
144B %vreg29<def> = COPY %vreg0; aN32_rN_Pairs:%vreg29 aN32_0_7:%vreg0
160B brr_uncond <BB#2>
Successors according to CFG: BB#2(?%)
176B BB#1: derived from LLVM BB %bb24.preheader
Predecessors according to CFG: BB#0
192B %vreg16<def> = shfts_a32_nimm7_a32 %vreg0, -31, pred:0,
pred:%noreg, pred:0, %CCReg<imp-def,dead>, %cuc<imp-use>;
aN32_0_7:%vreg16,%vreg0
208B %vreg2<def> = COPY %vreg0; aN32_0_7:%vreg2,%vreg0
224B %vreg1<def> = COPY %vreg16; aN32_0_7:%vreg1,%vreg16
240B %vreg26<def> = mv_ar16_ar16_lo16In32 %vreg9, pred:0, pred:%noreg,
pred:0, %ac0<imp-use>, %ac1<imp-use>; aN32_0_7:%vreg26 aNh_0_7:%vreg9
256B %vreg17<def> = COPY %vreg26; aN32_0_7:%vreg17,%vreg26
272B %vreg18<def> = shfts_a32_nimm7_a32 %vreg17, -31, pred:0,
pred:%noreg, pred:0, %CCReg<imp-def,dead>, %cuc<imp-use>;
aN32_0_7:%vreg18,%vreg17
288B %vreg4<def> = COPY %vreg17; aN32_0_7:%vreg4,%vreg17
304B %vreg3<def> = COPY %vreg18; aN32_0_7:%vreg3,%vreg18
320B brr_uncond <BB#5>
Successors according to CFG: BB#5(?%)
336B BB#2: derived from LLVM BB %bb2
Predecessors according to CFG: BB#4 BB#6
368B %vreg12<def> = mv_0_ar16_noLo 0, pred:0, pred:%noreg, pred:0,
%ac0<imp-use>, %ac1<imp-use>; aNh_0_7:%vreg12
400B cmp_nimm16_a16 %vreg12, 0, pred:0, pred:%noreg, pred:0,
%CCReg<imp-def>; aNh_0_7:%vreg12
416B brr_cond <BB#4>, pred:3, pred:%CCReg<kill>, pred:0
432B brr_uncond <BB#3>
Successors according to CFG: BB#4 BB#3
448B BB#3: derived from LLVM BB %bb3
Predecessors according to CFG: BB#2
464B ADJCALLSTACKDOWN 0, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>
480B callr <ga:@f1>, pred:0, pred:%noreg, pred:0, %a0_40<imp-def,dead>,
%a1_40<imp-def,dead>, %sp<imp-def>, %CCReg<imp-def,dead>,
%cuc<imp-def,dead>, %sp<imp-use>, %cuc<imp-use>, %a0h<imp-def>, ...
496B ADJCALLSTACKUP 0, 0, pred:0, pred:%noreg, pred:0,
%sp<imp-def,dead>, %sp<imp-use>
512B %vreg13<def> = COPY %a0h; aNh_0_7:%vreg13
528B %vreg27<def> = mv_ar16_ar16_lo16In32 %vreg13, pred:0, pred:%noreg,
pred:0, %ac0<imp-use>, %ac1<imp-use>; aN32_0_7:%vreg27 aNh_0_7:%vreg13
544B %vreg14<def> = COPY %vreg27; aN32_0_7:%vreg14,%vreg27
560B %vreg6<def> = COPY %vreg14; aN32_rN_Pairs:%vreg6 aN32_0_7:%vreg14
576B %vreg29<def> = COPY %vreg6; aN32_rN_Pairs:%vreg29,%vreg6
592B brr_uncond <BB#4>
Successors according to CFG: BB#4(?%)
608B BB#4: derived from LLVM BB %bb13
Predecessors according to CFG: BB#2 BB#3
624B %vreg7<def> = COPY %vreg29; aN32_rN_Pairs:%vreg7,%vreg29
640B ADJCALLSTACKDOWN 0, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>
656B callr <ga:@f1>, pred:0, pred:%noreg, pred:0, %a0_40<imp-def,dead>,
%a1_40<imp-def,dead>, %sp<imp-def>, %CCReg<imp-def,dead>,
%cuc<imp-def,dead>, %sp<imp-use>, %cuc<imp-use>, ...
672B ADJCALLSTACKUP 0, 0, pred:0, pred:%noreg, pred:0,
%sp<imp-def,dead>, %sp<imp-use>
688B %vreg29<def> = COPY %vreg7; aN32_rN_Pairs:%vreg29,%vreg7
704B brr_uncond <BB#2>
Successors according to CFG: BB#2(?%)
720B BB#5: derived from LLVM BB %bb24
Predecessors according to CFG: BB#1 BB#5
736B %vreg19<def> = clearAcc32 pred:0, pred:%noreg, pred:0;
aN32_0_7:%vreg19
752B libcall_CRT_ll_div_r %vreg19, %vreg19, %vreg2, %vreg1,
%a0_32<imp-def>, %a1_32<imp-def>, %CCReg<imp-def,dead>, ...;
aN32_0_7:%vreg19,%vreg2,%vreg1
768B %vreg20<def> = COPY %a0_32; aN32_0_7:%vreg20
784B %vreg21<def> = COPY %a1_32<kill>; aN32_0_7:%vreg21
800B %vreg22<def> = xor_a32_a32_a32 %vreg20, %vreg4, pred:0,
pred:%noreg, pred:0, %CCReg<imp-def,dead>; aN32_0_7:%vreg22,%vreg20,%vreg4
816B %vreg23<def> = xor_a32_a32_a32 %vreg21, %vreg3, pred:0,
pred:%noreg, pred:0, %CCReg<imp-def,dead>; aN32_0_7:%vreg23,%vreg21,%vreg3
832B ADJCALLSTACKDOWN 4, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>
848B push_any32 %vreg22, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>; aN32_0_7:%vreg22
864B push_any32 %vreg23, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>; aN32_0_7:%vreg23
880B %a0_32<def> = COPY %vreg19; aN32_0_7:%vreg19
896B callr <ga:@f2>, pred:0, pred:%noreg, pred:0, %a0_32,
%a0_40<imp-def,dead>, %a1_40<imp-def,dead>, %sp<imp-def>,
%CCReg<imp-def,dead>, %cuc<imp-def,dead>, %sp<imp-use>, %cuc<imp-use>, ...
912B ADJCALLSTACKUP 4, 0, pred:0, pred:%noreg, pred:0,
%sp<imp-def,dead>, %sp<imp-use>
928B %vreg25<def> = mv16Sym_noLo <ga:@g>, pred:0, pred:%noreg, pred:0,
%ac0<imp-use>, %ac1<imp-use>; rN:%vreg25
944B mv_a32_r16_rmod1 %vreg19, %vreg25, pred:0, pred:%noreg, pred:0,
%dc0<imp-use>, %dc1<imp-use>, %ac0<imp-use>, %ac1<imp-use>;
mem:ST2[@g](align=1) aN32_0_7:%vreg19 rN:%vreg25
960B brr_uncond <BB#5>
Successors according to CFG: BB#5(?%)
# End machine code for function f3.
Notice that vreg29 is defined in a few places, and then it's copied to
vreg7, which is then copied back to vreg29. Stupid useless code, but as
far as I can tell the live ranges are ok here.
Then the coalescer says
bb13:
624B %vreg7<def> = COPY %vreg29; aN32_rN_Pairs:%vreg7,%vreg29
Considering merging to aN32_rN_Pairs with %vreg7 in %vreg29
RHS = %vreg7 [624r,688r:0) 0 at 624r
LHS = %vreg29
[144r,176B:3)[336B,448B:0)[576r,608B:1)[608B,624r:2)[688r,720B:4)
0 at 336B-phi 1 at 576r 2 at 608B-phi 3 at 144r 4 at 688r
merge %vreg7:0 at 624r into %vreg29:2 at 608B --> @608B
merge %vreg29:4 at 688r into %vreg7:0 at 624r --> @608B
erased: 688r %vreg29<def> = COPY %vreg7; aN32_rN_Pairs:%vreg29,%vreg7
erased: 624r %vreg7<def> = COPY %vreg29; aN32_rN_Pairs:%vreg7,%vreg29
Success: %vreg7 -> %vreg29
Result = %vreg29 [144r,176B:3)[336B,448B:0)[576r,608B:1)[608B,720B:2)
0 at 336B-phi 1 at 576r 2 at 608B-phi 3 at 144r
and the code now looks like:
********** INTERVALS **********
a0h
[0B,16r:0)[480r,512r:1)[656r,656d:5)[752r,768r:2)[880r,896r:3)[896r,896d:4)
0 at 0B-phi 1 at 480r 2 at 752r 3 at 880r 4 at 896r 5 at 656r
%vreg0 [64r,144r:0)[176B,208r:0) 0 at 64r
%vreg1 [224r,336B:0)[720B,976B:0) 0 at 224r
%vreg2 [208r,336B:0)[720B,976B:0) 0 at 208r
%vreg3 [304r,336B:0)[720B,976B:0) 0 at 304r
%vreg4 [288r,336B:0)[720B,976B:0) 0 at 288r
%vreg6 [560r,576r:0) 0 at 560r
%vreg8 [16r,32r:0) 0 at 16r
%vreg9 [32r,128B:0)[176B,240r:0) 0 at 32r
%vreg10 [48r,64r:0) 0 at 48r
%vreg11 [80r,96r:0) 0 at 80r
%vreg12 [368r,400r:0) 0 at 368r
%vreg13 [512r,528r:0) 0 at 512r
%vreg14 [544r,560r:0) 0 at 544r
%vreg16 [192r,224r:0) 0 at 192r
%vreg17 [256r,288r:0) 0 at 256r
%vreg18 [272r,304r:0) 0 at 272r
%vreg19 [736r,944r:0) 0 at 736r
%vreg20 [768r,800r:0) 0 at 768r
%vreg21 [784r,816r:0) 0 at 784r
%vreg22 [800r,848r:0) 0 at 800r
%vreg23 [816r,864r:0) 0 at 816r
%vreg25 [928r,944r:0) 0 at 928r
%vreg26 [240r,256r:0) 0 at 240r
%vreg27 [528r,544r:0) 0 at 528r
%vreg29 [144r,176B:3)[336B,448B:0)[576r,608B:1)[608B,720B:2) 0 at 336B-phi
1 at 576r 2 at 608B-phi 3 at 144r
RegMasks:
********** MACHINEINSTRS **********
# Machine code for function f3 (#0): Properties: <Post SSA, tracking
liveness, HasVRegs>
Function Live Ins: %a0h in %vreg8
0B BB#0: derived from LLVM BB %0
Live Ins: %a0h
16B %vreg8<def> = COPY %a0h; aNh_0_7:%vreg8
32B %vreg9<def> = COPY %vreg8; aNh_0_7:%vreg9,%vreg8
48B %vreg10<def> = mv32Imm_pseudo 32768, pred:0, pred:%noreg, pred:0,
%ac0<imp-use>, %ac1<imp-use>; aN32_0_7:%vreg10
64B %vreg0<def> = COPY %vreg10; aN32_0_7:%vreg0,%vreg10
80B %vreg11<def> = mv_0_ar16_noLo 0, pred:0, pred:%noreg, pred:0,
%ac0<imp-use>, %ac1<imp-use>; aNh_0_7:%vreg11
96B cmp_nimm16_a16 %vreg11, 0, pred:0, pred:%noreg, pred:0,
%CCReg<imp-def>; aNh_0_7:%vreg11
112B brr_cond <BB#1>, pred:2, pred:%CCReg<kill>, pred:0
Successors according to CFG: BB#1 BB#6
128B BB#6:
Predecessors according to CFG: BB#0
144B %vreg29<def> = COPY %vreg0; aN32_rN_Pairs:%vreg29 aN32_0_7:%vreg0
160B brr_uncond <BB#2>
Successors according to CFG: BB#2(?%)
176B BB#1: derived from LLVM BB %bb24.preheader
Predecessors according to CFG: BB#0
192B %vreg16<def> = shfts_a32_nimm7_a32 %vreg0, -31, pred:0,
pred:%noreg, pred:0, %CCReg<imp-def,dead>, %cuc<imp-use>;
aN32_0_7:%vreg16,%vreg0
208B %vreg2<def> = COPY %vreg0; aN32_0_7:%vreg2,%vreg0
224B %vreg1<def> = COPY %vreg16; aN32_0_7:%vreg1,%vreg16
240B %vreg26<def> = mv_ar16_ar16_lo16In32 %vreg9, pred:0, pred:%noreg,
pred:0, %ac0<imp-use>, %ac1<imp-use>; aN32_0_7:%vreg26 aNh_0_7:%vreg9
256B %vreg17<def> = COPY %vreg26; aN32_0_7:%vreg17,%vreg26
272B %vreg18<def> = shfts_a32_nimm7_a32 %vreg17, -31, pred:0,
pred:%noreg, pred:0, %CCReg<imp-def,dead>, %cuc<imp-use>;
aN32_0_7:%vreg18,%vreg17
288B %vreg4<def> = COPY %vreg17; aN32_0_7:%vreg4,%vreg17
304B %vreg3<def> = COPY %vreg18; aN32_0_7:%vreg3,%vreg18
320B brr_uncond <BB#5>
Successors according to CFG: BB#5(?%)
336B BB#2: derived from LLVM BB %bb2
Predecessors according to CFG: BB#4 BB#6
368B %vreg12<def> = mv_0_ar16_noLo 0, pred:0, pred:%noreg, pred:0,
%ac0<imp-use>, %ac1<imp-use>; aNh_0_7:%vreg12
400B cmp_nimm16_a16 %vreg12, 0, pred:0, pred:%noreg, pred:0,
%CCReg<imp-def>; aNh_0_7:%vreg12
416B brr_cond <BB#4>, pred:3, pred:%CCReg<kill>, pred:0
432B brr_uncond <BB#3>
Successors according to CFG: BB#4 BB#3
448B BB#3: derived from LLVM BB %bb3
Predecessors according to CFG: BB#2
464B ADJCALLSTACKDOWN 0, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>
480B callr <ga:@f1>, pred:0, pred:%noreg, pred:0, %a0_40<imp-def,dead>,
%a1_40<imp-def,dead>, %sp<imp-def>, %CCReg<imp-def,dead>,
%cuc<imp-def,dead>, %sp<imp-use>, %cuc<imp-use>, %a0h<imp-def>, ...
496B ADJCALLSTACKUP 0, 0, pred:0, pred:%noreg, pred:0,
%sp<imp-def,dead>, %sp<imp-use>
512B %vreg13<def> = COPY %a0h; aNh_0_7:%vreg13
528B %vreg27<def> = mv_ar16_ar16_lo16In32 %vreg13, pred:0, pred:%noreg,
pred:0, %ac0<imp-use>, %ac1<imp-use>; aN32_0_7:%vreg27 aNh_0_7:%vreg13
544B %vreg14<def> = COPY %vreg27; aN32_0_7:%vreg14,%vreg27
560B %vreg6<def> = COPY %vreg14; aN32_rN_Pairs:%vreg6 aN32_0_7:%vreg14
576B %vreg29<def> = COPY %vreg6; aN32_rN_Pairs:%vreg29,%vreg6
592B brr_uncond <BB#4>
Successors according to CFG: BB#4(?%)
608B BB#4: derived from LLVM BB %bb13
Predecessors according to CFG: BB#2 BB#3
640B ADJCALLSTACKDOWN 0, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>
656B callr <ga:@f1>, pred:0, pred:%noreg, pred:0, %a0_40<imp-def,dead>,
%a1_40<imp-def,dead>, %sp<imp-def>, %CCReg<imp-def,dead>,
%cuc<imp-def,dead>, %sp<imp-use>, %cuc<imp-use>, ...
672B ADJCALLSTACKUP 0, 0, pred:0, pred:%noreg, pred:0,
%sp<imp-def,dead>, %sp<imp-use>
704B brr_uncond <BB#2>
Successors according to CFG: BB#2(?%)
720B BB#5: derived from LLVM BB %bb24
Predecessors according to CFG: BB#1 BB#5
736B %vreg19<def> = clearAcc32 pred:0, pred:%noreg, pred:0;
aN32_0_7:%vreg19
752B libcall_CRT_ll_div_r %vreg19, %vreg19, %vreg2, %vreg1,
%a0_32<imp-def>, %a1_32<imp-def>, %CCReg<imp-def,dead>, ...;
aN32_0_7:%vreg19,%vreg2,%vreg1
768B %vreg20<def> = COPY %a0_32; aN32_0_7:%vreg20
784B %vreg21<def> = COPY %a1_32<kill>; aN32_0_7:%vreg21
800B %vreg22<def> = xor_a32_a32_a32 %vreg20, %vreg4, pred:0,
pred:%noreg, pred:0, %CCReg<imp-def,dead>; aN32_0_7:%vreg22,%vreg20,%vreg4
816B %vreg23<def> = xor_a32_a32_a32 %vreg21, %vreg3, pred:0,
pred:%noreg, pred:0, %CCReg<imp-def,dead>; aN32_0_7:%vreg23,%vreg21,%vreg3
832B ADJCALLSTACKDOWN 4, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>
848B push_any32 %vreg22, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>; aN32_0_7:%vreg22
864B push_any32 %vreg23, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>; aN32_0_7:%vreg23
880B %a0_32<def> = COPY %vreg19; aN32_0_7:%vreg19
896B callr <ga:@f2>, pred:0, pred:%noreg, pred:0, %a0_32,
%a0_40<imp-def,dead>, %a1_40<imp-def,dead>, %sp<imp-def>,
%CCReg<imp-def,dead>, %cuc<imp-def,dead>, %sp<imp-use>, %cuc<imp-use>, ...
912B ADJCALLSTACKUP 4, 0, pred:0, pred:%noreg, pred:0,
%sp<imp-def,dead>, %sp<imp-use>
928B %vreg25<def> = mv16Sym_noLo <ga:@g>, pred:0, pred:%noreg, pred:0,
%ac0<imp-use>, %ac1<imp-use>; rN:%vreg25
944B mv_a32_r16_rmod1 %vreg19, %vreg25, pred:0, pred:%noreg, pred:0,
%dc0<imp-use>, %dc1<imp-use>, %ac0<imp-use>, %ac1<imp-use>;
mem:ST2[@g](align=1) aN32_0_7:%vreg19 rN:%vreg25
960B brr_uncond <BB#5>
Successors according to CFG: BB#5(?%)
# End machine code for function f3.
Since the only use of vreg29 before the coalescing was the copy to
vreg7, and now vreg7 and vreg29 have been merged, there are no uses left
of vreg29!
But the live range is still
%vreg29 [144r,176B:3)[336B,448B:0)[576r,608B:1)[608B,720B:2) 0 at 336B-phi
1 at 576r 2 at 608B-phi 3 at 144r
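To make the "no uses left" observation concrete, here is a minimal sketch in plain C++ that models only the four COPYs touching vreg7 and vreg29 in the dump above. ToyInstr, coalesce, countUses and vreg29Example are hypothetical names made up for this illustration; this is not the LLVM API and not how RegisterCoalescer is implemented.

```cpp
#include <cassert>
#include <vector>

// Toy instruction: one defined register and a list of used registers.
// Registers are plain ints (29 stands for %vreg29, and so on).
struct ToyInstr {
  int Slot;              // slot index from the dump, e.g. 624
  int Def;               // defined register
  std::vector<int> Uses; // used registers
  bool IsCopy;           // true for COPY instructions
};

// Rename Src to Dst everywhere, then drop copies that have become
// "Dst = COPY Dst". This mimics the two "erased:" lines in the log.
std::vector<ToyInstr> coalesce(std::vector<ToyInstr> Code, int Src, int Dst) {
  std::vector<ToyInstr> Out;
  for (ToyInstr &I : Code) {
    if (I.Def == Src)
      I.Def = Dst;
    for (int &U : I.Uses)
      if (U == Src)
        U = Dst;
    bool Identity = I.IsCopy && I.Uses.size() == 1 && I.Uses[0] == I.Def;
    if (!Identity)
      Out.push_back(I);
  }
  return Out;
}

// Count how many operands read Reg.
int countUses(const std::vector<ToyInstr> &Code, int Reg) {
  int N = 0;
  for (const ToyInstr &I : Code)
    for (int U : I.Uses)
      if (U == Reg)
        ++N;
  return N;
}

// The four COPYs involving vreg7/vreg29 before the merge.
std::vector<ToyInstr> vreg29Example() {
  return {
      {144, 29, {0}, true},  // %vreg29 = COPY %vreg0
      {576, 29, {6}, true},  // %vreg29 = COPY %vreg6
      {624, 7, {29}, true},  // %vreg7  = COPY %vreg29
      {688, 29, {7}, true},  // %vreg29 = COPY %vreg7
  };
}
```

Calling coalesce(vreg29Example(), 7, 29) leaves only the defs at 144 and 576, and countUses on the result reports zero readers of register 29, so any live range extending past those two defs is stale.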
So some shrinking of the live range is needed in this case. I saw that
there is some code in the coalescer to do this:
  if (ShrinkMainRange) {
    LiveInterval &LI = LIS->getInterval(CP.getDstReg());
    shrinkToUses(&LI);
  }
but ShrinkMainRange is false all the time so that code isn't run.
However, if I force ShrinkMainRange to be true I see:
Shrink: %vreg29 [144r,176B:3)[336B,448B:0)[576r,608B:1)[608B,720B:2)
0 at 336B-phi 1 at 576r 2 at 608B-phi 3 at 144r
Dead PHI at 336B may separate interval
Dead PHI at 608B may separate interval
Shrunk: %vreg29 [144r,144d:3)[576r,576d:1) 0 at x 1 at 576r 2 at x 3 at 144r
Split 2 components: %vreg29 [144r,144d:3)[576r,576d:1) 0 at x 1 at 576r
2 at x 3 at 144r
Success: %vreg7 -> %vreg29
Result = %vreg29 [144r,144d:2) 0 at x 1 at x 2 at 144r
so the range for %vreg29 is shrunk, and also the two (now dead) defs are
split into %vreg29 and %vreg30.
I'm a little concerned about the "0 at x 1 at x" entries in %vreg29's range;
I have no idea whether they do any harm, or whether and how they should be
removed.
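As a rough illustration of what the shrinking does in this case, here is a toy sketch in plain C++. It is an assumption-laden simplification: it handles straight-line code only, ignores the CFG and PHI values entirely, and shrinkToUsesToy and ToySegment are hypothetical names, not the real LiveIntervals::shrinkToUses. Each def is extended to the last use that reads it; a def with no reader collapses to a dead def, like the [144r,144d) and [576r,576d) segments in the log.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One live segment of a register, in slot-index units.
struct ToySegment {
  int Start;    // slot of the def
  int End;      // slot of the last use reached, or Start if there is none
  bool DeadDef; // true when nothing reads the def
};

// Toy shrink for straight-line code: for each def, find the last use that
// comes after it and no later than the next def of the same register.
std::vector<ToySegment> shrinkToUsesToy(std::vector<int> Defs,
                                        std::vector<int> Uses) {
  std::sort(Defs.begin(), Defs.end());
  std::sort(Uses.begin(), Uses.end());
  std::vector<ToySegment> Segs;
  for (std::size_t I = 0; I < Defs.size(); ++I) {
    int Def = Defs[I];
    bool HasNext = I + 1 < Defs.size();
    int Next = HasNext ? Defs[I + 1] : 0;
    int LastUse = Def;
    for (int U : Uses) {
      if (U <= Def)
        continue; // a use at or before the def does not read this def
      if (HasNext && U > Next)
        break;    // later uses read a later def
      LastUse = U;
    }
    Segs.push_back({Def, LastUse, LastUse == Def});
  }
  return Segs;
}
```

With defs at 144 and 576 and an empty use list, this yields two disconnected dead-def segments, which is the situation where the real code splits the interval into two components (%vreg29 and %vreg30 above).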
The code after this merge is now
a0h
[0B,16r:0)[480r,512r:1)[656r,656d:5)[752r,768r:2)[880r,896r:3)[896r,896d:4)
0 at 0B-phi 1 at 480r 2 at 752r 3 at 880r 4 at 896r 5 at 656r
%vreg0 [64r,144r:0)[176B,208r:0) 0 at 64r
%vreg1 [224r,336B:0)[720B,976B:0) 0 at 224r
%vreg2 [208r,336B:0)[720B,976B:0) 0 at 208r
%vreg3 [304r,336B:0)[720B,976B:0) 0 at 304r
%vreg4 [288r,336B:0)[720B,976B:0) 0 at 288r
%vreg6 [560r,576r:0) 0 at 560r
%vreg8 [16r,32r:0) 0 at 16r
%vreg9 [32r,128B:0)[176B,240r:0) 0 at 32r
%vreg10 [48r,64r:0) 0 at 48r
%vreg11 [80r,96r:0) 0 at 80r
%vreg12 [368r,400r:0) 0 at 368r
%vreg13 [512r,528r:0) 0 at 512r
%vreg14 [544r,560r:0) 0 at 544r
%vreg16 [192r,224r:0) 0 at 192r
%vreg17 [256r,288r:0) 0 at 256r
%vreg18 [272r,304r:0) 0 at 272r
%vreg19 [736r,944r:0) 0 at 736r
%vreg20 [768r,800r:0) 0 at 768r
%vreg21 [784r,816r:0) 0 at 784r
%vreg22 [800r,848r:0) 0 at 800r
%vreg23 [816r,864r:0) 0 at 816r
%vreg25 [928r,944r:0) 0 at 928r
%vreg26 [240r,256r:0) 0 at 240r
%vreg27 [528r,544r:0) 0 at 528r
%vreg29 [144r,144d:2) 0 at x 1 at x 2 at 144r
%vreg30 [576r,576d:0) 0 at 576r
RegMasks:
********** MACHINEINSTRS **********
# Machine code for function f3 (#0): Properties: <Post SSA, tracking
liveness, HasVRegs>
Function Live Ins: %a0h in %vreg8
0B BB#0: derived from LLVM BB %0
Live Ins: %a0h
16B %vreg8<def> = COPY %a0h; aNh_0_7:%vreg8
32B %vreg9<def> = COPY %vreg8; aNh_0_7:%vreg9,%vreg8
48B %vreg10<def> = mv32Imm_pseudo 32768, pred:0, pred:%noreg, pred:0,
%ac0<imp-use>, %ac1<imp-use>; aN32_0_7:%vreg10
64B %vreg0<def> = COPY %vreg10; aN32_0_7:%vreg0,%vreg10
80B %vreg11<def> = mv_0_ar16_noLo 0, pred:0, pred:%noreg, pred:0,
%ac0<imp-use>, %ac1<imp-use>; aNh_0_7:%vreg11
96B cmp_nimm16_a16 %vreg11, 0, pred:0, pred:%noreg, pred:0,
%CCReg<imp-def>; aNh_0_7:%vreg11
112B brr_cond <BB#1>, pred:2, pred:%CCReg<kill>, pred:0
Successors according to CFG: BB#1 BB#6
128B BB#6:
Predecessors according to CFG: BB#0
144B %vreg29<def,dead> = COPY %vreg0; aN32_rN_Pairs:%vreg29 aN32_0_7:%vreg0
160B brr_uncond <BB#2>
Successors according to CFG: BB#2(?%)
176B BB#1: derived from LLVM BB %bb24.preheader
Predecessors according to CFG: BB#0
192B %vreg16<def> = shfts_a32_nimm7_a32 %vreg0, -31, pred:0,
pred:%noreg, pred:0, %CCReg<imp-def,dead>, %cuc<imp-use>;
aN32_0_7:%vreg16,%vreg0
208B %vreg2<def> = COPY %vreg0; aN32_0_7:%vreg2,%vreg0
224B %vreg1<def> = COPY %vreg16; aN32_0_7:%vreg1,%vreg16
240B %vreg26<def> = mv_ar16_ar16_lo16In32 %vreg9, pred:0, pred:%noreg,
pred:0, %ac0<imp-use>, %ac1<imp-use>; aN32_0_7:%vreg26 aNh_0_7:%vreg9
256B %vreg17<def> = COPY %vreg26; aN32_0_7:%vreg17,%vreg26
272B %vreg18<def> = shfts_a32_nimm7_a32 %vreg17, -31, pred:0,
pred:%noreg, pred:0, %CCReg<imp-def,dead>, %cuc<imp-use>;
aN32_0_7:%vreg18,%vreg17
288B %vreg4<def> = COPY %vreg17; aN32_0_7:%vreg4,%vreg17
304B %vreg3<def> = COPY %vreg18; aN32_0_7:%vreg3,%vreg18
320B brr_uncond <BB#5>
Successors according to CFG: BB#5(?%)
336B BB#2: derived from LLVM BB %bb2
Predecessors according to CFG: BB#4 BB#6
368B %vreg12<def> = mv_0_ar16_noLo 0, pred:0, pred:%noreg, pred:0,
%ac0<imp-use>, %ac1<imp-use>; aNh_0_7:%vreg12
400B cmp_nimm16_a16 %vreg12, 0, pred:0, pred:%noreg, pred:0,
%CCReg<imp-def>; aNh_0_7:%vreg12
416B brr_cond <BB#4>, pred:3, pred:%CCReg<kill>, pred:0
432B brr_uncond <BB#3>
Successors according to CFG: BB#4 BB#3
448B BB#3: derived from LLVM BB %bb3
Predecessors according to CFG: BB#2
464B ADJCALLSTACKDOWN 0, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>
480B callr <ga:@f1>, pred:0, pred:%noreg, pred:0, %a0_40<imp-def,dead>,
%a1_40<imp-def,dead>, %sp<imp-def>, %CCReg<imp-def,dead>,
%cuc<imp-def,dead>, %sp<imp-use>, %cuc<imp-use>, %a0h<imp-def>, ...
496B ADJCALLSTACKUP 0, 0, pred:0, pred:%noreg, pred:0,
%sp<imp-def,dead>, %sp<imp-use>
512B %vreg13<def> = COPY %a0h; aNh_0_7:%vreg13
528B %vreg27<def> = mv_ar16_ar16_lo16In32 %vreg13, pred:0, pred:%noreg,
pred:0, %ac0<imp-use>, %ac1<imp-use>; aN32_0_7:%vreg27 aNh_0_7:%vreg13
544B %vreg14<def> = COPY %vreg27; aN32_0_7:%vreg14,%vreg27
560B %vreg6<def> = COPY %vreg14; aN32_rN_Pairs:%vreg6 aN32_0_7:%vreg14
576B %vreg30<def,dead> = COPY %vreg6; aN32_rN_Pairs:%vreg30,%vreg6
592B brr_uncond <BB#4>
Successors according to CFG: BB#4(?%)
608B BB#4: derived from LLVM BB %bb13
Predecessors according to CFG: BB#2 BB#3
640B ADJCALLSTACKDOWN 0, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>
656B callr <ga:@f1>, pred:0, pred:%noreg, pred:0, %a0_40<imp-def,dead>,
%a1_40<imp-def,dead>, %sp<imp-def>, %CCReg<imp-def,dead>,
%cuc<imp-def,dead>, %sp<imp-use>, %cuc<imp-use>, ...
672B ADJCALLSTACKUP 0, 0, pred:0, pred:%noreg, pred:0,
%sp<imp-def,dead>, %sp<imp-use>
704B brr_uncond <BB#2>
Successors according to CFG: BB#2(?%)
720B BB#5: derived from LLVM BB %bb24
Predecessors according to CFG: BB#1 BB#5
736B %vreg19<def> = clearAcc32 pred:0, pred:%noreg, pred:0;
aN32_0_7:%vreg19
752B libcall_CRT_ll_div_r %vreg19, %vreg19, %vreg2, %vreg1,
%a0_32<imp-def>, %a1_32<imp-def>, %CCReg<imp-def,dead>, ...;
aN32_0_7:%vreg19,%vreg2,%vreg1
768B %vreg20<def> = COPY %a0_32; aN32_0_7:%vreg20
784B %vreg21<def> = COPY %a1_32<kill>; aN32_0_7:%vreg21
800B %vreg22<def> = xor_a32_a32_a32 %vreg20, %vreg4, pred:0,
pred:%noreg, pred:0, %CCReg<imp-def,dead>; aN32_0_7:%vreg22,%vreg20,%vreg4
816B %vreg23<def> = xor_a32_a32_a32 %vreg21, %vreg3, pred:0,
pred:%noreg, pred:0, %CCReg<imp-def,dead>; aN32_0_7:%vreg23,%vreg21,%vreg3
832B ADJCALLSTACKDOWN 4, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>
848B push_any32 %vreg22, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>; aN32_0_7:%vreg22
864B push_any32 %vreg23, pred:0, pred:%noreg, pred:0, %sp<imp-def>,
%sp<imp-use>; aN32_0_7:%vreg23
880B %a0_32<def> = COPY %vreg19; aN32_0_7:%vreg19
896B callr <ga:@f2>, pred:0, pred:%noreg, pred:0, %a0_32,
%a0_40<imp-def,dead>, %a1_40<imp-def,dead>, %sp<imp-def>,
%CCReg<imp-def,dead>, %cuc<imp-def,dead>, %sp<imp-use>, %cuc<imp-use>, ...
912B ADJCALLSTACKUP 4, 0, pred:0, pred:%noreg, pred:0,
%sp<imp-def,dead>, %sp<imp-use>
928B %vreg25<def> = mv16Sym_noLo <ga:@g>, pred:0, pred:%noreg, pred:0,
%ac0<imp-use>, %ac1<imp-use>; rN:%vreg25
944B mv_a32_r16_rmod1 %vreg19, %vreg25, pred:0, pred:%noreg, pred:0,
%dc0<imp-use>, %dc1<imp-use>, %ac0<imp-use>, %ac1<imp-use>;
mem:ST2[@g](align=1) aN32_0_7:%vreg19 rN:%vreg25
960B brr_uncond <BB#5>
Successors according to CFG: BB#5(?%)
# End machine code for function f3.
which looks reasonable to me. The whole program now compiles successfully
even if I turn off the repair thing I added previously.
So, we just need to figure out in which cases the shrinking is needed,
i.e. when to set ShrinkMainRange to true. Currently that is only done
in RegisterCoalescer::addUndefFlag, called from
RegisterCoalescer::updateRegDefsUses, so something more is needed.
I suppose the reason for only doing the shrinking sometimes is to save
compilation time? It should always be ok to run shrinkToUses (on virtual
registers), but in most cases it's not necessary so that's why it's avoided?
So in my code now I do
  // Somewhat brute solution to #10204. Always shrink live ranges for
  // virtual registers. Ideally we could identify the exact cases where
  // it's needed but for now we shrink the range for all vregs.
  if (ShrinkMainRange ||
      TargetRegisterInfo::isVirtualRegister(CP.getDstReg())) {
    LiveInterval &LI = LIS->getInterval(CP.getDstReg());
    shrinkToUses(&LI);
  }
and then it seems to work.
Thanks,
Mikael
>
> Thanks,
> Wei.
>