[llvm-dev] [ARM] Register pressure with -mthumb forces register reload before each call
Prathamesh Kulkarni via llvm-dev
llvm-dev at lists.llvm.org
Tue Apr 14 17:44:32 PDT 2020
Hi,
I have attached a WIP patch that adds foldMemoryOperand to Thumb1InstrInfo.
For the following case:
void f(int x, int y, int z)
{
  void bar(int, int, int);
  bar(x, y, z);
  bar(x, z, y);
  bar(y, x, z);
  bar(y, y, x);
}
it calls foldMemoryOperand twice, and thus converts two calls from blx to bl.
callMI->dump() shows the function name "bar" correctly; however, in the
generated assembly the call to bar is garbled
(compiled with -Oz --target=arm-linux-gnueabi -march=armv6-m):
add r7, sp, #16
mov r6, r2
mov r5, r1
mov r4, r0
bl "<90>w\n "
mov r1, r2
mov r2, r5
bl "<90>w\n "
mov r0, r5
mov r1, r4
mov r2, r6
ldr r6, .LCPI0_0
blx r6
mov r0, r5
mov r1, r5
mov r2, r4
blx r6
regalloc dump (attached) shows:
Inline spilling tGPR:%9 [80r,152r:0) 0 at 80r weight:3.209746e-03
From original %3
also spill snippet %8 [152r,232r:0) 0 at 152r weight:2.104167e-03
tBL 14, $noreg, &bar, implicit-def $lr, implicit $sp, implicit
killed $r0, implicit killed $r1, implicit killed $r2
folded: 144r tBL 14, $noreg, &"\E0\9C\06\A0\FC\7F",
implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed
$r1, implicit killed $r2 :: (load 4 from constant-pool)
remat: 228r %10:tgpr = tLDRpci %const.0, 14, $noreg ::
(load 4 from constant-pool)
232e %7:tgpr = COPY killed %10:tgpr
Could you please point out what I am doing wrong in the patch?
Also, I guess it only converted two calls to bl because further
spilling wasn't necessary.
However, for the above case, IIUC, we would want all calls to be converted to bl?
Since,
4 bl == 16 bytes
2 bl + 2 blx + 1 ldr == 2 * 4 (bl) + 2 * 2 (blx) + 1 * 2 (ldr) + 4
bytes (litpool) == 18 bytes
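
To make the arithmetic above easy to re-check, here is a minimal sketch of the size model in Python. It is only an illustration: the helper name call_bytes and the constant names are made up for this example and are not part of LLVM or of the patch. It assumes the Thumb-1 sizes used in this thread (4-byte bl, 2-byte blx, 2-byte ldr, 4-byte literal-pool entry).

```python
# Thumb-1 sizes in bytes, as used in the estimates in this thread.
BL_BYTES = 4        # direct call: bl <symbol>
BLX_BYTES = 2       # indirect call: blx <reg>
LDR_BYTES = 2       # pc-relative load of the callee address
LITPOOL_BYTES = 4   # one literal-pool entry holding the callee address


def call_bytes(direct_calls, indirect_calls, address_loads):
    """Estimate the bytes spent on calls to a single callee.

    Hypothetical helper for illustration only. The literal-pool entry is
    counted once, and only if at least one call is indirect.
    """
    size = direct_calls * BL_BYTES
    if indirect_calls:
        size += (indirect_calls * BLX_BYTES
                 + address_loads * LDR_BYTES
                 + LITPOOL_BYTES)
    return size


all_direct = call_bytes(direct_calls=4, indirect_calls=0, address_loads=0)
mixed = call_bytes(direct_calls=2, indirect_calls=2, address_loads=1)
print(all_direct, mixed)  # 16 18
```

The same helper gives 12 bytes for three blx calls sharing one ldr, and 16 bytes when the ldr is repeated before each of the three calls, which matches the figures in John's reply quoted below.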
Thanks,
Prathamesh
On Fri, 10 Apr 2020 at 04:22, Prathamesh Kulkarni
<prathamesh.kulkarni at linaro.org> wrote:
>
> Hi John,
> Thanks for the suggestions! I will start looking at adding
> foldMemoryOperand to ARMInstrInfo.
>
> Thanks,
> Prathamesh
>
> On Tue, 7 Apr 2020 at 23:55, John Brawn <John.Brawn at arm.com> wrote:
> >
> > If I'm understanding what's going on in this test correctly, what's happening is:
> > * ARMTargetLowering::LowerCall prefers indirect calls when a function is called at least 3 times in minsize
> > * In thumb 1 (without -fno-omit-frame-pointer) we have effectively only 3 callee-saved registers (r4-r6)
> > * The function has three arguments, so those three plus the register we need to hold the function address is more than our callee-saved registers
> > * Therefore something needs to be spilt
> > * The function address can be rematerialized, so we spill that and insert an LDR before each call
> >
> > If we didn't have this spilling happening (e.g. if the function had one fewer argument) then the code size of using BL vs BLX would be:
> > * BL: 3*4-byte BL = 12 bytes
> > * BLX: 3*2-byte BLX + 1*2-byte LDR + 4-byte litpool = 12 bytes
> > (So maybe even not considering spilling, LowerCall should be adjusted to do this for functions called 4 or more times)
> >
> > When we have to spill, if we compare spilling the function's address vs spilling an argument:
> > * BLX with spilt fn: 3*2-byte BLX + 3*2-byte LDR + 4-byte litpool = 16 bytes
> > * BLX with spilt arg: 3*2-byte BLX + 1*2-byte LDR + 4-byte litpool + 1*2-byte STR + 2*2-byte LDR = 18 bytes
> > So just changing the spilling heuristic won't work.
> >
> > The two ways I see of fixing this:
> > * In LowerCall only prefer an indirect call if the number of integer register arguments is less than the number of callee-saved registers.
> > * When the load of the function address is spilled, instead of just rematerializing the load, convert the BLX back into a BL.
> >
> > The first of these would be easier, but there will be situations where we need to use fewer than three callee-saved registers (e.g. arguments are loaded from a pointer), and there are situations where we will spill the function address for reasons entirely unrelated to the function arguments (e.g. if we have enough live local variables).
> >
> > For the second, looking at InlineSpiller.cpp, it does have the concept of rematerializing by folding a memory operand into another instruction, so I think we could make use of that to do this. It looks like it would involve adding a foldMemoryOperand function to ARMInstrInfo and then having this fold an LDR into a BLX by turning it into a BL.
> >
> > John
> >
> > ________________________________
> > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Prathamesh Kulkarni via llvm-dev <llvm-dev at lists.llvm.org>
> > Sent: 07 April 2020 21:07
> > To: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>
> > Subject: Re: [llvm-dev] [ARM] Register pressure with -mthumb forces register reload before each call
> >
> > On Tue, 31 Mar 2020 at 22:03, Prathamesh Kulkarni
> > <prathamesh.kulkarni at linaro.org> wrote:
> > >
> > > Hi,
> > > > Compiling the attached test-case, which is a reduced version of
> > > > uECC_shared_secret from the tinycrypt library [1], with
> > > > --target=arm-linux-gnueabi -march=armv6-m -Oz -S
> > > > results in the register holding the function's address being reloaded before
> > > > every call to blx:
> > >
> > > ldr r3, .LCPI0_0
> > > blx r3
> > > mov r0, r6
> > > mov r1, r5
> > > mov r2, r4
> > > ldr r3, .LCPI0_0
> > > blx r3
> > > ldr r3, .LCPI0_0
> > > mov r0, r6
> > > mov r1, r5
> > > mov r2, r4
> > > blx r3
> > >
> > > .LCPI0_0:
> > > .long foo
> > >
> > > > From the regalloc dump (attached), AFAIU, what seems to happen in the
> > > > greedy allocator is that all virt regs %0 to %3 are live across the first two
> > > > calls to foo. Thus %0, %1 and %2 get assigned r6, r5 and r4
> > > > respectively, and %3, which holds foo's address, doesn't have any
> > > > register left.
> > > > Since its live range has the least weight, it does not evict any existing interval,
> > > > and gets split. Eventually we have the following allocation:
> > >
> > > [%0 -> $r6] tGPR
> > > [%1 -> $r5] tGPR
> > > [%2 -> $r4] tGPR
> > > [%6 -> $r3] tGPR
> > > [%11 -> $r3] tGPR
> > > [%16 -> $r3] tGPR
> > > [%17 -> $r3] tGPR
> > >
> > > > where %6, %11, %16 and %17 are all derived from %3.
> > > > And since r3 is a call-clobbered register, the compiler is forced to
> > > > reload foo's address each time before blx.
> > >
> > > > To fix this, I thought of the following approaches:
> > > > (a) Disable the heuristic in ARMTargetLowering::LowerCall that prefers an
> > > > indirect call when there are at least 3 calls to the same function in a
> > > > basic block, for the Thumb-1 ISA.
> > > >
> > > > (b) In ARMTargetLowering::LowerCall, add another constraint, such as the
> > > > number of arguments, as a proxy for register pressure for Thumb-1, but
> > > > that's bound to trip up other cases.
> > > >
> > > > (c) Give higher priority to allocating the virt reg used for indirect calls?
> > > > However, if that results in spilling of some other register, it would
> > > > defeat the purpose of saving code-size. I suppose ideally we want to
> > > > trigger the heuristic of using an indirect call only when we know
> > > > beforehand that it will not result in spilling. But I am not sure if it's
> > > > possible to estimate that during isel?
> > >
> > > I would be grateful for suggestions on how to proceed further.
> > ping ?
> >
> > Thanks,
> > Prathamesh
> > >
> > > [1] https://github.com/intel/tinycrypt/blob/master/lib/source/ecc_dh.c#L139
> > >
> > > Thanks,
> > > Prathamesh
-------------- next part --------------
Computing live-in reg-units in ABI blocks.
0B %bb.0 R0#0 R1#0 R2#0
Created 3 new intervals.
********** INTERVALS **********
R0 [0B,48r:0)[96r,144r:4)[192r,240r:3)[288r,336r:2)[384r,432r:1) 0 at 0B-phi 1 at 384r 2 at 288r 3 at 192r 4 at 96r
R1 [0B,32r:0)[112r,144r:4)[208r,240r:3)[304r,336r:2)[400r,432r:1) 0 at 0B-phi 1 at 400r 2 at 304r 3 at 208r 4 at 112r
R2 [0B,16r:0)[128r,144r:4)[224r,240r:3)[320r,336r:2)[416r,432r:1) 0 at 0B-phi 1 at 416r 2 at 320r 3 at 224r 4 at 128r
%0 [48r,416r:0) 0 at 48r weight:0.000000e+00
%1 [32r,400r:0) 0 at 32r weight:0.000000e+00
%2 [16r,320r:0) 0 at 16r weight:0.000000e+00
%3 [80r,432r:0) 0 at 80r weight:0.000000e+00
RegMasks: 144r 240r 336r 432r
********** MACHINEINSTRS **********
# Machine code for function f: NoPHIs, TracksLiveness
Constant Pool:
cp#0: @bar, align=4
Function Live Ins: $r0 in %0, $r1 in %1, $r2 in %2
0B bb.0.entry:
liveins: $r0, $r1, $r2
16B %2:tgpr = COPY $r2
32B %1:tgpr = COPY $r1
48B %0:tgpr = COPY $r0
64B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
80B %3:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
96B $r0 = COPY %0:tgpr
112B $r1 = COPY %1:tgpr
128B $r2 = COPY %2:tgpr
144B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
160B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
176B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
192B $r0 = COPY %0:tgpr
208B $r1 = COPY %2:tgpr
224B $r2 = COPY %1:tgpr
240B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
256B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
272B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
288B $r0 = COPY %1:tgpr
304B $r1 = COPY %0:tgpr
320B $r2 = COPY %2:tgpr
336B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
352B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
368B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
384B $r0 = COPY %1:tgpr
400B $r1 = COPY %1:tgpr
416B $r2 = COPY %0:tgpr
432B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
448B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
464B tBX_RET 14, $noreg
# End machine code for function f.
********** SIMPLE REGISTER COALESCING **********
********** Function: f
********** JOINING INTERVALS ***********
entry:
16B %2:tgpr = COPY $r2
Considering merging %2 with $r2
Can only merge into reserved registers.
32B %1:tgpr = COPY $r1
Considering merging %1 with $r1
Can only merge into reserved registers.
48B %0:tgpr = COPY $r0
Considering merging %0 with $r0
Can only merge into reserved registers.
96B $r0 = COPY %0:tgpr
Considering merging %0 with $r0
Can only merge into reserved registers.
112B $r1 = COPY %1:tgpr
Considering merging %1 with $r1
Can only merge into reserved registers.
128B $r2 = COPY %2:tgpr
Considering merging %2 with $r2
Can only merge into reserved registers.
192B $r0 = COPY %0:tgpr
Considering merging %0 with $r0
Can only merge into reserved registers.
208B $r1 = COPY %2:tgpr
Considering merging %2 with $r1
Can only merge into reserved registers.
224B $r2 = COPY %1:tgpr
Considering merging %1 with $r2
Can only merge into reserved registers.
288B $r0 = COPY %1:tgpr
Considering merging %1 with $r0
Can only merge into reserved registers.
304B $r1 = COPY %0:tgpr
Considering merging %0 with $r1
Can only merge into reserved registers.
320B $r2 = COPY %2:tgpr
Considering merging %2 with $r2
Can only merge into reserved registers.
384B $r0 = COPY %1:tgpr
Considering merging %1 with $r0
Can only merge into reserved registers.
400B $r1 = COPY %1:tgpr
Considering merging %1 with $r1
Can only merge into reserved registers.
416B $r2 = COPY %0:tgpr
Considering merging %0 with $r2
Can only merge into reserved registers.
96B $r0 = COPY %0:tgpr
Considering merging %0 with $r0
Can only merge into reserved registers.
112B $r1 = COPY %1:tgpr
Considering merging %1 with $r1
Can only merge into reserved registers.
128B $r2 = COPY %2:tgpr
Considering merging %2 with $r2
Can only merge into reserved registers.
192B $r0 = COPY %0:tgpr
Considering merging %0 with $r0
Can only merge into reserved registers.
208B $r1 = COPY %2:tgpr
Considering merging %2 with $r1
Can only merge into reserved registers.
224B $r2 = COPY %1:tgpr
Considering merging %1 with $r2
Can only merge into reserved registers.
288B $r0 = COPY %1:tgpr
Considering merging %1 with $r0
Can only merge into reserved registers.
304B $r1 = COPY %0:tgpr
Considering merging %0 with $r1
Can only merge into reserved registers.
320B $r2 = COPY %2:tgpr
Considering merging %2 with $r2
Can only merge into reserved registers.
384B $r0 = COPY %1:tgpr
Considering merging %1 with $r0
Can only merge into reserved registers.
400B $r1 = COPY %1:tgpr
Considering merging %1 with $r1
Can only merge into reserved registers.
416B $r2 = COPY %0:tgpr
Considering merging %0 with $r2
Can only merge into reserved registers.
Trying to inflate 0 regs.
********** INTERVALS **********
R0 [0B,48r:0)[96r,144r:4)[192r,240r:3)[288r,336r:2)[384r,432r:1) 0 at 0B-phi 1 at 384r 2 at 288r 3 at 192r 4 at 96r
R1 [0B,32r:0)[112r,144r:4)[208r,240r:3)[304r,336r:2)[400r,432r:1) 0 at 0B-phi 1 at 400r 2 at 304r 3 at 208r 4 at 112r
R2 [0B,16r:0)[128r,144r:4)[224r,240r:3)[320r,336r:2)[416r,432r:1) 0 at 0B-phi 1 at 416r 2 at 320r 3 at 224r 4 at 128r
%0 [48r,416r:0) 0 at 48r weight:0.000000e+00
%1 [32r,400r:0) 0 at 32r weight:0.000000e+00
%2 [16r,320r:0) 0 at 16r weight:0.000000e+00
%3 [80r,432r:0) 0 at 80r weight:0.000000e+00
RegMasks: 144r 240r 336r 432r
********** MACHINEINSTRS **********
# Machine code for function f: NoPHIs, TracksLiveness
Constant Pool:
cp#0: @bar, align=4
Function Live Ins: $r0 in %0, $r1 in %1, $r2 in %2
0B bb.0.entry:
liveins: $r0, $r1, $r2
16B %2:tgpr = COPY $r2
32B %1:tgpr = COPY $r1
48B %0:tgpr = COPY $r0
64B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
80B %3:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
96B $r0 = COPY %0:tgpr
112B $r1 = COPY %1:tgpr
128B $r2 = COPY %2:tgpr
144B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
160B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
176B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
192B $r0 = COPY %0:tgpr
208B $r1 = COPY %2:tgpr
224B $r2 = COPY %1:tgpr
240B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
256B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
272B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
288B $r0 = COPY %1:tgpr
304B $r1 = COPY %0:tgpr
320B $r2 = COPY %2:tgpr
336B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
352B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
368B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
384B $r0 = COPY %1:tgpr
400B $r1 = COPY %1:tgpr
416B $r2 = COPY %0:tgpr
432B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
448B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
464B tBX_RET 14, $noreg
# End machine code for function f.
********** GREEDY REGISTER ALLOCATION **********
********** Function: f
********** INTERVALS **********
R0 [0B,48r:0)[96r,144r:4)[192r,240r:3)[288r,336r:2)[384r,432r:1) 0 at 0B-phi 1 at 384r 2 at 288r 3 at 192r 4 at 96r
R1 [0B,32r:0)[112r,144r:4)[208r,240r:3)[304r,336r:2)[400r,432r:1) 0 at 0B-phi 1 at 400r 2 at 304r 3 at 208r 4 at 112r
R2 [0B,16r:0)[128r,144r:4)[224r,240r:3)[320r,336r:2)[416r,432r:1) 0 at 0B-phi 1 at 416r 2 at 320r 3 at 224r 4 at 128r
%0 [48r,416r:0) 0 at 48r weight:6.575521e-03
%1 [32r,400r:0) 0 at 32r weight:7.890625e-03
%2 [16r,320r:0) 0 at 16r weight:5.738636e-03
%3 [80r,432r:0) 0 at 80r weight:3.324468e-03
RegMasks: 144r 240r 336r 432r
********** MACHINEINSTRS **********
# Machine code for function f: NoPHIs, TracksLiveness
Constant Pool:
cp#0: @bar, align=4
Function Live Ins: $r0 in %0, $r1 in %1, $r2 in %2
0B bb.0.entry:
liveins: $r0, $r1, $r2
16B %2:tgpr = COPY $r2
32B %1:tgpr = COPY $r1
48B %0:tgpr = COPY $r0
64B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
80B %3:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
96B $r0 = COPY %0:tgpr
112B $r1 = COPY %1:tgpr
128B $r2 = COPY %2:tgpr
144B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
160B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
176B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
192B $r0 = COPY %0:tgpr
208B $r1 = COPY %2:tgpr
224B $r2 = COPY %1:tgpr
240B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
256B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
272B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
288B $r0 = COPY %1:tgpr
304B $r1 = COPY %0:tgpr
320B $r2 = COPY %2:tgpr
336B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
352B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
368B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
384B $r0 = COPY %1:tgpr
400B $r1 = COPY %1:tgpr
416B $r2 = COPY %0:tgpr
432B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
448B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
464B tBX_RET 14, $noreg
# End machine code for function f.
selectOrSplit tGPR:%0 [48r,416r:0) 0 at 48r weight:6.575521e-03 w=6.575521e-03
AllocationOrder(tGPR) = [ $r0 $r1 $r2 $r3 $r4 $r5 $r6 ]
hints: $r0 $r1 $r2
missed hint $r0
assigning %0 to $r4: R4 [48r,416r:0) 0 at 48r
selectOrSplit tGPR:%1 [32r,400r:0) 0 at 32r weight:7.890625e-03 w=7.890625e-03
hints: $r1 $r0 $r2
missed hint $r1
assigning %1 to $r5: R5 [32r,400r:0) 0 at 32r
selectOrSplit tGPR:%2 [16r,320r:0) 0 at 16r weight:5.738636e-03 w=5.738636e-03
hints: $r2 $r1
missed hint $r2
assigning %2 to $r6: R6 [16r,320r:0) 0 at 16r
selectOrSplit tGPR:%3 [80r,432r:0) 0 at 80r weight:3.324468e-03 w=3.324468e-03
RS_Assign Cascade 0
wait for second round
queuing new interval: %3 [80r,432r:0) 0 at 80r weight:3.324468e-03
selectOrSplit tGPR:%3 [80r,432r:0) 0 at 80r weight:3.324468e-03 w=3.324468e-03
RS_Split Cascade 0
Analyze counted 5 instrs in 1 blocks, through 0 blocks.
tryLocalSplit: 80r 144r 240r 336r 432r
4 regmasks in block: 144r:80r-144r 144r:144r-240r 240r:240r-336r 336r:336r-432r
$r0 80r-144r i=INF extend
$r0 144r-240r i=INF extend
$r0 240r-336r i=INF extend
$r0 336r-432r i=INF end
$r1 80r-144r i=INF extend
$r1 144r-240r i=INF extend
$r1 240r-336r i=INF extend
$r1 336r-432r i=INF end
$r2 80r-144r i=INF extend
$r2 144r-240r i=INF extend
$r2 240r-336r i=INF extend
$r2 336r-432r i=INF end
$r3 80r-144r i=INF extend
$r3 144r-240r i=INF extend
$r3 240r-336r i=INF extend
$r3 336r-432r i=INF end
$r4 80r-144r i=6.575521e-03 w=6.250000e-03 extend
$r4 144r-240r i=6.575521e-03 w=7.575758e-03 (best) extend
$r4 144r-336r i=6.575521e-03 w=8.012821e-03 (best) extend
$r4 144r-432r i=6.575521e-03 w=7.102273e-03 end
$r5 80r-144r i=7.890625e-03 w=6.250000e-03 extend
$r5 144r-240r i=7.890625e-03 w=7.575758e-03 extend
$r5 240r-336r i=7.890625e-03 w=7.575758e-03 extend
$r5 336r-432r i=7.890625e-03 w=5.859375e-03 end
$r6 80r-144r i=5.738636e-03 w=6.250000e-03 extend
$r6 80r-240r i=5.738636e-03 w=6.944444e-03 extend
$r6 80r-336r i=5.738636e-03 w=7.440476e-03 (best) extend
$r6 80r-432r i=5.738636e-03 all
Best local split range: 80r-336r, 1.667770e-03, 4 instrs
enterIntvBefore 80r: not live
leaveIntvAfter 336r: valno 0
useIntv [80B;344r): [80B;344r):1
blit [80r,432r:0): [80r;344r)=1(%5):0 [344r;432r)=0(%4):0
rewr %bb.0 80r:1 %5:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
rewr %bb.0 144B:1 tBLXr 14, $noreg, %5:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 240B:1 tBLXr 14, $noreg, %5:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 336B:1 tBLXr 14, $noreg, %5:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 432B:0 tBLXr 14, $noreg, %4:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 344B:1 %4:tgpr = COPY %5:tgpr
Tagging non-progress ranges: %5
queuing new interval: %4 [344r,432r:0) 0 at 344r weight:2.069672e-03
queuing new interval: %5 [80r,344r:0) 0 at 80r weight:3.802711e-03
selectOrSplit tGPR:%5 [80r,344r:0) 0 at 80r weight:3.802711e-03 w=3.802711e-03
RS_Split2 Cascade 0
Analyze counted 5 instrs in 1 blocks, through 0 blocks.
tryLocalSplit: 80r 144r 240r 336r 344r
4 regmasks in block: 144r:80r-144r 144r:144r-240r 240r:240r-336r 336r:336r-344r
$r0 80r-144r i=INF extend
$r0 144r-240r i=INF extend
$r0 240r-336r i=INF extend
$r0 336r-344r i=INF end
$r1 80r-144r i=INF extend
$r1 144r-240r i=INF extend
$r1 240r-336r i=INF extend
$r1 336r-344r i=INF end
$r2 80r-144r i=INF extend
$r2 144r-240r i=INF extend
$r2 240r-336r i=INF extend
$r2 336r-344r i=INF end
$r3 80r-144r i=INF extend
$r3 144r-240r i=INF extend
$r3 240r-336r i=INF extend
$r3 336r-344r i=INF end
$r4 80r-144r i=6.575521e-03 w=6.250000e-03 extend
$r4 144r-240r i=6.575521e-03 w=7.575758e-03 (best) extend
$r4 144r-336r i=6.575521e-03 shrink
$r4 240r-336r i=6.575521e-03 w=7.575758e-03 (best) extend
$r4 240r-344r i=6.575521e-03 w=7.692308e-03 (best) end
$r5 80r-144r i=7.890625e-03 w=6.250000e-03 extend
$r5 144r-240r i=7.890625e-03 w=7.575758e-03 extend
$r5 240r-336r i=7.890625e-03 w=7.575758e-03 extend
$r5 336r-344r i=7.890625e-03 w=7.075472e-03 end
$r6 80r-144r i=5.738636e-03 w=6.250000e-03 extend
$r6 80r-240r i=5.738636e-03 w=6.944444e-03 (best) extend
$r6 80r-336r i=5.738636e-03 shrink
$r6 144r-336r i=5.738636e-03 shrink
$r6 240r-336r i=5.738636e-03 w=7.575758e-03 (best) extend
$r6 240r-344r i=5.738636e-03 w=7.692308e-03 (best) end
Best local split range: 240r-344r, 1.914560e-03, 3 instrs
enterIntvBefore 240r: valno 0
leaveIntvAfter 344r: not live
useIntv [232r;352B): [232r;352B):1
blit [80r,344r:0): [80r;232r)=0(%6):0 [232r;344r)=1(%7):0
rewr %bb.0 80r:0 %6:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
rewr %bb.0 144B:0 tBLXr 14, $noreg, %6:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 240B:1 tBLXr 14, $noreg, %7:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 336B:1 tBLXr 14, $noreg, %7:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 344B:1 %4:tgpr = COPY %7:tgpr
rewr %bb.0 232B:0 %7:tgpr = COPY %6:tgpr
queuing new interval: %6 [80r,232r:0) 0 at 80r weight:2.744565e-03
queuing new interval: %7 [232r,344r:0) 0 at 232r weight:3.945312e-03
selectOrSplit tGPR:%6 [80r,232r:0) 0 at 80r weight:2.744565e-03 w=2.744565e-03
RS_Assign Cascade 0
wait for second round
queuing new interval: %6 [80r,232r:0) 0 at 80r weight:2.744565e-03
selectOrSplit tGPR:%7 [232r,344r:0) 0 at 232r weight:3.945312e-03 w=3.945312e-03
RS_Assign Cascade 0
wait for second round
queuing new interval: %7 [232r,344r:0) 0 at 232r weight:3.945312e-03
selectOrSplit tGPR:%4 [344r,432r:0) 0 at 344r weight:2.069672e-03 w=2.069672e-03
assigning %4 to $r3: R3 [344r,432r:0) 0 at 344r
selectOrSplit tGPR:%6 [80r,232r:0) 0 at 80r weight:2.744565e-03 w=2.744565e-03
RS_Split Cascade 0
Analyze counted 3 instrs in 1 blocks, through 0 blocks.
tryLocalSplit: 80r 144r 232r
4 regmasks in block: 144r:80r-144r 144r:144r-232r
$r0 80r-144r i=INF extend
$r0 144r-232r i=INF end
$r1 80r-144r i=INF extend
$r1 144r-232r i=INF end
$r2 80r-144r i=INF extend
$r2 144r-232r i=INF end
$r3 80r-144r i=INF extend
$r3 144r-232r i=INF end
$r4 80r-144r i=6.575521e-03 w=6.250000e-03 extend
$r4 144r-232r i=6.575521e-03 w=5.952381e-03 end
$r5 80r-144r i=7.890625e-03 w=6.250000e-03 extend
$r5 144r-232r i=7.890625e-03 w=5.952381e-03 end
$r6 80r-144r i=5.738636e-03 w=6.250000e-03 (best) extend
$r6 80r-232r i=5.738636e-03 all
Best local split range: 80r-144r, 5.011263e-04, 2 instrs
enterIntvBefore 80r: not live
leaveIntvAfter 144r: valno 0
useIntv [80B;152r): [80B;152r):1
blit [80r,232r:0): [80r;152r)=1(%9):0 [152r;232r)=0(%8):0
rewr %bb.0 80r:1 %9:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
rewr %bb.0 144B:1 tBLXr 14, $noreg, %9:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 232B:0 %7:tgpr = COPY %8:tgpr
rewr %bb.0 152B:1 %8:tgpr = COPY %9:tgpr
Tagging non-progress ranges: %9
queuing new interval: %8 [152r,232r:0) 0 at 152r weight:2.104167e-03
queuing new interval: %9 [80r,152r:0) 0 at 80r weight:3.209746e-03
selectOrSplit tGPR:%9 [80r,152r:0) 0 at 80r weight:3.209746e-03 w=3.209746e-03
RS_Split2 Cascade 0
Analyze counted 3 instrs in 1 blocks, through 0 blocks.
tryLocalSplit: 80r 144r 152r
4 regmasks in block: 144r:80r-144r 144r:144r-152r
$r0 80r-144r i=INF extend
$r0 144r-152r i=INF end
$r1 80r-144r i=INF extend
$r1 144r-152r i=INF end
$r2 80r-144r i=INF extend
$r2 144r-152r i=INF end
$r3 80r-144r i=INF extend
$r3 144r-152r i=INF end
$r4 80r-144r i=6.575521e-03 extend
$r4 144r-152r i=6.575521e-03 end
$r5 80r-144r i=7.890625e-03 extend
$r5 144r-152r i=7.890625e-03 end
$r6 80r-144r i=5.738636e-03 extend
$r6 144r-152r i=5.738636e-03 end
Inline spilling tGPR:%9 [80r,152r:0) 0 at 80r weight:3.209746e-03
From original %3
also spill snippet %8 [152r,232r:0) 0 at 152r weight:2.104167e-03
tBL 14, $noreg, &bar, implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2
folded: 144r tBL 14, $noreg, &"\E0\9C\06\A0\FC\7F", implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2 :: (load 4 from constant-pool)
remat: 228r %10:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
232e %7:tgpr = COPY killed %10:tgpr
All defs dead: dead %9:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
All defs dead: dead %8:tgpr = COPY %9:tgpr
Remat created 2 dead defs.
Deleting dead def 152r dead %8:tgpr = COPY %9:tgpr
Deleting dead def 80r dead %9:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
0 registers to spill after remat.
queuing new interval: %10 [228r,232r:0) 0 at 228r weight:INF
selectOrSplit tGPR:%10 [228r,232r:0) 0 at 228r weight:INF w=INF
assigning %10 to $r3: R3 [228r,232r:0) 0 at 228r
Dropping unused %8 EMPTY weight:2.104167e-03
selectOrSplit tGPR:%7 [232r,344r:0) 0 at 232r weight:3.945312e-03 w=3.945312e-03
hints: $r3
RS_Split Cascade 0
Analyze counted 4 instrs in 1 blocks, through 0 blocks.
tryLocalSplit: 232r 240r 336r 344r
4 regmasks in block: 240r:232r-240r 240r:240r-336r 336r:336r-344r
$r3 232r-240r i=INF extend
$r3 240r-336r i=INF extend
$r3 336r-344r i=INF end
$r0 232r-240r i=INF extend
$r0 240r-336r i=INF extend
$r0 336r-344r i=INF end
$r1 232r-240r i=INF extend
$r1 240r-336r i=INF extend
$r1 336r-344r i=INF end
$r2 232r-240r i=INF extend
$r2 240r-336r i=INF extend
$r2 336r-344r i=INF end
$r4 232r-240r i=6.575521e-03 w=7.075472e-03 (best) extend
$r4 232r-336r i=6.575521e-03 w=7.692308e-03 (best) extend
$r4 232r-344r i=6.575521e-03 all
$r5 232r-240r i=7.890625e-03 w=7.075472e-03 extend
$r5 240r-336r i=7.890625e-03 w=7.575758e-03 extend
$r5 336r-344r i=7.890625e-03 w=7.075472e-03 end
$r6 232r-240r i=5.738636e-03 w=7.075472e-03 (best) extend
$r6 232r-336r i=5.738636e-03 w=7.692308e-03 (best) extend
$r6 232r-344r i=5.738636e-03 all
Best local split range: 232r-336r, 1.914560e-03, 3 instrs
enterIntvBefore 232r: not live
leaveIntvAfter 336r: valno 0
useIntv [232B;340r): [232B;340r):1
blit [232r,344r:0): [232r;340r)=1(%13):0 [340r;344r)=0(%12):0
rewr %bb.0 232r:1 %13:tgpr = COPY %10:tgpr
rewr %bb.0 240B:1 tBLXr 14, $noreg, %13:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 336B:1 tBLXr 14, $noreg, %13:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 344B:0 %4:tgpr = COPY %12:tgpr
rewr %bb.0 340B:1 %12:tgpr = COPY %13:tgpr
Tagging non-progress ranges: %13
queuing new interval: %12 [340r,344r:0) 0 at 340r weight:INF
queuing new interval: %13 [232r,340r:0) 0 at 232r weight:3.976378e-03
selectOrSplit tGPR:%13 [232r,340r:0) 0 at 232r weight:3.976378e-03 w=3.976378e-03
hints: $r3
RS_Split2 Cascade 0
Analyze counted 4 instrs in 1 blocks, through 0 blocks.
tryLocalSplit: 232r 240r 336r 340r
4 regmasks in block: 240r:232r-240r 240r:240r-336r 336r:336r-340r
$r3 232r-240r i=INF extend
$r3 240r-336r i=INF extend
$r3 336r-340r i=INF end
$r0 232r-240r i=INF extend
$r0 240r-336r i=INF extend
$r0 336r-340r i=INF end
$r1 232r-240r i=INF extend
$r1 240r-336r i=INF extend
$r1 336r-340r i=INF end
$r2 232r-240r i=INF extend
$r2 240r-336r i=INF extend
$r2 336r-340r i=INF end
$r4 232r-240r i=6.575521e-03 w=7.075472e-03 (best) extend
$r4 232r-336r i=6.575521e-03 shrink
$r4 240r-336r i=6.575521e-03 extend
$r4 336r-340r i=6.575521e-03 w=7.142857e-03 (best) end
$r5 232r-240r i=7.890625e-03 w=7.075472e-03 extend
$r5 240r-336r i=7.890625e-03 extend
$r5 336r-340r i=7.890625e-03 w=7.142857e-03 end
$r6 232r-240r i=5.738636e-03 w=7.075472e-03 (best) extend
$r6 232r-336r i=5.738636e-03 shrink
$r6 240r-336r i=5.738636e-03 extend
$r6 336r-340r i=0.000000e+00 w=7.142857e-03 (best) end
Best local split range: 336r-340r, 6.999861e-03, 2 instrs
enterIntvBefore 336r: valno 0
leaveIntvAfter 340r: not live
useIntv [328r;344B): [328r;344B):1
blit [232r,340r:0): [232r;328r)=0(%14):0 [328r;340r)=1(%15):0
rewr %bb.0 232r:0 %14:tgpr = COPY %10:tgpr
rewr %bb.0 240B:0 tBLXr 14, $noreg, %14:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 336B:1 tBLXr 14, $noreg, %15:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 340B:1 %12:tgpr = COPY %15:tgpr
rewr %bb.0 328B:0 %15:tgpr = COPY %14:tgpr
queuing new interval: %14 [232r,328r:0) 0 at 232r weight:3.054435e-03
queuing new interval: %15 [328r,340r:0) 0 at 328r weight:3.677184e-03
selectOrSplit tGPR:%14 [232r,328r:0) 0 at 232r weight:3.054435e-03 w=3.054435e-03
hints: $r3
RS_Assign Cascade 0
wait for second round
queuing new interval: %14 [232r,328r:0) 0 at 232r weight:3.054435e-03
selectOrSplit tGPR:%12 [340r,344r:0) 0 at 340r weight:INF w=INF
hints: $r3
assigning %12 to $r3: R3 [340r,344r:0) 0 at 340r
selectOrSplit tGPR:%15 [328r,340r:0) 0 at 328r weight:3.677184e-03 w=3.677184e-03
hints: $r3
assigning %15 to $r6: R6 [328r,340r:0) 0 at 328r
selectOrSplit tGPR:%14 [232r,328r:0) 0 at 232r weight:3.054435e-03 w=3.054435e-03
hints: $r3 $r6
RS_Split Cascade 0
Analyze counted 3 instrs in 1 blocks, through 0 blocks.
tryLocalSplit: 232r 240r 328r
4 regmasks in block: 240r:232r-240r 240r:240r-328r
$r3 232r-240r i=INF extend
$r3 240r-328r i=INF end
$r6 232r-240r i=5.738636e-03 w=7.075472e-03 (best) extend
$r6 232r-328r i=5.738636e-03 all
$r0 232r-240r i=INF extend
$r0 240r-328r i=INF end
$r1 232r-240r i=INF extend
$r1 240r-328r i=INF end
$r2 232r-240r i=INF extend
$r2 240r-328r i=INF end
$r4 232r-240r i=6.575521e-03 w=7.075472e-03 extend
$r4 232r-328r i=6.575521e-03 all
$r5 232r-240r i=7.890625e-03 w=7.075472e-03 extend
$r5 240r-328r i=7.890625e-03 w=5.952381e-03 end
Best local split range: 232r-240r, 1.310072e-03, 2 instrs
enterIntvBefore 232r: not live
leaveIntvAfter 240r: valno 0
useIntv [232B;248r): [232B;248r):1
blit [232r,328r:0): [232r;248r)=1(%17):0 [248r;328r)=0(%16):0
rewr %bb.0 232r:1 %17:tgpr = COPY %10:tgpr
rewr %bb.0 240B:1 tBLXr 14, $noreg, %17:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
rewr %bb.0 328B:0 %15:tgpr = COPY %16:tgpr
rewr %bb.0 248B:1 %16:tgpr = COPY %17:tgpr
Tagging non-progress ranges: %17
queuing new interval: %16 [248r,328r:0) 0 at 248r weight:2.104167e-03
queuing new interval: %17 [232r,248r:0) 0 at 232r weight:3.641827e-03
selectOrSplit tGPR:%17 [232r,248r:0) 0 at 232r weight:3.641827e-03 w=3.641827e-03
hints: $r3
RS_Split2 Cascade 0
Analyze counted 3 instrs in 1 blocks, through 0 blocks.
tryLocalSplit: 232r 240r 248r
4 regmasks in block: 240r:232r-240r 240r:240r-248r
$r3 232r-240r i=INF extend
$r3 240r-248r i=INF end
$r0 232r-240r i=INF extend
$r0 240r-248r i=INF end
$r1 232r-240r i=INF extend
$r1 240r-248r i=INF end
$r2 232r-240r i=INF extend
$r2 240r-248r i=INF end
$r4 232r-240r i=6.575521e-03 extend
$r4 240r-248r i=6.575521e-03 end
$r5 232r-240r i=7.890625e-03 extend
$r5 240r-248r i=7.890625e-03 end
$r6 232r-240r i=5.738636e-03 extend
$r6 240r-248r i=5.738636e-03 end
Inline spilling tGPR:%17 [232r,248r:0) 0 at 232r weight:3.641827e-03
From original %3
also spill snippet %10 [228r,232r:0) 0 at 228r weight:INF
also spill snippet %16 [248r,328r:0) 0 at 248r weight:2.104167e-03
tBL 14, $noreg, &bar, implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2
folded: 240r tBL 14, $noreg, &"\E0\9C\06\A0\FC\7F", implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2 :: (load 4 from constant-pool)
remat: 324r %18:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
328e %15:tgpr = COPY killed %18:tgpr
All defs dead: dead %17:tgpr = COPY %10:tgpr
All defs dead: dead %10:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
All defs dead: dead %16:tgpr = COPY %17:tgpr
Remat created 3 dead defs.
Deleting dead def 248r dead %16:tgpr = COPY %17:tgpr
Deleting dead def 228r dead %10:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
unassigning %10 from $r3: R3
Deleting dead def 232r dead %17:tgpr = COPY %10:tgpr
Shrink: %10 EMPTY weight:INF
Shrunk: %10 EMPTY weight:INF
0 registers to spill after remat.
queuing new interval: %18 [324r,328r:0) 0 at 324r weight:INF
selectOrSplit tGPR:%18 [324r,328r:0) 0 at 324r weight:INF w=INF
hints: $r6
assigning %18 to $r6: R6 [324r,328r:0) 0 at 324r
Dropping unused %16 EMPTY weight:2.104167e-03
Dropping unused %10 EMPTY weight:INF
Trying to reconcile hints for: %0($r4)
%0($r4) is recolorable.
Trying to reconcile hints for: %1($r5)
%1($r5) is recolorable.
Trying to reconcile hints for: %2($r6)
%2($r6) is recolorable.
********** REWRITE VIRTUAL REGISTERS **********
********** Function: f
********** REGISTER MAP **********
[%0 -> $r4] tGPR
[%1 -> $r5] tGPR
[%2 -> $r6] tGPR
[%4 -> $r3] tGPR
[%12 -> $r3] tGPR
[%15 -> $r6] tGPR
[%18 -> $r6] tGPR
0B bb.0.entry:
liveins: $r0, $r1, $r2
16B %2:tgpr = COPY $r2
32B %1:tgpr = COPY $r1
48B %0:tgpr = COPY $r0
64B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
96B $r0 = COPY %0:tgpr
112B $r1 = COPY %1:tgpr
128B $r2 = COPY %2:tgpr
144B tBL 14, $noreg, &"\94p\10\09", implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2 :: (load 4 from constant-pool)
160B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
176B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
192B $r0 = COPY %0:tgpr
208B $r1 = COPY %2:tgpr
224B $r2 = COPY %1:tgpr
240B tBL 14, $noreg, &"", implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2 :: (load 4 from constant-pool)
256B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
272B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
288B $r0 = COPY %1:tgpr
304B $r1 = COPY %0:tgpr
320B $r2 = COPY killed %2:tgpr
324B %18:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
328B %15:tgpr = COPY killed %18:tgpr
336B tBLXr 14, $noreg, %15:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
340B %12:tgpr = COPY killed %15:tgpr
344B %4:tgpr = COPY killed %12:tgpr
352B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
368B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
384B $r0 = COPY %1:tgpr
400B $r1 = COPY killed %1:tgpr
416B $r2 = COPY killed %0:tgpr
432B tBLXr 14, $noreg, killed %4:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
448B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
464B tBX_RET 14, $noreg
> renamable $r6 = COPY $r2
> renamable $r5 = COPY $r1
> renamable $r4 = COPY $r0
> ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> $r0 = COPY renamable $r4
> $r1 = COPY renamable $r5
> $r2 = COPY renamable $r6
> tBL 14, $noreg, &"\06", implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2 :: (load 4 from constant-pool)
> ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> $r0 = COPY renamable $r4
> $r1 = COPY renamable $r6
> $r2 = COPY renamable $r5
> tBL 14, $noreg, &"\06", implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2 :: (load 4 from constant-pool)
> ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> $r0 = COPY renamable $r5
> $r1 = COPY renamable $r4
> $r2 = COPY killed renamable $r6
> renamable $r6 = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
> renamable $r6 = COPY killed renamable $r6
Identity copy: renamable $r6 = COPY killed renamable $r6
deleted.
> tBLXr 14, $noreg, renamable $r6, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
> renamable $r3 = COPY killed renamable $r6
> renamable $r3 = COPY killed renamable $r3
Identity copy: renamable $r3 = COPY killed renamable $r3
deleted.
> ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> $r0 = COPY renamable $r5
> $r1 = COPY killed renamable $r5
> $r2 = COPY killed renamable $r4
> tBLXr 14, $noreg, killed renamable $r3, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
> ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> tBX_RET 14, $noreg
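
The callee name in the tBL operand prints as different bytes at each dump above: &"\E0\9C\06\A0\FC\7F" when folded, then &"\94p\10\09", &"" and &"\06" later on. One possible explanation, which the dump alone cannot confirm, is that the operand holds a pointer to a name string whose storage has already been released. The sketch below is plain C++, not LLVM code; SymbolOperand, SymbolNamePool and makeCallOperand are hypothetical names used only to illustrate keeping such a name alive by copying it into longer-lived storage. In LLVM itself, MachineFunction::createExternalSymbolName appears to serve a similar purpose.

```cpp
#include <cstring>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call operand that stores only a raw pointer
// to the callee's name and does not own the characters.
struct SymbolOperand {
  const char *Name;
};

// Hypothetical pool that owns copies of symbol names, so the pointers it
// hands out stay valid for as long as the pool itself is alive.
class SymbolNamePool {
  std::vector<std::unique_ptr<char[]>> Storage;

public:
  const char *intern(const std::string &Name) {
    // Copy the characters, including the terminating NUL, into owned storage.
    auto Copy = std::make_unique<char[]>(Name.size() + 1);
    std::memcpy(Copy.get(), Name.c_str(), Name.size() + 1);
    Storage.push_back(std::move(Copy));
    return Storage.back().get();
  }
};

// Build an operand from a name that only exists as a local temporary.
// Storing Temp.c_str() directly would leave a dangling pointer once this
// function returns; copying into the pool avoids that.
SymbolOperand makeCallOperand(SymbolNamePool &Pool, const char *Callee) {
  std::string Temp(Callee);
  return SymbolOperand{Pool.intern(Temp)};
}
```

With a SymbolNamePool that outlives the operand, makeCallOperand(Pool, "bar").Name still reads as "bar" after the temporary string is gone, which is the property the folded tBL operand seems to be missing.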
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-611-2.diff
Type: text/x-patch
Size: 2620 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200415/aa64cfbd/attachment-0001.bin>