[llvm-dev] [ARM] Register pressure with -mthumb forces register reload before each call

Prathamesh Kulkarni via llvm-dev llvm-dev at lists.llvm.org
Tue Apr 14 17:44:32 PDT 2020


Hi,
I have attached a WIP patch that adds foldMemoryOperand to Thumb1InstrInfo.
For the following case:

void f(int x, int y, int z)
{
  void bar(int, int, int);

  bar(x, y, z);
  bar(x, z, y);
  bar(y, x, z);
  bar(y, y, x);
}

it calls foldMemoryOperand twice, and thus converts two calls from blx to bl.
callMI->dump() shows the function name "bar" correctly; however, in the
generated assembly the call to bar is garbled
(compiled with -Oz --target=arm-linux-gnueabi -march=armv6-m):

        add     r7, sp, #16
        mov     r6, r2
        mov     r5, r1
        mov     r4, r0
        bl      "<90>w\n        "
        mov     r1, r2
        mov     r2, r5
        bl      "<90>w\n        "
        mov     r0, r5
        mov     r1, r4
        mov     r2, r6
        ldr     r6, .LCPI0_0
        blx     r6
        mov     r0, r5
        mov     r1, r5
        mov     r2, r4
        blx     r6

regalloc dump (attached) shows:
Inline spilling tGPR:%9 [80r,152r:0)  0@80r weight:3.209746e-03
From original %3
        also spill snippet %8 [152r,232r:0)  0@152r weight:2.104167e-03
  tBL 14, $noreg, &bar, implicit-def $lr, implicit $sp, implicit
killed $r0, implicit killed $r1, implicit killed $r2
        folded:   144r  tBL 14, $noreg, &"\E0\9C\06\A0\FC\7F",
implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed
$r1, implicit killed $r2 :: (load 4 from constant-pool)
        remat:  228r    %10:tgpr = tLDRpci %const.0, 14, $noreg ::
(load 4 from constant-pool)
                232e    %7:tgpr = COPY killed %10:tgpr

Could you please point out what I am doing wrong in the patch?

Also, I guess it only converted two calls to bl because further
spilling wasn't necessary.
However, for the above case, IIUC, we would want all calls to be converted to bl,
since:
4 bl == 4 * 4 (bl) == 16 bytes
2 bl + 2 blx + 1 ldr == 2 * 4 (bl) + 2 * 2 (blx) + 1 * 2 (ldr) + 4
bytes (litpool) == 18 bytes
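
The byte counts above can be double-checked with a small sketch. It assumes the Thumb-1 sizes used in this thread (4-byte bl, 2-byte blx, 2-byte ldr, 4-byte literal-pool entry). The helper name and constants are made up for illustration and are not LLVM APIs.

```cpp
// Thumb-1 encoding sizes in bytes, as assumed in the estimate above.
constexpr int kBL = 4;       // bl <symbol>
constexpr int kBLX = 2;      // blx <reg>
constexpr int kLDR = 2;      // ldr <reg>, <literal>
constexpr int kLitPool = 4;  // one literal-pool entry holding the callee address

// Hypothetical helper: size of a call sequence made of `direct` bl calls,
// `indirect` blx calls and `loads` ldr instructions of the callee address.
// The literal-pool entry is only counted if the address is loaded at all.
constexpr int callSequenceBytes(int direct, int indirect, int loads) {
  return direct * kBL + indirect * kBLX + loads * kLDR +
         (loads > 0 ? kLitPool : 0);
}
```

Under these assumptions, callSequenceBytes(4, 0, 0) gives 16 and callSequenceBytes(2, 2, 1) gives 18, matching the two totals above.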

Thanks,
Prathamesh




On Fri, 10 Apr 2020 at 04:22, Prathamesh Kulkarni
<prathamesh.kulkarni at linaro.org> wrote:
>
> Hi John,
> Thanks for the suggestions! I will start looking at adding
> foldMemoryOperand to ARMInstrInfo.
>
> Thanks,
> Prathamesh
>
> On Tue, 7 Apr 2020 at 23:55, John Brawn <John.Brawn at arm.com> wrote:
> >
> > If I'm understanding what's going on in this test correctly, what's happening is:
> >  * ARMTargetLowering::LowerCall prefers indirect calls when a function is called at least 3 times in minsize
> >  * In thumb 1 (without -fno-omit-frame-pointer) we have effectively only 3 callee-saved registers (r4-r6)
> >  * The function has three arguments, so those three plus the register we need to hold the function address is more than our callee-saved registers
> >  * Therefore something needs to be spilt
> >  * The function address can be rematerialized, so we spill that and insert an LDR before each call
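
The pressure argument in the list above can be written down as a toy check. This is only an illustrative model, not LLVM code: the constant and the function name are assumptions, and a real allocator works on live ranges rather than simple counts.

```cpp
// Toy model of the register-pressure argument; not LLVM code.
// Values that must survive a call need a callee-saved register,
// otherwise they have to be spilled or rematerialized.
constexpr int kThumb1CalleeSaved = 3;  // r4-r6, as stated in the list above

// Hypothetical helper: true if the values live across the calls do not
// all fit into the callee-saved registers.
constexpr bool mustSpillAcrossCalls(int liveArgs, bool keepCalleeAddress) {
  return liveArgs + (keepCalleeAddress ? 1 : 0) > kThumb1CalleeSaved;
}
```

With three live arguments plus the callee address the check is true, and with two arguments plus the address it is false, which is the "one less argument" case mentioned next.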
> >
> > If we didn't have this spilling happening (e.g. if the function had one fewer argument), then the code size of using BL vs BLX would be:
> >  * BL: 3*4-byte BL = 12 bytes
> >  * BLX: 3*2-byte BLX + 1*2-byte LDR + 4-byte litpool = 12 bytes
> > (So maybe, even without considering spilling, LowerCall should be adjusted to do this only for functions called 4 or more times)
> >
> > When we have to spill, if we compare spilling the function's address vs spilling an argument:
> >  * BLX with spilt fn: 3*2-byte BLX + 3*2-byte LDR + 4-byte litpool = 16 bytes
> >  * BLX with spilt arg: 3*2-byte BLX + 1*2-byte LDR + 4-byte litpool + 1*2-byte STR + 2*2-byte LDR = 18 bytes
> > So just changing the spilling heuristic won't work, since spilling an argument instead is even larger.
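
The size comparison above can be generalized to n calls with a short sketch. It assumes the same Thumb-1 sizes quoted in this thread. The function names are hypothetical, and the number of argument reloads is passed in explicitly because it depends on where the spilled argument is used.

```cpp
// Thumb-1 encoding sizes in bytes, as assumed in the estimates above.
constexpr int kBL = 4, kBLX = 2, kLDR = 2, kSTR = 2, kLitPool = 4;

// n direct calls.
constexpr int directBytes(int n) { return n * kBL; }

// n indirect calls, callee address loaded once and kept in a register.
constexpr int indirectBytes(int n) { return n * kBLX + kLDR + kLitPool; }

// n indirect calls, callee address reloaded before every call.
constexpr int indirectReloadBytes(int n) {
  return n * kBLX + n * kLDR + kLitPool;
}

// n indirect calls with one argument spilled instead: one store plus
// `argReloads` loads of that argument (an input, not derived here).
constexpr int indirectSpilledArgBytes(int n, int argReloads) {
  return indirectBytes(n) + kSTR + argReloads * kLDR;
}
```

For three calls this reproduces the 12, 12, 16 and 18 byte figures. For four calls the non-spilling indirect form is 14 bytes against 16 for direct calls, which is consistent with the remark about functions called 4 or more times.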
> >
> > The two ways I see of fixing this:
> >  * In LowerCall only prefer an indirect call if the number of integer register arguments is less than the number of callee-saved registers.
> >  * When the load of the function address is spilled, convert the BLX back into a BL instead of just rematerializing the load.
> >
> > The first of these would be easier, but there will be situations where we need to use fewer than three callee-saved registers (e.g. arguments are loaded from a pointer), and there are situations where we will spill the function address for reasons entirely unrelated to the function arguments (e.g. if we have enough live local variables).
> >
> > For the second, looking at InlineSpiller.cpp, it does have the concept of rematerializing by folding a memory operand into another instruction, so I think we could make use of that to do this. It looks like it would involve adding a foldMemoryOperand function to ARMInstrInfo and then having it fold an LDR into a BLX by turning it into a BL.
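
The folding idea described above can be illustrated with a toy model. The types and the function below are stand-ins invented for this sketch; they are not LLVM's MachineInstr or the real foldMemoryOperand interface, and they only show the shape of the rewrite (an indirect call through a loaded address becomes a direct call to the loaded symbol).

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for machine instructions; these are NOT LLVM types.
enum class ToyOpcode { LoadCalleeAddress, IndirectCall, DirectCall };

struct ToyInst {
  ToyOpcode opcode;
  int reg;             // register defined (load) or used (indirect call); -1 if none
  std::string symbol;  // callee name, stored by value in this toy model
};

// Hypothetical fold: if `call` is an indirect call through the register that
// `load` defines, rewrite it into a direct call to the loaded symbol.
// Returns true when the fold was applied, false if nothing was changed.
inline bool foldLoadIntoCall(const ToyInst &load, ToyInst &call) {
  if (load.opcode != ToyOpcode::LoadCalleeAddress ||
      call.opcode != ToyOpcode::IndirectCall || call.reg != load.reg)
    return false;
  assert(!load.symbol.empty() && "a callee-address load must name a symbol");
  call = ToyInst{ToyOpcode::DirectCall, -1, load.symbol};
  return true;
}
```

In this model, folding a load of "bar" into an indirect call through the same register yields a direct call to "bar", while a call through a different register is left untouched.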
> >
> > John
> >
> > ________________________________
> > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Prathamesh Kulkarni via llvm-dev <llvm-dev at lists.llvm.org>
> > Sent: 07 April 2020 21:07
> > To: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>
> > Subject: Re: [llvm-dev] [ARM] Register pressure with -mthumb forces register reload before each call
> >
> > On Tue, 31 Mar 2020 at 22:03, Prathamesh Kulkarni
> > <prathamesh.kulkarni at linaro.org> wrote:
> > >
> > > Hi,
> > > Compiling the attached test-case, which is a reduced version of
> > > uECC_shared_secret from the tinycrypt library [1], with
> > > --target=arm-linux-gnueabi -march=armv6-m -Oz -S
> > > results in reloading the register holding the function's address before
> > > every blx:
> > >
> > >         ldr       r3, .LCPI0_0
> > >         blx      r3
> > >         mov    r0, r6
> > >         mov    r1, r5
> > >         mov    r2, r4
> > >         ldr       r3, .LCPI0_0
> > >         blx       r3
> > >         ldr        r3, .LCPI0_0
> > >         mov     r0, r6
> > >         mov     r1, r5
> > >         mov     r2, r4
> > >         blx       r3
> > >
> > > .LCPI0_0:
> > >         .long   foo
> > >
> > > From the regalloc dump (attached), AFAIU, what seems to happen in the
> > > greedy allocator is that virt regs %0 to %3 are all live across the first two
> > > calls to foo. Thus %0, %1 and %2 get assigned r6, r5 and r4
> > > respectively, and %3, which holds foo's address, doesn't have any
> > > register left.
> > > Since its live range has the least weight, it does not evict any existing interval,
> > > and gets split. Eventually we have the following allocation:
> > >
> > > [%0 -> $r6] tGPR
> > > [%1 -> $r5] tGPR
> > > [%2 -> $r4] tGPR
> > > [%6 -> $r3] tGPR
> > > [%11 -> $r3] tGPR
> > > [%16 -> $r3] tGPR
> > > [%17 -> $r3] tGPR
> > >
> > > where %6, %11, %16 and %17 all are derived from %3.
> > > And since r3 is a call-clobbered register, the compiler is forced to
> > > reload foo's address
> > > each time before blx.
> > >
> > > To fix this, I thought of the following approaches:
> > > (a) Disable the heuristic in ARMTargetLowering::LowerCall that prefers an
> > > indirect call when there are at least 3 calls to the same function in a
> > > basic block, for the Thumb-1 ISA.
> > >
> > > (b) In ARMTargetLowering::LowerCall, add another constraint, such as the
> > > number of arguments, as a proxy for register pressure for Thumb-1, but
> > > that's bound to trip on other cases.
> > >
> > > (c) Give higher priority to allocating the virt reg used for indirect
> > > calls? However, if that results in spilling some other register, it would
> > > defeat the purpose of saving code size. I suppose ideally we want to trigger
> > > the heuristic of using an indirect call only when we know beforehand that it
> > > will not result in spilling. But I am not sure if it's possible to
> > > estimate that during isel?
> > >
> > > I would be grateful for suggestions on how to proceed further.
> > ping ?
> >
> > Thanks,
> > Prathamesh
> > >
> > > [1] https://github.com/intel/tinycrypt/blob/master/lib/source/ecc_dh.c#L139
> > >
> > > Thanks,
> > > Prathamesh
-------------- next part --------------
Computing live-in reg-units in ABI blocks.
0B	%bb.0 R0#0 R1#0 R2#0
Created 3 new intervals.
********** INTERVALS **********
R0 [0B,48r:0)[96r,144r:4)[192r,240r:3)[288r,336r:2)[384r,432r:1)  0@0B-phi 1@384r 2@288r 3@192r 4@96r
R1 [0B,32r:0)[112r,144r:4)[208r,240r:3)[304r,336r:2)[400r,432r:1)  0@0B-phi 1@400r 2@304r 3@208r 4@112r
R2 [0B,16r:0)[128r,144r:4)[224r,240r:3)[320r,336r:2)[416r,432r:1)  0@0B-phi 1@416r 2@320r 3@224r 4@128r
%0 [48r,416r:0)  0@48r weight:0.000000e+00
%1 [32r,400r:0)  0@32r weight:0.000000e+00
%2 [16r,320r:0)  0@16r weight:0.000000e+00
%3 [80r,432r:0)  0@80r weight:0.000000e+00
RegMasks: 144r 240r 336r 432r
********** MACHINEINSTRS **********
# Machine code for function f: NoPHIs, TracksLiveness
Constant Pool:
  cp#0: @bar, align=4
Function Live Ins: $r0 in %0, $r1 in %1, $r2 in %2

0B	bb.0.entry:
	  liveins: $r0, $r1, $r2
16B	  %2:tgpr = COPY $r2
32B	  %1:tgpr = COPY $r1
48B	  %0:tgpr = COPY $r0
64B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
80B	  %3:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
96B	  $r0 = COPY %0:tgpr
112B	  $r1 = COPY %1:tgpr
128B	  $r2 = COPY %2:tgpr
144B	  tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
160B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
176B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
192B	  $r0 = COPY %0:tgpr
208B	  $r1 = COPY %2:tgpr
224B	  $r2 = COPY %1:tgpr
240B	  tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
256B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
272B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
288B	  $r0 = COPY %1:tgpr
304B	  $r1 = COPY %0:tgpr
320B	  $r2 = COPY %2:tgpr
336B	  tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
352B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
368B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
384B	  $r0 = COPY %1:tgpr
400B	  $r1 = COPY %1:tgpr
416B	  $r2 = COPY %0:tgpr
432B	  tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
448B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
464B	  tBX_RET 14, $noreg

# End machine code for function f.

********** SIMPLE REGISTER COALESCING **********
********** Function: f
********** JOINING INTERVALS ***********
entry:
16B	%2:tgpr = COPY $r2
	Considering merging %2 with $r2
	Can only merge into reserved registers.
32B	%1:tgpr = COPY $r1
	Considering merging %1 with $r1
	Can only merge into reserved registers.
48B	%0:tgpr = COPY $r0
	Considering merging %0 with $r0
	Can only merge into reserved registers.
96B	$r0 = COPY %0:tgpr
	Considering merging %0 with $r0
	Can only merge into reserved registers.
112B	$r1 = COPY %1:tgpr
	Considering merging %1 with $r1
	Can only merge into reserved registers.
128B	$r2 = COPY %2:tgpr
	Considering merging %2 with $r2
	Can only merge into reserved registers.
192B	$r0 = COPY %0:tgpr
	Considering merging %0 with $r0
	Can only merge into reserved registers.
208B	$r1 = COPY %2:tgpr
	Considering merging %2 with $r1
	Can only merge into reserved registers.
224B	$r2 = COPY %1:tgpr
	Considering merging %1 with $r2
	Can only merge into reserved registers.
288B	$r0 = COPY %1:tgpr
	Considering merging %1 with $r0
	Can only merge into reserved registers.
304B	$r1 = COPY %0:tgpr
	Considering merging %0 with $r1
	Can only merge into reserved registers.
320B	$r2 = COPY %2:tgpr
	Considering merging %2 with $r2
	Can only merge into reserved registers.
384B	$r0 = COPY %1:tgpr
	Considering merging %1 with $r0
	Can only merge into reserved registers.
400B	$r1 = COPY %1:tgpr
	Considering merging %1 with $r1
	Can only merge into reserved registers.
416B	$r2 = COPY %0:tgpr
	Considering merging %0 with $r2
	Can only merge into reserved registers.
96B	$r0 = COPY %0:tgpr
	Considering merging %0 with $r0
	Can only merge into reserved registers.
112B	$r1 = COPY %1:tgpr
	Considering merging %1 with $r1
	Can only merge into reserved registers.
128B	$r2 = COPY %2:tgpr
	Considering merging %2 with $r2
	Can only merge into reserved registers.
192B	$r0 = COPY %0:tgpr
	Considering merging %0 with $r0
	Can only merge into reserved registers.
208B	$r1 = COPY %2:tgpr
	Considering merging %2 with $r1
	Can only merge into reserved registers.
224B	$r2 = COPY %1:tgpr
	Considering merging %1 with $r2
	Can only merge into reserved registers.
288B	$r0 = COPY %1:tgpr
	Considering merging %1 with $r0
	Can only merge into reserved registers.
304B	$r1 = COPY %0:tgpr
	Considering merging %0 with $r1
	Can only merge into reserved registers.
320B	$r2 = COPY %2:tgpr
	Considering merging %2 with $r2
	Can only merge into reserved registers.
384B	$r0 = COPY %1:tgpr
	Considering merging %1 with $r0
	Can only merge into reserved registers.
400B	$r1 = COPY %1:tgpr
	Considering merging %1 with $r1
	Can only merge into reserved registers.
416B	$r2 = COPY %0:tgpr
	Considering merging %0 with $r2
	Can only merge into reserved registers.
Trying to inflate 0 regs.
********** INTERVALS **********
R0 [0B,48r:0)[96r,144r:4)[192r,240r:3)[288r,336r:2)[384r,432r:1)  0@0B-phi 1@384r 2@288r 3@192r 4@96r
R1 [0B,32r:0)[112r,144r:4)[208r,240r:3)[304r,336r:2)[400r,432r:1)  0@0B-phi 1@400r 2@304r 3@208r 4@112r
R2 [0B,16r:0)[128r,144r:4)[224r,240r:3)[320r,336r:2)[416r,432r:1)  0@0B-phi 1@416r 2@320r 3@224r 4@128r
%0 [48r,416r:0)  0@48r weight:0.000000e+00
%1 [32r,400r:0)  0@32r weight:0.000000e+00
%2 [16r,320r:0)  0@16r weight:0.000000e+00
%3 [80r,432r:0)  0@80r weight:0.000000e+00
RegMasks: 144r 240r 336r 432r
********** MACHINEINSTRS **********
# Machine code for function f: NoPHIs, TracksLiveness
Constant Pool:
  cp#0: @bar, align=4
Function Live Ins: $r0 in %0, $r1 in %1, $r2 in %2

0B	bb.0.entry:
	  liveins: $r0, $r1, $r2
16B	  %2:tgpr = COPY $r2
32B	  %1:tgpr = COPY $r1
48B	  %0:tgpr = COPY $r0
64B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
80B	  %3:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
96B	  $r0 = COPY %0:tgpr
112B	  $r1 = COPY %1:tgpr
128B	  $r2 = COPY %2:tgpr
144B	  tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
160B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
176B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
192B	  $r0 = COPY %0:tgpr
208B	  $r1 = COPY %2:tgpr
224B	  $r2 = COPY %1:tgpr
240B	  tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
256B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
272B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
288B	  $r0 = COPY %1:tgpr
304B	  $r1 = COPY %0:tgpr
320B	  $r2 = COPY %2:tgpr
336B	  tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
352B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
368B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
384B	  $r0 = COPY %1:tgpr
400B	  $r1 = COPY %1:tgpr
416B	  $r2 = COPY %0:tgpr
432B	  tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
448B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
464B	  tBX_RET 14, $noreg

# End machine code for function f.

********** GREEDY REGISTER ALLOCATION **********
********** Function: f
********** INTERVALS **********
R0 [0B,48r:0)[96r,144r:4)[192r,240r:3)[288r,336r:2)[384r,432r:1)  0@0B-phi 1@384r 2@288r 3@192r 4@96r
R1 [0B,32r:0)[112r,144r:4)[208r,240r:3)[304r,336r:2)[400r,432r:1)  0@0B-phi 1@400r 2@304r 3@208r 4@112r
R2 [0B,16r:0)[128r,144r:4)[224r,240r:3)[320r,336r:2)[416r,432r:1)  0@0B-phi 1@416r 2@320r 3@224r 4@128r
%0 [48r,416r:0)  0@48r weight:6.575521e-03
%1 [32r,400r:0)  0@32r weight:7.890625e-03
%2 [16r,320r:0)  0@16r weight:5.738636e-03
%3 [80r,432r:0)  0@80r weight:3.324468e-03
RegMasks: 144r 240r 336r 432r
********** MACHINEINSTRS **********
# Machine code for function f: NoPHIs, TracksLiveness
Constant Pool:
  cp#0: @bar, align=4
Function Live Ins: $r0 in %0, $r1 in %1, $r2 in %2

0B	bb.0.entry:
	  liveins: $r0, $r1, $r2
16B	  %2:tgpr = COPY $r2
32B	  %1:tgpr = COPY $r1
48B	  %0:tgpr = COPY $r0
64B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
80B	  %3:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
96B	  $r0 = COPY %0:tgpr
112B	  $r1 = COPY %1:tgpr
128B	  $r2 = COPY %2:tgpr
144B	  tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
160B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
176B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
192B	  $r0 = COPY %0:tgpr
208B	  $r1 = COPY %2:tgpr
224B	  $r2 = COPY %1:tgpr
240B	  tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
256B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
272B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
288B	  $r0 = COPY %1:tgpr
304B	  $r1 = COPY %0:tgpr
320B	  $r2 = COPY %2:tgpr
336B	  tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
352B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
368B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
384B	  $r0 = COPY %1:tgpr
400B	  $r1 = COPY %1:tgpr
416B	  $r2 = COPY %0:tgpr
432B	  tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
448B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
464B	  tBX_RET 14, $noreg

# End machine code for function f.


selectOrSplit tGPR:%0 [48r,416r:0)  0@48r weight:6.575521e-03 w=6.575521e-03
AllocationOrder(tGPR) = [ $r0 $r1 $r2 $r3 $r4 $r5 $r6 ]
hints: $r0 $r1 $r2
missed hint $r0
assigning %0 to $r4: R4 [48r,416r:0)  0@48r

selectOrSplit tGPR:%1 [32r,400r:0)  0@32r weight:7.890625e-03 w=7.890625e-03
hints: $r1 $r0 $r2
missed hint $r1
assigning %1 to $r5: R5 [32r,400r:0)  0@32r

selectOrSplit tGPR:%2 [16r,320r:0)  0@16r weight:5.738636e-03 w=5.738636e-03
hints: $r2 $r1
missed hint $r2
assigning %2 to $r6: R6 [16r,320r:0)  0@16r

selectOrSplit tGPR:%3 [80r,432r:0)  0@80r weight:3.324468e-03 w=3.324468e-03
RS_Assign Cascade 0
wait for second round
queuing new interval: %3 [80r,432r:0)  0@80r weight:3.324468e-03

selectOrSplit tGPR:%3 [80r,432r:0)  0@80r weight:3.324468e-03 w=3.324468e-03
RS_Split Cascade 0
Analyze counted 5 instrs in 1 blocks, through 0 blocks.
tryLocalSplit:  80r 144r 240r 336r 432r
4 regmasks in block: 144r:80r-144r 144r:144r-240r 240r:240r-336r 336r:336r-432r
$r0 80r-144r i=INF extend
$r0 144r-240r i=INF extend
$r0 240r-336r i=INF extend
$r0 336r-432r i=INF end
$r1 80r-144r i=INF extend
$r1 144r-240r i=INF extend
$r1 240r-336r i=INF extend
$r1 336r-432r i=INF end
$r2 80r-144r i=INF extend
$r2 144r-240r i=INF extend
$r2 240r-336r i=INF extend
$r2 336r-432r i=INF end
$r3 80r-144r i=INF extend
$r3 144r-240r i=INF extend
$r3 240r-336r i=INF extend
$r3 336r-432r i=INF end
$r4 80r-144r i=6.575521e-03 w=6.250000e-03 extend
$r4 144r-240r i=6.575521e-03 w=7.575758e-03 (best) extend
$r4 144r-336r i=6.575521e-03 w=8.012821e-03 (best) extend
$r4 144r-432r i=6.575521e-03 w=7.102273e-03 end
$r5 80r-144r i=7.890625e-03 w=6.250000e-03 extend
$r5 144r-240r i=7.890625e-03 w=7.575758e-03 extend
$r5 240r-336r i=7.890625e-03 w=7.575758e-03 extend
$r5 336r-432r i=7.890625e-03 w=5.859375e-03 end
$r6 80r-144r i=5.738636e-03 w=6.250000e-03 extend
$r6 80r-240r i=5.738636e-03 w=6.944444e-03 extend
$r6 80r-336r i=5.738636e-03 w=7.440476e-03 (best) extend
$r6 80r-432r i=5.738636e-03 all
Best local split range: 80r-336r, 1.667770e-03, 4 instrs
    enterIntvBefore 80r: not live
    leaveIntvAfter 336r: valno 0
    useIntv [80B;344r): [80B;344r):1
  blit [80r,432r:0): [80r;344r)=1(%5):0 [344r;432r)=0(%4):0
  rewr %bb.0	80r:1	%5:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
  rewr %bb.0	144B:1	tBLXr 14, $noreg, %5:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	240B:1	tBLXr 14, $noreg, %5:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	336B:1	tBLXr 14, $noreg, %5:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	432B:0	tBLXr 14, $noreg, %4:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	344B:1	%4:tgpr = COPY %5:tgpr
Tagging non-progress ranges: %5
queuing new interval: %4 [344r,432r:0)  0@344r weight:2.069672e-03
queuing new interval: %5 [80r,344r:0)  0@80r weight:3.802711e-03

selectOrSplit tGPR:%5 [80r,344r:0)  0@80r weight:3.802711e-03 w=3.802711e-03
RS_Split2 Cascade 0
Analyze counted 5 instrs in 1 blocks, through 0 blocks.
tryLocalSplit:  80r 144r 240r 336r 344r
4 regmasks in block: 144r:80r-144r 144r:144r-240r 240r:240r-336r 336r:336r-344r
$r0 80r-144r i=INF extend
$r0 144r-240r i=INF extend
$r0 240r-336r i=INF extend
$r0 336r-344r i=INF end
$r1 80r-144r i=INF extend
$r1 144r-240r i=INF extend
$r1 240r-336r i=INF extend
$r1 336r-344r i=INF end
$r2 80r-144r i=INF extend
$r2 144r-240r i=INF extend
$r2 240r-336r i=INF extend
$r2 336r-344r i=INF end
$r3 80r-144r i=INF extend
$r3 144r-240r i=INF extend
$r3 240r-336r i=INF extend
$r3 336r-344r i=INF end
$r4 80r-144r i=6.575521e-03 w=6.250000e-03 extend
$r4 144r-240r i=6.575521e-03 w=7.575758e-03 (best) extend
$r4 144r-336r i=6.575521e-03 shrink
$r4 240r-336r i=6.575521e-03 w=7.575758e-03 (best) extend
$r4 240r-344r i=6.575521e-03 w=7.692308e-03 (best) end
$r5 80r-144r i=7.890625e-03 w=6.250000e-03 extend
$r5 144r-240r i=7.890625e-03 w=7.575758e-03 extend
$r5 240r-336r i=7.890625e-03 w=7.575758e-03 extend
$r5 336r-344r i=7.890625e-03 w=7.075472e-03 end
$r6 80r-144r i=5.738636e-03 w=6.250000e-03 extend
$r6 80r-240r i=5.738636e-03 w=6.944444e-03 (best) extend
$r6 80r-336r i=5.738636e-03 shrink
$r6 144r-336r i=5.738636e-03 shrink
$r6 240r-336r i=5.738636e-03 w=7.575758e-03 (best) extend
$r6 240r-344r i=5.738636e-03 w=7.692308e-03 (best) end
Best local split range: 240r-344r, 1.914560e-03, 3 instrs
    enterIntvBefore 240r: valno 0
    leaveIntvAfter 344r: not live
    useIntv [232r;352B): [232r;352B):1
  blit [80r,344r:0): [80r;232r)=0(%6):0 [232r;344r)=1(%7):0
  rewr %bb.0	80r:0	%6:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
  rewr %bb.0	144B:0	tBLXr 14, $noreg, %6:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	240B:1	tBLXr 14, $noreg, %7:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	336B:1	tBLXr 14, $noreg, %7:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	344B:1	%4:tgpr = COPY %7:tgpr
  rewr %bb.0	232B:0	%7:tgpr = COPY %6:tgpr
queuing new interval: %6 [80r,232r:0)  0@80r weight:2.744565e-03
queuing new interval: %7 [232r,344r:0)  0@232r weight:3.945312e-03

selectOrSplit tGPR:%6 [80r,232r:0)  0@80r weight:2.744565e-03 w=2.744565e-03
RS_Assign Cascade 0
wait for second round
queuing new interval: %6 [80r,232r:0)  0@80r weight:2.744565e-03

selectOrSplit tGPR:%7 [232r,344r:0)  0@232r weight:3.945312e-03 w=3.945312e-03
RS_Assign Cascade 0
wait for second round
queuing new interval: %7 [232r,344r:0)  0@232r weight:3.945312e-03

selectOrSplit tGPR:%4 [344r,432r:0)  0@344r weight:2.069672e-03 w=2.069672e-03
assigning %4 to $r3: R3 [344r,432r:0)  0@344r

selectOrSplit tGPR:%6 [80r,232r:0)  0@80r weight:2.744565e-03 w=2.744565e-03
RS_Split Cascade 0
Analyze counted 3 instrs in 1 blocks, through 0 blocks.
tryLocalSplit:  80r 144r 232r
4 regmasks in block: 144r:80r-144r 144r:144r-232r
$r0 80r-144r i=INF extend
$r0 144r-232r i=INF end
$r1 80r-144r i=INF extend
$r1 144r-232r i=INF end
$r2 80r-144r i=INF extend
$r2 144r-232r i=INF end
$r3 80r-144r i=INF extend
$r3 144r-232r i=INF end
$r4 80r-144r i=6.575521e-03 w=6.250000e-03 extend
$r4 144r-232r i=6.575521e-03 w=5.952381e-03 end
$r5 80r-144r i=7.890625e-03 w=6.250000e-03 extend
$r5 144r-232r i=7.890625e-03 w=5.952381e-03 end
$r6 80r-144r i=5.738636e-03 w=6.250000e-03 (best) extend
$r6 80r-232r i=5.738636e-03 all
Best local split range: 80r-144r, 5.011263e-04, 2 instrs
    enterIntvBefore 80r: not live
    leaveIntvAfter 144r: valno 0
    useIntv [80B;152r): [80B;152r):1
  blit [80r,232r:0): [80r;152r)=1(%9):0 [152r;232r)=0(%8):0
  rewr %bb.0	80r:1	%9:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
  rewr %bb.0	144B:1	tBLXr 14, $noreg, %9:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	232B:0	%7:tgpr = COPY %8:tgpr
  rewr %bb.0	152B:1	%8:tgpr = COPY %9:tgpr
Tagging non-progress ranges: %9
queuing new interval: %8 [152r,232r:0)  0@152r weight:2.104167e-03
queuing new interval: %9 [80r,152r:0)  0@80r weight:3.209746e-03

selectOrSplit tGPR:%9 [80r,152r:0)  0@80r weight:3.209746e-03 w=3.209746e-03
RS_Split2 Cascade 0
Analyze counted 3 instrs in 1 blocks, through 0 blocks.
tryLocalSplit:  80r 144r 152r
4 regmasks in block: 144r:80r-144r 144r:144r-152r
$r0 80r-144r i=INF extend
$r0 144r-152r i=INF end
$r1 80r-144r i=INF extend
$r1 144r-152r i=INF end
$r2 80r-144r i=INF extend
$r2 144r-152r i=INF end
$r3 80r-144r i=INF extend
$r3 144r-152r i=INF end
$r4 80r-144r i=6.575521e-03 extend
$r4 144r-152r i=6.575521e-03 end
$r5 80r-144r i=7.890625e-03 extend
$r5 144r-152r i=7.890625e-03 end
$r6 80r-144r i=5.738636e-03 extend
$r6 144r-152r i=5.738636e-03 end
Inline spilling tGPR:%9 [80r,152r:0)  0@80r weight:3.209746e-03
From original %3
	also spill snippet %8 [152r,232r:0)  0@152r weight:2.104167e-03
  tBL 14, $noreg, &bar, implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2
	folded:   144r	tBL 14, $noreg, &"\E0\9C\06\A0\FC\7F", implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2 :: (load 4 from constant-pool)
	remat:  228r	%10:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
	        232e	%7:tgpr = COPY killed %10:tgpr

All defs dead: dead %9:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
All defs dead: dead %8:tgpr = COPY %9:tgpr
Remat created 2 dead defs.
Deleting dead def 152r	dead %8:tgpr = COPY %9:tgpr
Deleting dead def 80r	dead %9:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
0 registers to spill after remat.
queuing new interval: %10 [228r,232r:0)  0@228r weight:INF

selectOrSplit tGPR:%10 [228r,232r:0)  0@228r weight:INF w=INF
assigning %10 to $r3: R3 [228r,232r:0)  0@228r
Dropping unused %8 EMPTY weight:2.104167e-03

selectOrSplit tGPR:%7 [232r,344r:0)  0@232r weight:3.945312e-03 w=3.945312e-03
hints: $r3
RS_Split Cascade 0
Analyze counted 4 instrs in 1 blocks, through 0 blocks.
tryLocalSplit:  232r 240r 336r 344r
4 regmasks in block: 240r:232r-240r 240r:240r-336r 336r:336r-344r
$r3 232r-240r i=INF extend
$r3 240r-336r i=INF extend
$r3 336r-344r i=INF end
$r0 232r-240r i=INF extend
$r0 240r-336r i=INF extend
$r0 336r-344r i=INF end
$r1 232r-240r i=INF extend
$r1 240r-336r i=INF extend
$r1 336r-344r i=INF end
$r2 232r-240r i=INF extend
$r2 240r-336r i=INF extend
$r2 336r-344r i=INF end
$r4 232r-240r i=6.575521e-03 w=7.075472e-03 (best) extend
$r4 232r-336r i=6.575521e-03 w=7.692308e-03 (best) extend
$r4 232r-344r i=6.575521e-03 all
$r5 232r-240r i=7.890625e-03 w=7.075472e-03 extend
$r5 240r-336r i=7.890625e-03 w=7.575758e-03 extend
$r5 336r-344r i=7.890625e-03 w=7.075472e-03 end
$r6 232r-240r i=5.738636e-03 w=7.075472e-03 (best) extend
$r6 232r-336r i=5.738636e-03 w=7.692308e-03 (best) extend
$r6 232r-344r i=5.738636e-03 all
Best local split range: 232r-336r, 1.914560e-03, 3 instrs
    enterIntvBefore 232r: not live
    leaveIntvAfter 336r: valno 0
    useIntv [232B;340r): [232B;340r):1
  blit [232r,344r:0): [232r;340r)=1(%13):0 [340r;344r)=0(%12):0
  rewr %bb.0	232r:1	%13:tgpr = COPY %10:tgpr
  rewr %bb.0	240B:1	tBLXr 14, $noreg, %13:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	336B:1	tBLXr 14, $noreg, %13:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	344B:0	%4:tgpr = COPY %12:tgpr
  rewr %bb.0	340B:1	%12:tgpr = COPY %13:tgpr
Tagging non-progress ranges: %13
queuing new interval: %12 [340r,344r:0)  0 at 340r weight:INF
queuing new interval: %13 [232r,340r:0)  0 at 232r weight:3.976378e-03

selectOrSplit tGPR:%13 [232r,340r:0)  0 at 232r weight:3.976378e-03 w=3.976378e-03
hints: $r3
RS_Split2 Cascade 0
Analyze counted 4 instrs in 1 blocks, through 0 blocks.
tryLocalSplit:  232r 240r 336r 340r
4 regmasks in block: 240r:232r-240r 240r:240r-336r 336r:336r-340r
$r3 232r-240r i=INF extend
$r3 240r-336r i=INF extend
$r3 336r-340r i=INF end
$r0 232r-240r i=INF extend
$r0 240r-336r i=INF extend
$r0 336r-340r i=INF end
$r1 232r-240r i=INF extend
$r1 240r-336r i=INF extend
$r1 336r-340r i=INF end
$r2 232r-240r i=INF extend
$r2 240r-336r i=INF extend
$r2 336r-340r i=INF end
$r4 232r-240r i=6.575521e-03 w=7.075472e-03 (best) extend
$r4 232r-336r i=6.575521e-03 shrink
$r4 240r-336r i=6.575521e-03 extend
$r4 336r-340r i=6.575521e-03 w=7.142857e-03 (best) end
$r5 232r-240r i=7.890625e-03 w=7.075472e-03 extend
$r5 240r-336r i=7.890625e-03 extend
$r5 336r-340r i=7.890625e-03 w=7.142857e-03 end
$r6 232r-240r i=5.738636e-03 w=7.075472e-03 (best) extend
$r6 232r-336r i=5.738636e-03 shrink
$r6 240r-336r i=5.738636e-03 extend
$r6 336r-340r i=0.000000e+00 w=7.142857e-03 (best) end
Best local split range: 336r-340r, 6.999861e-03, 2 instrs
    enterIntvBefore 336r: valno 0
    leaveIntvAfter 340r: not live
    useIntv [328r;344B): [328r;344B):1
  blit [232r,340r:0): [232r;328r)=0(%14):0 [328r;340r)=1(%15):0
  rewr %bb.0	232r:0	%14:tgpr = COPY %10:tgpr
  rewr %bb.0	240B:0	tBLXr 14, $noreg, %14:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	336B:1	tBLXr 14, $noreg, %15:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	340B:1	%12:tgpr = COPY %15:tgpr
  rewr %bb.0	328B:0	%15:tgpr = COPY %14:tgpr
queuing new interval: %14 [232r,328r:0)  0 at 232r weight:3.054435e-03
queuing new interval: %15 [328r,340r:0)  0 at 328r weight:3.677184e-03

selectOrSplit tGPR:%14 [232r,328r:0)  0 at 232r weight:3.054435e-03 w=3.054435e-03
hints: $r3
RS_Assign Cascade 0
wait for second round
queuing new interval: %14 [232r,328r:0)  0 at 232r weight:3.054435e-03

selectOrSplit tGPR:%12 [340r,344r:0)  0 at 340r weight:INF w=INF
hints: $r3
assigning %12 to $r3: R3 [340r,344r:0)  0 at 340r

selectOrSplit tGPR:%15 [328r,340r:0)  0 at 328r weight:3.677184e-03 w=3.677184e-03
hints: $r3
assigning %15 to $r6: R6 [328r,340r:0)  0 at 328r

selectOrSplit tGPR:%14 [232r,328r:0)  0 at 232r weight:3.054435e-03 w=3.054435e-03
hints: $r3 $r6
RS_Split Cascade 0
Analyze counted 3 instrs in 1 blocks, through 0 blocks.
tryLocalSplit:  232r 240r 328r
4 regmasks in block: 240r:232r-240r 240r:240r-328r
$r3 232r-240r i=INF extend
$r3 240r-328r i=INF end
$r6 232r-240r i=5.738636e-03 w=7.075472e-03 (best) extend
$r6 232r-328r i=5.738636e-03 all
$r0 232r-240r i=INF extend
$r0 240r-328r i=INF end
$r1 232r-240r i=INF extend
$r1 240r-328r i=INF end
$r2 232r-240r i=INF extend
$r2 240r-328r i=INF end
$r4 232r-240r i=6.575521e-03 w=7.075472e-03 extend
$r4 232r-328r i=6.575521e-03 all
$r5 232r-240r i=7.890625e-03 w=7.075472e-03 extend
$r5 240r-328r i=7.890625e-03 w=5.952381e-03 end
Best local split range: 232r-240r, 1.310072e-03, 2 instrs
    enterIntvBefore 232r: not live
    leaveIntvAfter 240r: valno 0
    useIntv [232B;248r): [232B;248r):1
  blit [232r,328r:0): [232r;248r)=1(%17):0 [248r;328r)=0(%16):0
  rewr %bb.0	232r:1	%17:tgpr = COPY %10:tgpr
  rewr %bb.0	240B:1	tBLXr 14, $noreg, %17:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
  rewr %bb.0	328B:0	%15:tgpr = COPY %16:tgpr
  rewr %bb.0	248B:1	%16:tgpr = COPY %17:tgpr
Tagging non-progress ranges: %17
queuing new interval: %16 [248r,328r:0)  0 at 248r weight:2.104167e-03
queuing new interval: %17 [232r,248r:0)  0 at 232r weight:3.641827e-03

selectOrSplit tGPR:%17 [232r,248r:0)  0 at 232r weight:3.641827e-03 w=3.641827e-03
hints: $r3
RS_Split2 Cascade 0
Analyze counted 3 instrs in 1 blocks, through 0 blocks.
tryLocalSplit:  232r 240r 248r
4 regmasks in block: 240r:232r-240r 240r:240r-248r
$r3 232r-240r i=INF extend
$r3 240r-248r i=INF end
$r0 232r-240r i=INF extend
$r0 240r-248r i=INF end
$r1 232r-240r i=INF extend
$r1 240r-248r i=INF end
$r2 232r-240r i=INF extend
$r2 240r-248r i=INF end
$r4 232r-240r i=6.575521e-03 extend
$r4 240r-248r i=6.575521e-03 end
$r5 232r-240r i=7.890625e-03 extend
$r5 240r-248r i=7.890625e-03 end
$r6 232r-240r i=5.738636e-03 extend
$r6 240r-248r i=5.738636e-03 end
Inline spilling tGPR:%17 [232r,248r:0)  0 at 232r weight:3.641827e-03
From original %3
	also spill snippet %10 [228r,232r:0)  0 at 228r weight:INF
	also spill snippet %16 [248r,328r:0)  0 at 248r weight:2.104167e-03
  tBL 14, $noreg, &bar, implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2
	folded:   240r	tBL 14, $noreg, &"\E0\9C\06\A0\FC\7F", implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2 :: (load 4 from constant-pool)
	remat:  324r	%18:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
	        328e	%15:tgpr = COPY killed %18:tgpr

All defs dead: dead %17:tgpr = COPY %10:tgpr
All defs dead: dead %10:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
All defs dead: dead %16:tgpr = COPY %17:tgpr
Remat created 3 dead defs.
Deleting dead def 248r	dead %16:tgpr = COPY %17:tgpr
Deleting dead def 228r	dead %10:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
unassigning %10 from $r3: R3
Deleting dead def 232r	dead %17:tgpr = COPY %10:tgpr
Shrink: %10 EMPTY weight:INF
Shrunk: %10 EMPTY weight:INF
0 registers to spill after remat.
queuing new interval: %18 [324r,328r:0)  0 at 324r weight:INF

selectOrSplit tGPR:%18 [324r,328r:0)  0 at 324r weight:INF w=INF
hints: $r6
assigning %18 to $r6: R6 [324r,328r:0)  0 at 324r
Dropping unused %16 EMPTY weight:2.104167e-03
Dropping unused %10 EMPTY weight:INF
Trying to reconcile hints for: %0($r4)
%0($r4) is recolorable.
Trying to reconcile hints for: %1($r5)
%1($r5) is recolorable.
Trying to reconcile hints for: %2($r6)
%2($r6) is recolorable.
********** REWRITE VIRTUAL REGISTERS **********
********** Function: f
********** REGISTER MAP **********
[%0 -> $r4] tGPR
[%1 -> $r5] tGPR
[%2 -> $r6] tGPR
[%4 -> $r3] tGPR
[%12 -> $r3] tGPR
[%15 -> $r6] tGPR
[%18 -> $r6] tGPR

0B	bb.0.entry:
	  liveins: $r0, $r1, $r2
16B	  %2:tgpr = COPY $r2
32B	  %1:tgpr = COPY $r1
48B	  %0:tgpr = COPY $r0
64B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
96B	  $r0 = COPY %0:tgpr
112B	  $r1 = COPY %1:tgpr
128B	  $r2 = COPY %2:tgpr
144B	  tBL 14, $noreg, &"\94p\10\09", implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2 :: (load 4 from constant-pool)
160B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
176B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
192B	  $r0 = COPY %0:tgpr
208B	  $r1 = COPY %2:tgpr
224B	  $r2 = COPY %1:tgpr
240B	  tBL 14, $noreg, &"", implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2 :: (load 4 from constant-pool)
256B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
272B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
288B	  $r0 = COPY %1:tgpr
304B	  $r1 = COPY %0:tgpr
320B	  $r2 = COPY killed %2:tgpr
324B	  %18:tgpr = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
328B	  %15:tgpr = COPY killed %18:tgpr
336B	  tBLXr 14, $noreg, %15:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
340B	  %12:tgpr = COPY killed %15:tgpr
344B	  %4:tgpr = COPY killed %12:tgpr
352B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
368B	  ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
384B	  $r0 = COPY %1:tgpr
400B	  $r1 = COPY killed %1:tgpr
416B	  $r2 = COPY killed %0:tgpr
432B	  tBLXr 14, $noreg, killed %4:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
448B	  ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
464B	  tBX_RET 14, $noreg
> renamable $r6 = COPY $r2
> renamable $r5 = COPY $r1
> renamable $r4 = COPY $r0
> ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> $r0 = COPY renamable $r4
> $r1 = COPY renamable $r5
> $r2 = COPY renamable $r6
> tBL 14, $noreg, &"\06", implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2 :: (load 4 from constant-pool)
> ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> $r0 = COPY renamable $r4
> $r1 = COPY renamable $r6
> $r2 = COPY renamable $r5
> tBL 14, $noreg, &"\06", implicit-def $lr, implicit $sp, implicit killed $r0, implicit killed $r1, implicit killed $r2 :: (load 4 from constant-pool)
> ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> $r0 = COPY renamable $r5
> $r1 = COPY renamable $r4
> $r2 = COPY killed renamable $r6
> renamable $r6 = tLDRpci %const.0, 14, $noreg :: (load 4 from constant-pool)
> renamable $r6 = COPY killed renamable $r6
Identity copy: renamable $r6 = COPY killed renamable $r6
  deleted.
> tBLXr 14, $noreg, renamable $r6, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
> renamable $r3 = COPY killed renamable $r6
> renamable $r3 = COPY killed renamable $r3
Identity copy: renamable $r3 = COPY killed renamable $r3
  deleted.
> ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> $r0 = COPY renamable $r5
> $r1 = COPY killed renamable $r5
> $r2 = COPY killed renamable $r4
> tBLXr 14, $noreg, killed renamable $r3, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
> ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
> tBX_RET 14, $noreg
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-611-2.diff
Type: text/x-patch
Size: 2620 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200415/aa64cfbd/attachment-0001.bin>

