[llvm-dev] linear-scan RA

Mon Sep 10 17:25:15 PDT 2018

> On Sep 10, 2018, at 5:11 PM, Preston Briggs <preston.briggs at gmail.com> wrote:
> 
> The phi instruction is irrelevant; just the way I think about things.
> The question is if the allocator believes that t0 and t2 interfere.
> 
> Perhaps the coalescing example was too simple.
> In the general case, we can't coalesce without a notion of interference.
> My worry is that looking at interference by ranges of instruction numbers
> leads to inaccuracies when a range is introduced by a copy.

I can't think of a case where we wouldn't have the same accuracy as a graph coloring allocator.
If you can construct an example where we don't I'd love to hear about it.

If you wonder about the liveness information, you can perform experiments like this (I'm using a NOOP instruction with an implicit use operand to produce some artificial uses).

$ cat test.mir
name: somefunc
body: |
  bb.0:
    %0:gr32 = MOV32ri 42
    JB_1 %bb.2, undef implicit $eflags
    JMP_1 %bb.2

  bb.1:
    %1:gr32 = MOV32ri 17
    JMP_1 %bb.3

  bb.2:
    NOOP implicit %0
    %1 = COPY %0
    JMP_1 %bb.3

  bb.3:
    NOOP implicit %1

$ llc -run-pass=liveintervals -debug-only=regalloc test.mir
********** INTERVALS **********
%0 [16r,64B:0)[112B,144r:0)  0 at 16r weight:0.000000e+00
%1 [80r,112B:1)[144r,176B:0)[176B,192r:2)  0 at 144r 1 at 80r 2 at 176B-phi weight:0.000000e+00
RegMasks:
********** MACHINEINSTRS **********
# Machine code for function somefunc: NoPHIs

0B	bb.0:
	  successors: %bb.2(0x80000000); %bb.2(100.00%)

16B	  %0:gr32 = MOV32ri 42
32B	  JB_1 %bb.2, implicit undef $eflags
48B	  JMP_1 %bb.2

64B	bb.1:
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

80B	  %1:gr32 = MOV32ri 17
96B	  JMP_1 %bb.3

112B	bb.2:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

128B	  NOOP implicit %0:gr32
144B	  %1:gr32 = COPY %0:gr32
160B	  JMP_1 %bb.3

176B	bb.3:
	; predecessors: %bb.1, %bb.2

192B	  NOOP implicit %1:gr32

# End machine code for function somefunc.

If you look at the "intervals" (the class is a misnomer since nowadays it contains a list of ranges...) in the beginning you see that %0 and %1 do not overlap anywhere.

- Matthias

> 
> But perhaps I should focus on the links and, as you suggested,
> the debugging info.
> 
> Thanks,
> Preston
> 
> 
> 
> 
> On Mon, Sep 10, 2018 at 5:02 PM, Matthias Braun <mbraun at apple.com <mailto:mbraun at apple.com>> wrote:
> 
> 
>> On Sep 10, 2018, at 4:53 PM, Preston Briggs <preston.briggs at gmail.com <mailto:preston.briggs at gmail.com>> wrote:
>> 
>> 
>> > The underlying liveness datastructure is a list of ranges where each vreg is alive
>> > (ranges in terms of instructions numbered). I remember a couple of later linear scan
>> > papers describing the same thing (Traub et.al <http://et.al/>. being the first if I remember correctly).
>> > That should be as accurate as you can get in terms of liveness information.
>> 
>> It depends on the details.
>> For example, given
>> 
>> t0 = mumble
>> if (something) {
>>   t2 = 3
>> }
>> else {
>>   t3 = t0 + 3
>>   print t0
>> }
>> t4 = phi(t2, t3)
>> 
>> it's clear that t2 and t0 shouldn't interfere,
>> but some folks might say the ranges overlap.
>> 
>> Similarly,
>> 
>> t6 = mumble
>> t7 = t6
>> t8 = t6 + 5
>> t9 = t7 + 10
>> print t8, t9
>> 
>> Chaitin points out that t6 and t7 shouldn't interfere,
>> even though the live ranges overlap.
> 
> - We go out of SSA form before allocating registers so you won't see phi instruction.
> - In the second case the copy coalesceing pass should have coalesced t6 and t7.
> 
> I would expect both cases to work as you expect.
> 
> - Matthias
> 
> 
>> 
>> Anyway, I'll look at the links.
>> 
>> Thanks,
>> Preston
>> 
>> 
>> 
>>  
>> 
>> We have separate aggressive coalescing pass before allocation. The register allocator will later perform live range splitting inside the main allocation loop as it seems fit.
>> 
>>> 
>>> I ask these questions because we (guys I work with) see loops
>>> where there's a little register juggling that seems unnecessary.
>> 
>> Well your best bet is to learn reading the output of `-mllvm -print-machineinstrs -mllvm -debug-only=regalloc` (which can take a while)...
>> 
>>> 
>>> Is there a paper that describes what y'all do?
>> 
>> I'm only aware of a blog post:
>> http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html <http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html>
>> and a dev conference talk in 2011:
>> https://llvm.org/devmtg/2011-11/ <https://llvm.org/devmtg/2011-11/>
>> 
>> - Matthias
>> 
>>> 
>>> Thanks,
>>> Preston
>>> 
>>> 
>>> On Mon, Sep 10, 2018 at 9:57 AM, Matthias Braun <mbraun at apple.com <mailto:mbraun at apple.com>> wrote:
>>> I would not describe LLVMs register allocator as linear scan, it's closer to graph coloring than linear scan IMO (though doesn't really matcher either approach).
>>> 
>>> RegAllocGreedy assigns the registers in an order based on the priority value computed in enqueu() we are not just scanning from top to bottom of the program. We also perform actual interference checks we just don't happen to build up an explicit interference graph up front.
>>> 
>>> - Matthias
>>> 
>>>> On Sep 10, 2018, at 9:49 AM, Preston Briggs via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>>> 
>>>> Why have we ended up using linear-scan register allocation
>>>> by default (instead of, e.g., coloring)?
>>>> 
>>>> Thanks,
>>>> Preston
>>>> 
>>>> 
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>> 
>>> 
>> 
>> 
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180910/a6b25fbb/attachment.html>