<div dir="ltr">+Lang who might still have some register allocation knowledge kicking around (but his responses may be a bit delayed)<br><br>Generally there's a bunch of algorithms that don't scale really well on especially large function - and I'd encourage users to modify any code generators to chunk up functions, or they'll be chasing down these sort of issues indefinitely, really.<br><br>But improvements are certainly appreciated - though I don't have any particular knowledge or pointers regarding how to improve the register allocator. (my gut reaction would be that probably a lot of people have looked and haven't found better tradeoffs - but I'm a bit of a pessimist, fresh eyes can often help :) )</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Feb 10, 2021 at 10:48 AM Dwight Guth via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>

<br>

I am the maintainer of an open source compiler that makes use of the<br>

LLVM toolchain to perform code generation. A user of our compiler, who<br>

is working on a proprietary project, recently shared with me an<br>

example where our toolchain was invoking `llc -O0` on a file with a<br>

rather large function in it, and the register allocation pass was<br>

taking upwards of 5 minutes. This seemed unusual to me, so I<br>

downloaded the debug symbols for LLVM (this was with LLVM 11.0.0), and<br>

I found that slightly over 50% of the runtime of LLC was spent within<br>

a very small loop within llvm::MachineRegisterInfo::hasOneNonDBGUse. I<br>

believe this loop to correspond to the while loop in<br>

llvm::MachineRegisterInfo::defusechain_iterator::advance.<br>

<br>

It looks like this function is just scanning through all the defs and<br>

uses of a register in the function until it finds one that it<br>

considers satisfactory? That seems like it would be introducing<br>

behavior that is quadratic in the size of the function to me. Am I<br>

missing something? Is there some other, more sensible reason why the<br>

profile seems so dependent on this one loop of code? Did I<br>

misunderstand something?<br>

<br>

My end goal here would be to submit a patch that might optimize this<br>

case, since it seems to me something that might be able to be computed<br>

more efficiently. But I don't understand the code or the algorithm<br>

hugely well, so I was hoping someone could give me some pointers on<br>

where to get started with something like this.<br>

<br>

Does anyone have any suggestions on what I could do to make register<br>

allocation run faster on quite large functions?<br>

<br>

Thanks,<br>

<br>

--<br>

Dwight Guth<br>

<br>

Chief Information Officer<br>

Runtime Verification, Inc.<br>

Email: <a href="mailto:dwight.guth@runtimeverification.com" target="_blank">dwight.guth@runtimeverification.com</a><br>

_______________________________________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

</blockquote></div>