[LLVMdev] BigBlock register allocator

Evan Cheng evan.cheng at apple.com
Fri Jun 22 13:13:49 PDT 2007


On Jun 22, 2007, at 7:39 AM, duraid at octopus.com.au wrote:
>
>
> chooseReg is of course where most of the time is spent, but truth be told,
> 98% of the time my application spends in LLVM is *not* in the register
> allocator. Or I should say *was* not - a quick patch by Evan Cheng earlier
> today to some scheduler code more than doubled LLVM's speed on large
> functions. However, there are still some very serious LLVM performance
> issues when codegenning large basic blocks. A couple of years ago,
> I clocked the LLVM JIT at roughly 10,000 cycles per byte of code
> generated. Currently, it's *well* over 200,000. (Again, this is on large
> functions. On small functions, the performance is presumably much better.)
> Of course, back then, LLVM didn't *have* a scheduler, and even things like
> instruction selection were simpler. But somewhere, some nasty speed issues
> have crept in when compiling huge functions. I'll do my best to try and sift
> through these over the next few weeks. (Or at least provide Evan with plenty
> of test cases and beg him to fix things. ;)

Instruction selection is now much more complex. We have also added
more passes, such as instruction scheduling and branch folding, and
LSR (loop strength reduction) is now enabled for X86, among other changes.

One thing you should do for the JIT is disable code size
optimizations such as branch folding. We usually don't like codegen
for static compilation and JIT to differ. However, I think it's time
to refine that policy. Please add an llc option that emulates JIT
codegen (except for the code emission part, of course). Perhaps
something like llc -emulate-jit? That will make it easier to
reproduce JIT codegen bugs.

Evan

>
>>  - PhysRegsUseOrder - you remove some elements from the middle of this
>> vector in removePhysReg. This is not a very efficient operation on
>> vectors, since it needs to copy the tail of the vector. I think using a
>> list data structure could be much more efficient for this purpose.
>
> Actually, PhysRegsUseOrder isn't really needed for BigBlock - it's
> a leftover from the local allocator. So there are even more brutal
> performance fixes to be made to BigBlock, however your suggestion
> applies to the Local allocator for sure!
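
To make the vector-versus-list point above concrete, here is a minimal
sketch. It is not the actual allocator code: VectorUseOrder, ListUseOrder,
markUsed, remove and snapshot are hypothetical names made up for
illustration, and registers are modeled as plain unsigned numbers. The
vector version pays a linear search plus a tail copy on every removal. The
list version keeps an iterator per register, so removal from the middle is
constant time.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <unordered_map>
#include <vector>

// Hypothetical vector-based use order (least recently used at the front).
struct VectorUseOrder {
  std::vector<unsigned> order;

  // Move reg to the most-recently-used end.
  void markUsed(unsigned reg) {
    remove(reg);
    order.push_back(reg);
  }

  // O(n): linear search, then erase() shifts every later element down.
  void remove(unsigned reg) {
    auto it = std::find(order.begin(), order.end(), reg);
    if (it != order.end())
      order.erase(it);
  }
};

// Hypothetical list-based use order with a side table of iterators.
struct ListUseOrder {
  std::list<unsigned> order;
  std::unordered_map<unsigned, std::list<unsigned>::iterator> pos;

  // Move reg to the most-recently-used end and remember where it lives.
  void markUsed(unsigned reg) {
    remove(reg);
    order.push_back(reg);
    pos[reg] = std::prev(order.end());
  }

  // O(1) on average: hash lookup, then unlink one list node. No copying.
  void remove(unsigned reg) {
    auto it = pos.find(reg);
    if (it == pos.end())
      return;
    order.erase(it->second);
    pos.erase(it);
  }

  // Copy the current order out, handy for comparing against the vector.
  std::vector<unsigned> snapshot() const {
    return std::vector<unsigned>(order.begin(), order.end());
  }
};
```

Both structures keep the same ordering; only the cost of removal differs.
For the small number of physical registers on most targets the vector may
still be competitive because of cache locality, so it is worth measuring
before switching.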
>
>> I think these changes may significantly improve the performance of
>> your BigBlock register allocator. I'll try to come up with some more
>> concrete proposals or even patches over the weekend or next week.
>
> Patches would be a dream - while I don't expect BigBlock to ever be
> more than ~25% of total codegen time, every little bit helps. :)
>
> Thanks for your comments,
>
>     Duraid
>
>