[LLVMdev] Introducing a branch optimization and prediction pass

Török Edwin edwintorok at gmail.com
Mon Mar 31 13:55:00 PDT 2008

Evan Cheng wrote:
>> Good idea. However in absence of profiling info there should be some
>> heuristics, I am not sure what that could be ATM.
> This can be done later. I think the first step should be to get the  
> transformation piece ready.


> I think, at least for x86 (32 bit), we should start by targeting very  
> restricted cases, i.e. tiny basic blocks. Then you can add heuristics  
> to compute the register pressure, etc.


>> That would mean to apply this optimization on machine-basic-blocks,
>> right? I was thinking of a generic llvm IR optimization pass, but maybe
>> machine-basic-blocks pass is better, since we are doing something very
>> specific for targets.
> Right. I think you want to do this in codegen where you have target  
> information. Of course, this means propagating branch prediction  
> information through optimization passes. That can be a real pain.
> Once that complexity is solved, making use of the if-converter should
> be fairly easy. Right now, the if-converter predicates BBs which are
> completely predicable. You can teach it speculation by merging blocks
> and introducing merge blocks which are made of a number of move
> instructions. This should be pretty trivial for triangle sub-CFGs.
> However, the current if-converter is a post-register allocation pass.  
> That might not be right for your needs (since speculation will have to  
> introduce new temporaries). One possibility is to add the speculation  
> pass before register allocation, but leave the mov -> cmov job to the  
> if-converter.

I'll have a look at the if-converter sometime this weekend.
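
To make the speculation idea above concrete, here is a minimal sketch in C of
a triangle sub-CFG and the shape it could take after speculation. The function
names are hypothetical and are not taken from loops.c or from the prototype
pass. The second function computes the "then" value unconditionally into a new
temporary and then selects the result, which is the kind of move a later
if-conversion step could turn into a cmov. Whether a given compiler actually
emits a cmov here is an assumption, not something this sketch guarantees.

```c
#include <stddef.h>

/* Triangle sub-CFG: the "then" block is entered conditionally and
 * falls through to the join block. */
unsigned count_branchy(const unsigned char *buf, size_t len, unsigned char key)
{
    unsigned count = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == key)  /* conditional branch */
            count++;        /* tiny "then" block */
    }
    return count;
}

/* Speculated form: the "then" value is computed on every iteration into
 * a new temporary, and the merge step is a single conditional move. */
unsigned count_speculated(const unsigned char *buf, size_t len, unsigned char key)
{
    unsigned count = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned speculated = count + 1;              /* new temporary */
        count = (buf[i] == key) ? speculated : count; /* merge: mov, possibly cmov */
    }
    return count;
}
```

Both functions return the same result for any input. The speculated form
trades a possibly mispredicted branch for one extra live temporary, which is
why register pressure matters more on x86-32.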

>> I have a prototype of this optimization (without branch prediction),
>> and I can use that to experiment.
>> However it only handles some very simple cases, and I should rewrite
>> it to be more generic.
> Sounds good. Thanks.
> Evan

I wrote some small tests, see the attachments for a gcc 4.2, 4.3, icc,
llvm, llvm+branchopt comparison.
I didn't try tweaking compiler flags much.

On x86-64:

+ ./loops_llvm
CPU user time: 76661, speed: 1304.444Mb/s
CPU user time: 119993, speed: 833.382Mb/s
CPU user time: 173322, speed: 576.961Mb/s
CPU user time: 133324, speed: 750.053Mb/s
CPU user time: 676623, speed: 147.793Mb/s
CPU user time: 776616, speed: 128.764Mb/s
CPU user time: 739952, speed: 135.144Mb/s
CPU user time: 903274, speed: 110.708Mb/s
+ ./loops_llvm_branchopt
CPU user time: 79995, speed: 1250.078Mb/s
CPU user time: 123325, speed: 810.866Mb/s
CPU user time: 169989, speed: 588.273Mb/s
CPU user time: 133325, speed: 750.047Mb/s
CPU user time: 679956, speed: 147.068Mb/s
CPU user time: 329978, speed: 303.051Mb/s
CPU user time: 543298, speed: 184.061Mb/s
CPU user time: 596628, speed: 167.609Mb/s

On x86-32:

+ ./loops_llvm
CPU user time: 96661, speed: 1034.543Mb/s
CPU user time: 129991, speed: 769.284Mb/s
CPU user time: 256650, speed: 389.636Mb/s
CPU user time: 163323, speed: 612.284Mb/s
CPU user time: 739951, speed: 135.144Mb/s
CPU user time: 829946, speed: 120.490Mb/s
CPU user time: 803281, speed: 124.489Mb/s
CPU user time: 1086596, speed: 92.031Mb/s
+ ./loops_llvm_branchopt
CPU user time: 103326, speed: 967.811Mb/s
CPU user time: 129992, speed: 769.278Mb/s
CPU user time: 256650, speed: 389.636Mb/s
CPU user time: 166656, speed: 600.038Mb/s
CPU user time: 739951, speed: 135.144Mb/s
CPU user time: 593295, speed: 168.550Mb/s
CPU user time: 939939, speed: 106.390Mb/s
CPU user time: 1066597, speed: 93.756Mb/s

It benefits x86-64 on the last 3 tests (6, 7 and 8), while it clearly
benefits x86-32 only on 1 test (test 6), and hurts another one (test 7).
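
The comparison above can be reduced to speedup ratios. The sketch below is a
small illustration with hypothetical helper names. It assumes the reported
time is in microseconds and that each test processes 100 Mb. That assumption
is inferred from the logs (time multiplied by speed is about 10^8 on every
line) and is not confirmed by loops.c.

```c
/* Ratio of baseline time to optimized time; a value above 1.0 means
 * the branchopt build is faster. */
double speedup(double baseline_us, double optimized_us)
{
    return baseline_us / optimized_us;
}

/* Throughput in Mb/s for a given CPU user time in microseconds.
 * The 100 Mb workload size is an assumption inferred from the logs. */
double throughput_mb_per_s(double user_time_us)
{
    const double workload_mb = 100.0;
    return workload_mb / (user_time_us / 1e6);
}
```

With the numbers above, speedup(776616, 329978) is about 2.35 for test 6 on
x86-64, while speedup(803281, 939939) is about 0.85 for test 7 on x86-32,
which is the regression mentioned above.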

I haven't looked at the generated assembly yet; that is the next step.
However, tests 4 and 5 should have the same runtime, yet they differ by
roughly 5x (133324 vs. 676623 on x86-64).
I'll file bugs.

Best regards,
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: log_64
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080331/4b77e66a/attachment.ksh>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: loops.c
Type: text/x-csrc
Size: 4802 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080331/4b77e66a/attachment.c>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: log_32
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080331/4b77e66a/attachment-0001.ksh>
