[llvm-commits] [LLVM, SwitchInst, case ranges] Auxiliary patch #1

Wed Nov 2 11:42:26 PDT 2011

Hello, Duncan.

Duncan Sands wrote:
 > I guess Anton can comment on codegen, but the fact that it doesn't make
 > codegen harder has nothing to do with increasing the complexity of the
 > optimizers, since they work at the IR level. It may be that case ranges
 > allow the optimizers to do a better job. It may be that they simplify the
 > optimizers. But it also may be the opposite: they might make switches
 > harder to work with and reason about for no advantage. Which is it? Do you
 > have an example where case ranges would result in better code, or make it
 > easier to produce better code?

I did an impact analysis for the new case ranges feature.
24 out of more than 100 optimizations are affected. 20 of those 24 just
require integrating the new "case-range" type, i.e. a small code
change. The remaining 4 require bigger changes. All affected
optimizers are listed in the attached spreadsheet.

The patches submitted in this branch just extend the functionality of
the current classes. They don't break any existing optimizations and
leave their speed unchanged.

Well. Let's enumerate the 4 optimizations that should be reworked.

1. LowerSwitch::Clusterify

This method groups neighbouring cases (by value) that go to the same
destination.

For example:

switch i32 %cond, label %default [
    i32 1, label %successorA
    i32 2, label %successorA
    i32 5, label %successorB
    i32 3, label %successorA
    i32 6, label %successorB
]

will be grouped into two clusters:

    [[i32 1] .. [i32 3]], label %successorA
    [[i32 5] .. [i32 6]], label %successorB

This method will work faster if the clusters are presented explicitly
using the new case ranges feature.
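
The grouping described above can be sketched in standalone C++. This is only an illustration under my own assumptions, not the actual LowerSwitch code: the names CaseRange and clusterify are hypothetical, and destinations are modelled as plain strings instead of basic blocks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of one case range [low, high] and its destination label.
// A single case value is a range with low == high.
struct CaseRange {
  int64_t low;
  int64_t high;
  std::string dest;
};

// Sort the cases by value, then merge neighbours that are adjacent
// (next.low == prev.high + 1) and branch to the same destination.
std::vector<CaseRange> clusterify(std::vector<CaseRange> cases) {
  std::sort(cases.begin(), cases.end(),
            [](const CaseRange &a, const CaseRange &b) { return a.low < b.low; });
  std::vector<CaseRange> clusters;
  for (const CaseRange &c : cases) {
    assert(c.low <= c.high && "malformed range");
    // The first comparison guarantees c.low - 1 cannot overflow.
    if (!clusters.empty() && clusters.back().dest == c.dest &&
        clusters.back().high < c.low && clusters.back().high == c.low - 1) {
      clusters.back().high = c.high;  // extend the current cluster
    } else {
      clusters.push_back(c);          // start a new cluster
    }
  }
  return clusters;
}
```

If the switch already stores its cases as ranges, the input to such a routine is shorter, and in the best case it is already the final list of clusters.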

2. SimplifyCFG.cpp, TurnSwitchRangeIntoICmp (static function)

"Turns a switch that contains only an integer range comparison into a 
sub, an icmp and a branch." (from the function's comment). The algorithm
that determines a "solid" (contiguous) case range should be changed.

Now compare the following switches (ignore the syntax of the
hypothetical ones; it is still the subject of another discussion):

switch i32 %cond, label %default [
    i32 1, label %successorA
    i32 2, label %successorA
    i32 3, label %successorA
]

and a hypothetical switch:

switch i32 %cond, label %default [
    [[i32 1],[i32 3]], label %successorA ; case range [1..3]
]

or even this one:

switch i32 %cond, label %default [
    [[i32 1],[i32 2]], label %successorA ; case range [1..2]
    i32 3, label %successorA ; single case value "3"
]

It's obvious that the last two switches will be processed faster than
the first one. We don't need to analyse each separate case value; we
already know that it is a range.
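
To make the difference concrete, here is a minimal sketch of the two steps involved, written as standalone C++ and not taken from SimplifyCFG. The names Bounds, contiguousBounds and inRange are hypothetical. The first function is the per-value analysis that explicit case ranges would make unnecessary; the second is the "sub + icmp" form of the range test, assuming 32-bit unsigned arithmetic.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical inclusive bounds of a contiguous run of case values.
struct Bounds {
  uint32_t low;
  uint32_t high;
};

// Per-value analysis: sort the case values and check that they form one
// contiguous run. On success, write the bounds to 'out' and return true.
// 'out' is left untouched on failure.
bool contiguousBounds(std::vector<uint32_t> values, Bounds &out) {
  if (values.empty())
    return false;
  std::sort(values.begin(), values.end());
  for (std::size_t i = 1; i < values.size(); ++i)
    if (values[i] != values[i - 1] + 1)
      return false;
  out = Bounds{values.front(), values.back()};
  return true;
}

// The range test as one subtraction and one unsigned comparison:
// cond is in [low, high] exactly when (cond - low) <= (high - low),
// because values below 'low' wrap around to very large numbers.
bool inRange(uint32_t cond, Bounds b) {
  return cond - b.low <= b.high - b.low;
}
```

With an explicit case range the bounds are given directly, so only the second step remains.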

3. SimplifyCFG.cpp, EliminateDeadSwitchCases (static function).

Here the switch condition is analysed. We try to determine the "1" and
"0" bits that MUST be present in the condition value. If we find them,
we look at the case values; if a case value does not have these bits we
remove it, since it will never be equal to the condition.
I need to think more about how to process case ranges here. At least we
can represent a case range as a set of separate values and apply the
current algorithm to it. That slows the processing down a little, but
the complexity itself will not increase. I'm sure there are also
algorithms that can eliminate whole case ranges: e.g. we can apply the
current algorithm to the high bits that are constant across the case range.
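
A rough sketch of both ideas, again as standalone C++ and not the real SimplifyCFG code. KnownBits, isDeadValue, commonHighMask and isDeadRange are hypothetical names, and 32-bit values are an assumption. The range check is conservative: it only looks at the high bits shared by every value in the range, so it may keep a range that a value-by-value check would trim.

```cpp
#include <cstdint>

// Hypothetical summary of the condition: bits proven to be 0 and bits
// proven to be 1.
struct KnownBits {
  uint32_t zeros;
  uint32_t ones;
};

// A single case value is dead if it sets a known-zero bit or clears a
// known-one bit of the condition.
bool isDeadValue(uint32_t v, KnownBits k) {
  return (v & k.zeros) != 0 || (~v & k.ones) != 0;
}

// Mask of the high bits that are identical for every value in [low, high]
// (assumes low <= high). Everything from the highest differing bit of
// low and high downwards is cleared.
uint32_t commonHighMask(uint32_t low, uint32_t high) {
  uint32_t diff = low ^ high;
  diff |= diff >> 1;
  diff |= diff >> 2;
  diff |= diff >> 4;
  diff |= diff >> 8;
  diff |= diff >> 16;
  return ~diff;
}

// A whole range is dead if its constant high bits already contradict
// the known bits of the condition.
bool isDeadRange(uint32_t low, uint32_t high, KnownBits k) {
  uint32_t mask = commonHighMask(low, high);
  return (low & k.zeros & mask) != 0 || (~low & k.ones & mask) != 0;
}
```

For example, if the condition is known to be below 16, the range [16..31] is rejected in one step, while [12..20] is kept because values 12..15 are still possible.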

4. lib/Transforms/Scalar/LoopUnswitch.cpp (the set of methods).

Just a quote from the LoopUnswitch.cpp header:

[quote]
  This pass transforms loops that contain branches on loop-invariant conditions
  to have multiple loops.  For example, it turns the left into the right code.

   for (...)                  if (lic)
     A                          for (...)
     if (lic)                     A; B; C
       B                      else
     C                          for (...)
                                  A; C
[/quote]

I also need to think more about unswitching case ranges here.
Currently, loops with a switch instruction are unswitched value by value.
There is no clustering of case values before unswitching. For example,
for the case range [0..9] we need to run the unswitch process 10 times!
Theoretically, explicitly given case ranges and properly implemented
unswitching should make this optimization better.

So, as you can see, the complexity will not change, and some of the
optimizations will even work faster.

Regards,
Stepan.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CaseRanges - Passes Affected.xls
Type: application/vnd.ms-excel
Size: 11776 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111102/a785acd2/attachment.xls>