[llvm-commits] [LLVM, SwitchInst, case ranges] Auxiliary patch #1

Wed Dec 7 12:47:48 PST 2011

Ping.

-Stepan.

Stepan Dyatkovskiy wrote:
> ping.
>
> -Stepan.
>
> Stepan Dyatkovskiy wrote:
>> ping.
>> -Stepan.
>>
>> Stepan Dyatkovskiy wrote:
>>> Hello, Duncan.
>>>
>>> Duncan Sands wrote:
>>>    >   I guess Anton can comment on codegen, but the fact that it doesn't make
>>>    >   codegen harder has nothing to do with increasing the complexity of the
>>>    >   optimizers, since they work at the IR level. It may be that case ranges
>>>    >   allow the optimizers to do a better job. It may be that they simplify
>>>    >   the optimizers. But it also may be the opposite: they might make
>>>    >   switches harder to work with and reason about for no advantage. Which
>>>    >   is it? Do you have an example where case ranges would result in better
>>>    >   code, or make it easier to produce better code?
>>>
>>> I did an impact analysis for the new case ranges feature.
>>> 24 out of more than 100 optimizations are affected. 20 of the 24 just
>>> require integrating the new "case-range" type, i.e. a small code
>>> change. The remaining 4 require bigger changes. All affected
>>> optimizers are listed in the attached spreadsheet.
>>>
>>> The patches submitted in this branch only extend the functionality of
>>> the current classes. They don't break any existing optimizations and
>>> leave their speed unchanged.
>>>
>>> Well, let's enumerate the 4 optimizations that should be reworked.
>>>
>>> 1. LowerSwitch::Clusterify
>>>
>>> This method groups neighbouring cases (by value) that go to the same
>>> destination.
>>>
>>> For example:
>>>
>>> switch i32 %cond, label %default [
>>> i32 1, label %successorA
>>> i32 2, label %successorA
>>> i32 5, label %successorB
>>> i32 3, label %successorA
>>> i32 6, label %successorB
>>> ]
>>>
>>> will be grouped into two clusters:
>>>
>>> [[i32 1] .. [i32 3]], label %successorA
>>> [[i32 5] .. [i32 6]], label %successorB
>>>
>>> This method will work faster if the clusters are presented explicitly
>>> using the new case ranges feature.
>>>
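The grouping described above can be sketched roughly as follows. This is a minimal standalone illustration, not the actual LowerSwitch code: the `Case` and `Cluster` structs and the `clusterify` function are hypothetical names, destinations are modelled as plain strings, and case values are assumed to be distinct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration; not LLVM's own data structures.
struct Case {
  int64_t Value;
  std::string Dest;
};

struct Cluster {
  int64_t Low;
  int64_t High;
  std::string Dest;
};

// Sort the cases by value, then merge runs of consecutive values that share
// a destination into one [Low, High] cluster. Assumes distinct case values.
std::vector<Cluster> clusterify(std::vector<Case> Cases) {
  std::sort(Cases.begin(), Cases.end(),
            [](const Case &A, const Case &B) { return A.Value < B.Value; });
  std::vector<Cluster> Clusters;
  for (const Case &C : Cases) {
    if (!Clusters.empty() && Clusters.back().Dest == C.Dest &&
        Clusters.back().High + 1 == C.Value) {
      Clusters.back().High = C.Value; // extend the current cluster
    } else {
      Clusters.push_back({C.Value, C.Value, C.Dest}); // start a new cluster
    }
  }
  return Clusters;
}
```

With explicit case ranges in the IR, the sort-and-merge step would mostly be unnecessary, because each range would already arrive as one cluster.
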
>>> 2. SimplifyCFG.cpp, TurnSwitchRangeIntoICmp (static function)
>>>
>>> "Turns a switch that contains only an integer range comparison into a
>>> sub, an icmp and a branch." (from the function's comment). The algorithm
>>> that determines a "solid" case range should be changed.
>>>
>>> Now compare two switches (ignore the syntax of the second one; it is
>>> still the subject of another discussion):
>>>
>>> switch i32 %cond, label %default [
>>> i32 1, label %successorA
>>> i32 2, label %successorA
>>> i32 3, label %successorA
>>> ]
>>>
>>> and a hypothetical switch:
>>>
>>> switch i32 %cond, label %default [
>>> [[i32 1],[i32 3]], label %successorA ; case range [1..3]
>>> ]
>>>
>>> or even this one:
>>>
>>> switch i32 %cond, label %default [
>>> [[i32 1],[i32 2]], label %successorA ; case range [1..2]
>>> i32 3, label %successorA ; single case value "3"
>>> ]
>>>
>>> It's obvious that the last two switches will be processed faster than
>>> the first one. We don't need to analyse each case value separately.
>>> We already know that it is a range.
>>>
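As a rough illustration of the "sub, icmp and branch" form, here is a minimal sketch in plain C++ rather than IR. The function names are hypothetical, and the range is assumed to satisfy Low <= High; the point is only that a known range needs one subtraction and one unsigned comparison, while separate case values need one test each.

```cpp
#include <cassert>
#include <cstdint>

// One comparison per case value, as in the first switch (cases 1, 2, 3).
bool matchesOneByOne(uint32_t Cond) {
  return Cond == 1 || Cond == 2 || Cond == 3;
}

// Range form: subtract the lower bound, then do a single unsigned compare.
// If Cond < Low the subtraction wraps to a large value and the test fails.
// Assumes Low <= High.
bool matchesRange(uint32_t Cond, uint32_t Low, uint32_t High) {
  return Cond - Low <= High - Low;
}
```
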
>>> 3. SimplifyCFG.cpp, EliminateDeadSwitchCases (static function).
>>>
>>> Here the switch condition is analysed. We try to determine the "1" and
>>> "0" bits that MUST be in the condition value. If we find them, we look
>>> at the case values; if these bits are absent in a case value, we remove
>>> it, since it will never be equal to the condition.
>>> I need to think more about how to process case ranges here. At the very
>>> least, we can represent a case range as a set of separate values and
>>> apply the current algorithm to it. That slows the processing down a
>>> little, but the complexity itself will not increase. I'm sure there are
>>> also algorithms that can eliminate whole case ranges: e.g. we can apply
>>> the current algorithm to the high bits that are constant across the
>>> case range.
>>>
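The "constant high bits" idea could look something like the sketch below. It is only an illustration under stated assumptions: `KnownBits32`, `isDeadValue`, `commonHighBitsMask` and `isDeadRange` are hypothetical names, values are 32-bit unsigned, and Low <= High. The range check is conservative: a `false` result does not prove that the range is live.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of the condition: bits set in Zero are always 0,
// bits set in One are always 1.
struct KnownBits32 {
  uint32_t Zero;
  uint32_t One;
};

// A single case value is dead if it has a 1 where the condition must be 0,
// or a 0 where the condition must be 1.
bool isDeadValue(uint32_t V, KnownBits32 K) {
  return (V & K.Zero) != 0 || (~V & K.One) != 0;
}

// Mask of the leading bits shared by every value in [Low, High].
uint32_t commonHighBitsMask(uint32_t Low, uint32_t High) {
  uint32_t Diff = Low ^ High;
  // Smear the highest differing bit down to bit 0.
  Diff |= Diff >> 1;
  Diff |= Diff >> 2;
  Diff |= Diff >> 4;
  Diff |= Diff >> 8;
  Diff |= Diff >> 16;
  return ~Diff;
}

// The whole range is dead if its shared high bits already contradict the
// known bits of the condition.
bool isDeadRange(uint32_t Low, uint32_t High, KnownBits32 K) {
  uint32_t Mask = commonHighBitsMask(Low, High);
  uint32_t Prefix = Low & Mask;
  return (Prefix & K.Zero) != 0 || (~Prefix & Mask & K.One) != 0;
}
```

For example, if bits 4..7 of the condition are known to be zero, the range [0x10..0x1F] is rejected in one step, while [0x08..0x17] still needs the value-by-value fallback.
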
>>> 4. lib/Transforms/Scalar/LoopUnswitch.cpp (the set of methods).
>>>
>>> Just a quote from the LoopUnswitch.cpp header:
>>>
>>> [quote]
>>> This pass transforms loops that contain branches on loop-invariant
>>> conditions to have multiple loops. For example, it turns the left into
>>> the right code:
>>>
>>>   for (...)                  if (lic)
>>>     A                          for (...)
>>>     if (lic)                     A; B; C
>>>       B                      else
>>>     C                          for (...)
>>>                                  A; C
>>> [/quote]
>>>
>>> I also need to think more about unswitching case ranges here.
>>> Currently, loops with a switch instruction are unswitched value by
>>> value. Case values are not clustered before unswitching. For example,
>>> for the case range [0..9] we need to run the unswitch process 10 times!
>>> In theory, explicitly given case ranges and properly implemented
>>> unswitching should improve this optimization.
>>>
>>> So, as you can see, the complexity will not change, and some of the
>>> optimizations will even work faster.
>>>
>>> Regards,
>>> Stepan.
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits