[llvm-commits] [LLVM, SwitchInst, case ranges] Auxiliary patch #1

Fri Dec 9 11:02:57 PST 2011

ping.

Stepan Dyatkovskiy wrote:
> Ping.
>
> -Stepan.
>
> Stepan Dyatkovskiy wrote:
>> ping.
>>
>> -Stepan.
>>
>> Stepan Dyatkovskiy wrote:
>>> ping.
>>> -Stepan.
>>>
>>> Stepan Dyatkovskiy wrote:
>>>> Hello, Duncan.
>>>>
>>>> Duncan Sands wrote:
>>>>     >    I guess Anton can comment on codegen, but the fact that it doesn't make codegen
>>>>     >    harder has nothing to do with increasing the complexity of the optimizers, since
>>>>     >    they work at the IR level. It may be that case ranges allow the optimizers to
>>>>     >    do a better job. It may be that they simplify the optimizers. But it also may
>>>>     >    be the opposite: they might make switches harder to work with and reason about
>>>>     >    for no advantage. Which is it? Do you have an example where case ranges would
>>>>     >    result in better code, or make it easier to produce better code?
>>>>
>>>> I made an impact analysis for the new case ranges feature.
>>>> 24 out of more than 100 optimizations are affected. 20 of the 24 just
>>>> require integration of a new "case-range" type, i.e. a small code
>>>> change. The remaining 4 require bigger changes. All affected
>>>> optimizers are listed in the attached spreadsheet.
>>>>
>>>> The patches submitted in this branch only extend the functionality of
>>>> the current classes. They don't break any existing optimizations and
>>>> leave their speed unchanged.
>>>>
>>>> Let's enumerate the 4 optimizations that should be reworked.
>>>>
>>>> 1. LowerSwitch::Clusterify
>>>>
>>>> This method groups neighbouring cases (by value) that go to the same
>>>> destination.
>>>>
>>>> For example:
>>>>
>>>> switch i32 %cond, label %default [
>>>> i32 1, label %successorA
>>>> i32 2, label %successorA
>>>> i32 5, label %successorB
>>>> i32 3, label %successorA
>>>> i32 6, label %successorB
>>>> ]
>>>>
>>>> will be grouped into two clusters:
>>>>
>>>> [[i32 1] .. [i32 3]], label %successorA
>>>> [[i32 5] .. [i32 6]], label %successorB
>>>>
>>>> This method will work faster if clusters are presented explicitly using
>>>> the new case ranges feature.
>>>>
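The grouping described above can be sketched in standalone C++. This is only an illustration of the idea, not the code in LowerSwitch: the `Case` and `Cluster` structs and the `clusterify` function are hypothetical stand-ins, and the sketch assumes case values are distinct, as they are in a switch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLVM's own types.
struct Case {
    int64_t value;
    std::string dest;  // successor label
};

struct Cluster {
    int64_t low;
    int64_t high;
    std::string dest;
};

// Sort the cases by value, then merge neighbours that are numerically
// adjacent and branch to the same destination.
std::vector<Cluster> clusterify(std::vector<Case> cases) {
    std::sort(cases.begin(), cases.end(),
              [](const Case &a, const Case &b) { return a.value < b.value; });

    std::vector<Cluster> clusters;
    for (const Case &c : cases) {
        // Assumption: case values are distinct, so after sorting each value
        // is strictly greater than the end of the previous cluster.
        assert(clusters.empty() || clusters.back().high < c.value);

        if (!clusters.empty() && clusters.back().dest == c.dest &&
            clusters.back().high + 1 == c.value) {
            clusters.back().high = c.value;  // extend the current cluster
        } else {
            clusters.push_back({c.value, c.value, c.dest});  // start a new one
        }
    }
    return clusters;
}
```

With explicit case ranges in the IR, the sort-and-merge step would mostly be handed its clusters ready-made, which is the speed-up claimed above.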
>>>> 2. SimplifyCFG.cpp, TurnSwitchRangeIntoICmp (static function)
>>>>
>>>> "Turns a switch that contains only an integer range comparison into a
>>>> sub, an icmp and a branch." (quoted from the function's comment). The
>>>> algorithm that determines a "solid" case range should be changed.
>>>>
>>>> Now compare the switches below (ignore the syntax of the range-based
>>>> ones; it is still the subject of another discussion):
>>>>
>>>> switch i32 %cond, label %default [
>>>> i32 1, label %successorA
>>>> i32 2, label %successorA
>>>> i32 3, label %successorA
>>>> ]
>>>>
>>>> and hypothetical switch:
>>>>
>>>> switch i32 %cond, label %default [
>>>> [[i32 1],[i32 3]], label %successorA ; case range [1..3]
>>>> ]
>>>>
>>>> or even this one:
>>>>
>>>> switch i32 %cond, label %default [
>>>> [[i32 1],[i32 2]], label %successorA ; case range [1..2]
>>>> i32 3, label %successorA ; single case value "3"
>>>> ]
>>>>
>>>> It's obvious that the last two switches will be processed faster than
>>>> the first one. We don't need to analyse each case value separately: we
>>>> already know that it is a range.
>>>>
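For reference, the "sub, an icmp and a branch" form that a solid range reduces to can be written out in plain C++. This is a minimal sketch of the arithmetic only, not the SimplifyCFG code; `inRange` is a hypothetical helper name, and the values are assumed to be 32-bit unsigned.

```cpp
#include <cassert>
#include <cstdint>

// True when x lies in the inclusive range [lo, hi]. One subtraction and one
// unsigned comparison replace one equality test per case value.
bool inRange(uint32_t x, uint32_t lo, uint32_t hi) {
    assert(lo <= hi);
    // If x < lo, the unsigned subtraction wraps around to a large value,
    // so the comparison fails, exactly as it does for x > hi.
    return x - lo <= hi - lo;
}
```

For the switches above, `inRange(cond, 1, 3)` decides between %successorA and %default. When the range is given explicitly, `lo` and `hi` are read directly from the case instead of being recovered from the individual values.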
>>>> 3. SimplifyCFG.cpp, EliminateDeadSwitchCases (static function).
>>>>
>>>> Here the switch condition is analysed. We try to determine the "1" and
>>>> "0" bits that MUST be present in the condition value. If we find them,
>>>> we look at the case values; if a case value does not have these bits, we
>>>> remove it, since it can never equal the condition.
>>>> I need to think more about how to process case ranges here. At
>>>> least we can represent a case range as a set of separate values and
>>>> apply the current algorithm to it. That slows the processing down a
>>>> little, but the complexity itself will not increase. I'm sure there are
>>>> also algorithms that can eliminate whole case ranges: e.g. we can apply
>>>> the current algorithm to the high bits that are constant across the
>>>> case range.
>>>>
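The high-bits idea can be sketched as follows. This is an assumption-laden illustration, not the existing EliminateDeadSwitchCases code: the `KnownBits` struct here and the `isDeadValue` and `isDeadRange` helpers are hypothetical, values are treated as 32-bit unsigned, and the range test is conservative (it may keep a range that is in fact dead, but never removes a live one).

```cpp
#include <cassert>
#include <cstdint>

// Bits of the switch condition proven to be 0 (zero) or 1 (one).
// Hypothetical struct for illustration only.
struct KnownBits {
    uint32_t zero;
    uint32_t one;
};

// A single case value is dead if it has a 1 where the condition is known to
// be 0, or a 0 where the condition is known to be 1.
bool isDeadValue(uint32_t v, KnownBits k) {
    assert((k.zero & k.one) == 0);
    return (v & k.zero) != 0 || (~v & k.one) != 0;
}

// Every value in [lo, hi] shares the bits above the highest position where
// lo and hi differ. If that shared prefix contradicts the known bits, the
// whole range is dead.
bool isDeadRange(uint32_t lo, uint32_t hi, KnownBits k) {
    assert(lo <= hi);
    assert((k.zero & k.one) == 0);
    uint32_t varying = lo ^ hi;  // highest set bit = highest differing bit
    varying |= varying >> 1;     // smear it down over all lower positions
    varying |= varying >> 2;
    varying |= varying >> 4;
    varying |= varying >> 8;
    varying |= varying >> 16;
    uint32_t fixed = ~varying;   // positions constant across the range
    return ((lo & fixed) & k.zero) != 0 || ((~lo & fixed) & k.one) != 0;
}
```

For example, if the condition is known to be below 16 (`zero = 0xFFFFFFF0`), the range [16..31] is rejected in one step instead of sixteen.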
>>>> 4. lib/Transforms/Scalar/LoopUnswitch.cpp (the set of methods).
>>>>
>>>> Just a quote from the LoopUnswitch.cpp header:
>>>>
>>>> [quote]
>>>> This pass transforms loops that contain branches on loop-invariant
>>>> conditions to have multiple loops. For example, it turns the left into
>>>> the right code.
>>>>
>>>>   for (...)                  if (lic)
>>>>     A                          for (...)
>>>>     if (lic)                     A; B; C
>>>>       B                      else
>>>>     C                          for (...)
>>>>                                  A; C
>>>> [/quote]
>>>>
>>>> I also need to think more about unswitching case ranges here.
>>>> Currently, loops with a switch instruction are unswitched value by
>>>> value. Case values are not clustered before unswitching. For example,
>>>> for the case range [0..9] we need to run the unswitch process 10 times!
>>>> Theoretically, explicitly given case ranges and properly implemented
>>>> unswitching should make this optimization better.
>>>>
>>>> So, as you can see, the complexity will not change, and some
>>>> optimizations will even work faster.
>>>>
>>>> Regards,
>>>> Stepan.
>>>>
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits