[llvm-commits] [LLVM, SwitchInst, case ranges] Auxiliary patch #1
Stepan Dyatkovskiy
stpworld at narod.ru
Wed Nov 2 11:42:26 PDT 2011
Hello, Duncan.
Duncan Sands wrote:
> I guess Anton can comment on codegen, but the fact that it doesn't make
> codegen
> harder has nothing to do with increasing the complexity of the
> optimizers, since
> they work at the IR level. It may be that case ranges allow the
> optimizers to
> do a better job. It may be that they simplify the optimizers. But it
> also may
> be the opposite: they might make switches harder to work with and reason
> about
> for no advantage. Which is it? Do you have an example where case ranges
> would
> result in better code, or make it easier to produce better code?
I made an impact analysis for the new case ranges feature.
24 out of more than 100 optimizations are affected. 20 of the 24 just
require integrating the new "case-range" type, i.e. a small code
change. The remaining 4 require bigger changes. All affected
optimizers are listed in the attached spreadsheet.
The patches submitted in this branch only extend the functionality of
the current classes. They don't break any existing optimizations and
leave their speed unchanged.
Well, let's enumerate the 4 optimizations that should be reworked.
1. LowerSwitch::Clusterify
This method groups neighbouring cases (by value) that go to the same
destination.
For example:
switch i32 %cond, label %default [
i32 1, label %successorA
i32 2, label %successorA
i32 5, label %successorB
i32 3, label %successorA
i32 6, label %successorB
]
will be grouped into two clusters:
[[i32 1] .. [i32 3]], label %successorA
[[i32 5] .. [i32 6]], label %successorB
This method will work faster if the clusters are presented explicitly
using the new case ranges feature.
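As a rough illustration of the grouping described above, here is a minimal, self-contained sketch. It is not the actual LowerSwitch code: the `Case` and `Cluster` structs and the `clusterify` function are hypothetical stand-ins that use plain integers and strings instead of LLVM's constants and basic blocks, and it assumes all case values are distinct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for a switch case and a cluster of cases.
struct Case {
  int32_t Value;
  std::string Dest;
};

struct Cluster {
  int32_t Low;
  int32_t High;
  std::string Dest;
};

// Sort the cases by value, then merge neighbours that are adjacent by
// value and go to the same destination.
std::vector<Cluster> clusterify(std::vector<Case> Cases) {
  std::sort(Cases.begin(), Cases.end(),
            [](const Case &A, const Case &B) { return A.Value < B.Value; });
  std::vector<Cluster> Clusters;
  for (const Case &C : Cases) {
    // Assumption of this sketch: case values are distinct.
    assert(Clusters.empty() || Clusters.back().High < C.Value);
    if (!Clusters.empty() && Clusters.back().Dest == C.Dest &&
        static_cast<int64_t>(Clusters.back().High) + 1 == C.Value) {
      Clusters.back().High = C.Value; // extend the current cluster
    } else {
      Clusters.push_back({C.Value, C.Value, C.Dest}); // start a new cluster
    }
  }
  return Clusters;
}
```

With explicit case ranges, the sort-and-merge step would only have to join ranges that are already given, instead of rediscovering them from single values.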
2. SimplifyCFG.cpp, TurnSwitchRangeIntoICmp (static function)
"Turns a switch that contains only an integer range comparison into a
sub, an icmp and a branch." (quoted from the function's comments). The
algorithm that determines a "solid" case range should be changed.
Now compare two switches (don't look at the syntax of the second switch;
it is still the subject of another discussion):
switch i32 %cond, label %default [
i32 1, label %successorA
i32 2, label %successorA
i32 3, label %successorA
]
and hypothetical switch:
switch i32 %cond, label %default [
[[i32 1],[i32 3]], label %successorA ; case range [1..3]
]
or even this one:
switch i32 %cond, label %default [
[[i32 1],[i32 2]], label %successorA ; case range [1..2]
i32 3, label %successorA ; single case value "3"
]
It's obvious that the last two switches will be processed faster than the
first one. We don't need to perform an analysis for each separate case
value. We already know that it is a range.
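For reference, the "sub, icmp, branch" form that such a switch collapses to can be sketched as below. This is only an illustration of the arithmetic, not the SimplifyCFG implementation; the function name `inCaseRange` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// True when Cond lies in the inclusive range [Low, High].
// Subtracting Low in unsigned arithmetic makes every value below Low
// wrap around to a large number, so one unsigned compare is enough.
bool inCaseRange(uint32_t Cond, uint32_t Low, uint32_t High) {
  assert(Low <= High && "range bounds must be ordered");
  uint32_t Offset = Cond - Low; // the "sub"
  return Offset <= High - Low;  // the unsigned "icmp"
}
```

For the case range [1..3] above, `inCaseRange(Cond, 1, 3)` selects %successorA and anything else falls through to %default. When the range is given explicitly, `Low` and `High` are known up front and need not be recovered from the individual case values.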
3. SimplifyCFG.cpp, EliminateDeadSwitchCases (static function).
Here the switch condition is analysed. We try to determine the "1" and
"0" bits that MUST be in the condition value. If we find them, we look
at the case values; if these bits are absent from a case value, we
remove that case, since it can never be equal to the condition.
I need to think more about how to process case ranges here. At
least we can represent a case range as a set of separate values and
apply the current algorithm to it. That slows down the processing a
little, but the complexity itself does not increase. I'm sure there are
also algorithms that can eliminate whole case ranges: e.g. we can apply
the current algorithm to the high bits that are constant across the
case range.
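The high-bits idea can be sketched as follows. This is an assumption-laden illustration, not existing LLVM code: `KnownZero` and `KnownOne` are plain bit masks for the condition's known bits, and all function names are hypothetical. The range test is conservative: it only reports a range as dead when the bits shared by every value in the range already contradict the known bits.

```cpp
#include <cassert>
#include <cstdint>

// A single case value is dead if it has a 1 where the condition is known
// to be 0, or a 0 where the condition is known to be 1.
bool isDeadCase(uint32_t Value, uint32_t KnownZero, uint32_t KnownOne) {
  return (Value & KnownZero) != 0 || (~Value & KnownOne) != 0;
}

// Mask of the leading bits that are identical for every value in the
// unsigned range [Low, High].
uint32_t commonHighBitsMask(uint32_t Low, uint32_t High) {
  assert(Low <= High && "range bounds must be ordered");
  uint32_t Diff = Low ^ High;
  // Smear the highest differing bit into all lower positions.
  Diff |= Diff >> 1;
  Diff |= Diff >> 2;
  Diff |= Diff >> 4;
  Diff |= Diff >> 8;
  Diff |= Diff >> 16;
  return ~Diff;
}

// A whole range is certainly dead if its constant high bits already
// conflict with the known bits of the condition.
bool isDeadRange(uint32_t Low, uint32_t High, uint32_t KnownZero,
                 uint32_t KnownOne) {
  uint32_t Mask = commonHighBitsMask(Low, High);
  return (Low & Mask & KnownZero) != 0 || (~Low & Mask & KnownOne) != 0;
}
```

For example, if bit 4 of the condition is known to be zero, the range [16..19] is removed in one step, while [14..17] is kept because its values share no high bits that contradict the condition.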
4. lib/Transforms/Scalar/LoopUnswitch.cpp (the set of methods).
Just a quote from the LoopUnswitch.cpp header:
[quote]
This pass transforms loops that contain branches on loop-invariant
conditions to have multiple loops. For example, it turns the left into
the right code:

  for (...)                  if (lic)
    A                          for (...)
    if (lic)                     A; B; C
      B                      else
    C                          for (...)
                                 A; C
[/quote]
I also must think more about unswitching case ranges here.
Currently, loops with a switch instruction are unswitched value by value.
There is no clustering of case values before unswitching. For example,
for the case range [0..9] we need to run the unswitch process 10 times!
Theoretically, explicitly given case ranges and properly implemented
unswitching should make this optimization better.
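To make the [0..9] example concrete, here is a hand-written sketch of what unswitching on a whole range would produce. It is plain C++ rather than IR, the operations standing in for A, B and C are arbitrary, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Original loop: the loop-invariant value Lic is tested against the case
// range [0..9] on every iteration.
uint32_t loopWithRangeCheck(const std::vector<uint32_t> &Data, uint32_t Lic) {
  uint32_t Acc = 0;
  for (uint32_t X : Data) {
    Acc += X;      // A
    if (Lic <= 9)  // invariant test of the range [0..9]
      Acc ^= 0x5u; // B
    Acc *= 3u;     // C
  }
  return Acc;
}

// Unswitched form: one range test outside, two specialised loops inside.
uint32_t unswitchedOnRange(const std::vector<uint32_t> &Data, uint32_t Lic) {
  uint32_t Acc = 0;
  if (Lic <= 9) {
    for (uint32_t X : Data) {
      Acc += X;    // A
      Acc ^= 0x5u; // B
      Acc *= 3u;   // C
    }
  } else {
    for (uint32_t X : Data) {
      Acc += X;  // A
      Acc *= 3u; // C
    }
  }
  return Acc;
}
```

Unswitching the same loop value by value would instead peel off one loop copy per case value, i.e. ten rounds for this range.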
So, as you can see, the complexity will not change, and some
optimizations will even work faster.
Regards,
Stepan.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CaseRanges - Passes Affected.xls
Type: application/vnd.ms-excel
Size: 11776 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111102/a785acd2/attachment.xls>