[llvm-commits] Please review the patch for IntegersSubsetMapping

Sun Jul 8 06:08:43 PDT 2012

Hi Benjamin,

This patch is related to PR1255.
The discussion concluded here (the last key post is from Chris):
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120213/136954.html 

After a long discussion we decided to change the type of a switch case
value from ConstantInt to a subset of integers (based on APInt). Chris
proposed a pretty nice way to pack these subsets. In general, it means
that for switches I need to replace operations on integers with
operations on subsets of integers.

Mainly I want to explain what I'm doing in the current patch, and say a
little about what IntegersSubsetMapping actually is.

First, we need to refresh a bit of set theory.

The patch contains changes to the "diff" operation.
The purpose of "diff" is to perform the "intersection" and "exclude"
operations on subsets of integers at the same time.
"Subset of integers" is the generic mathematical term here, and the
operations listed above are the standard set operations:
http://en.wikipedia.org/wiki/Intersection_%28set_theory%29
http://en.wikipedia.org/wiki/Complement_%28set_theory%29
A short description of each of them is below.
Suppose we have two subsets:

// U means union here and everywhere below, so LHS is the union of two
// intervals.
LHS = {[1..3] U [8..12]}

RHS = {[2..9]}

Then,

LHS intersect RHS = {[2..3] U [8..9]}

So "LHS intersect RHS" contains the numbers that belong to both LHS and
RHS.
Now consider "exclusion", or LHS-without-RHS (the complement of set RHS
with respect to set LHS):

EXCLUSION = LHS exclude RHS = {[1..1] U [10..12]}

"LHS exclude RHS" contains numbers that are in LHS, but not in RHS.

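A minimal sketch of these two operations, assuming plain int bounds
instead of APInt. Range, Subset, intersect and exclude are illustrative
names only, not the patch's API, and exclude is written for brevity
rather than speed:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative closed interval [lo..hi]; the patch itself uses APInt bounds.
struct Range {
  int lo, hi;
  bool operator==(const Range &o) const { return lo == o.lo && hi == o.hi; }
};

// A subset is a list of ranges, sorted and non-overlapping.
using Subset = std::vector<Range>;

// Numbers that belong to both lhs and rhs.
Subset intersect(const Subset &lhs, const Subset &rhs) {
  Subset out;
  std::size_t i = 0, j = 0;
  while (i < lhs.size() && j < rhs.size()) {
    int lo = std::max(lhs[i].lo, rhs[j].lo);
    int hi = std::min(lhs[i].hi, rhs[j].hi);
    if (lo <= hi)
      out.push_back({lo, hi});
    // Move past whichever range ends first.
    if (lhs[i].hi < rhs[j].hi)
      ++i;
    else
      ++j;
  }
  return out;
}

// Numbers that are in lhs but not in rhs.
Subset exclude(const Subset &lhs, const Subset &rhs) {
  Subset out;
  for (const Range &l : lhs) {
    assert(l.lo <= l.hi && "ranges must not be empty");
    int lo = l.lo;        // first number of l not yet classified
    bool covered = false; // set once rhs covers the tail of l
    for (const Range &r : rhs) {
      if (r.hi < lo)
        continue; // r lies entirely before the rest of l
      if (r.lo > l.hi)
        break; // r and everything after it lie past l
      if (r.lo > lo)
        out.push_back({lo, r.lo - 1}); // the gap before r survives
      if (r.hi >= l.hi) {
        covered = true;
        break;
      }
      lo = r.hi + 1;
    }
    if (!covered)
      out.push_back({lo, l.hi});
  }
  return out;
}
```

For LHS = {{1, 3}, {8, 12}} and RHS = {{2, 9}} this gives {[2..3] U
[8..9]} for intersect and {[1..1] U [10..12]} for exclude, matching the
examples above.
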
A SwitchInst may be represented as a *mapping* plus a default successor.
By that I mean that we combine all cases into a single subset, where
each number or range in the subset is linked to some successor: a
BasicBlock, a MachineBasicBlock (or something else):
A = { ([1..3] => successorA) U ([8..12] => successorB) }

What I really need is "intersect" and "exclude" for *mappings*.
Let

A = { ([1..3] => successorA) U ([8..12] => successorB) }
B = {[2..9] => successorA}

Then,

// Always keep the LHS successors in the result.
A intersect B = { ([2..3] => successorA) U ([8..9] => successorB) }
B intersect A = { ([2..3] => successorA) U ([8..9] => successorA) }
A exclude B = { ([1..1] => successorA) U ([10..12] => successorB) }
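
A small sketch of the "keep the LHS successors" rule for intersection.
Successors are plain strings here, and Case, Mapping and intersect are
hypothetical names for illustration; the nested loop is quadratic and
only meant to show the rule, not how the patch does it:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Illustrative case: the values lo..hi jump to the block named succ.
struct Case {
  int lo, hi;
  std::string succ;
};

// A mapping is a list of cases, sorted and non-overlapping.
using Mapping = std::vector<Case>;

// Intersection of two mappings; every piece keeps the LHS successor.
Mapping intersect(const Mapping &lhs, const Mapping &rhs) {
  Mapping out;
  for (const Case &l : lhs) {
    assert(l.lo <= l.hi && "cases must not be empty");
    for (const Case &r : rhs) {
      int lo = std::max(l.lo, r.lo);
      int hi = std::min(l.hi, r.hi);
      if (lo <= hi)
        out.push_back({lo, hi, l.succ}); // successor comes from lhs only
    }
  }
  return out;
}
```

With A and B as defined above, intersect(A, B) maps [8..9] to
successorB (taken from A), while intersect(B, A) maps both pieces to
successorA (taken from B).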

This looks complex, but some optimizations become simpler with it.
Consider the following code:

switch (c) {
   case '0'..'9' U 'A'..'F': // Call this subset of chars "DIGITS_SUBSET"
     switch(c) {
       case '0'..'1':
         tryProcessBit(c);
         break;
       case '2'..'9':
         tryProcessDecs(c);
         break;
       default:
         tryProcessHexs(c);
     }
     break;
   case '\\':
     processEscape();
     break;
   default:
     processSomethingElse();
}

The SimplifyCFG pass will fold these two nested switches into a single one:

switch (c) {
   case '0'..'1': // BITS_SPECIFIC_SUBSET
     tryProcessBit(c);
     break;
   case '2'..'9': // DECS_SPECIFIC_SUBSET
     tryProcessDecs(c);
     break;
   case 'A'..'F': // HEXS_SPECIFIC_SUBSET
     tryProcessHexs(c);
     break;
   default:
     processSomethingElse();
}

To do that we need to manipulate subsets. We need to replace the
successors of the parent switch with the matching successors of the
child switch. If some parent switch case sends us to the child switch,
but the child switch has no such case (look at tryProcessHexs), we need
to use the child switch's default successor for these cases.
Then we can remove the copied cases from the child switch, or possibly
remove the whole child instruction.
Currently this is implemented in
SimplifyCFG::SimplifyEqualityComparisonWithOnlyPredecessor
and in
SimplifyCFG::FoldValueComparisonIntoPredecessors.
Case values are still represented as single numbers there.

But if we represent a switch as the mapping described above,
this optimization can easily be done in the following way:

// These are the parent switch cases, represented as the
// mapping described above:
ParentSwitchMapping = {
                        (['0'..'9'] => ChildSwitchSuccessor) U
                        (['A'..'F'] => ChildSwitchSuccessor) U
                        (['\\'..'\\'] => ProcessEscapeSuccessor)
                       }

// This is the child switch.
ChildSwitchMapping = {
                       (['0'..'1'] => ProcessBitsSuccessor) U
                       (['2'..'9'] => ProcessDecsSuccessor)
                      }

Now we need to apply the following transformation to ParentSwitchMapping:

ToChild = ParentSwitchMapping.detachCasesFor(ChildSwitchSuccessor);

ToChildDef = ToChild exclude ChildSwitchMapping;
NewCasesInParent = ChildSwitchMapping intersect ParentSwitchMapping;
NewChildSwitchMapping = ChildSwitchMapping exclude ParentSwitchMapping;

ParentSwitchMapping.add(ToChildDef, ChildSwitchDefaultSuccessor 
/*replace successors*/);
ParentSwitchMapping.add(NewCasesInParent /*keep successors*/);

Then we can update the parent switch and remove the child switch if the 
new child mapping is empty.
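
The transformation above can be sketched roughly as follows. This is
only an illustration: successors are strings, and Case, Mapping, split,
Folded and foldChild are made-up names. detachCasesFor is the name used
above, but its body here is a guess. The sketch also assumes that the
child mapping is intersected with the detached cases (ToChild) rather
than with what is left in the parent.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Case {
  int lo, hi;
  std::string succ;
};
using Mapping = std::vector<Case>; // sorted, non-overlapping cases

// Splits lhs against rhs, keeping LHS successors: pieces also covered by
// rhs go to inside (lhs intersect rhs), the rest to outside (lhs exclude rhs).
void split(const Mapping &lhs, const Mapping &rhs, Mapping &inside,
           Mapping &outside) {
  for (const Case &l : lhs) {
    assert(l.lo <= l.hi && "cases must not be empty");
    int lo = l.lo; // first value of l not yet classified
    bool covered = false;
    for (const Case &r : rhs) {
      if (r.hi < lo)
        continue;
      if (r.lo > l.hi)
        break;
      if (r.lo > lo)
        outside.push_back({lo, r.lo - 1, l.succ});
      inside.push_back({std::max(lo, r.lo), std::min(l.hi, r.hi), l.succ});
      if (r.hi >= l.hi) {
        covered = true;
        break;
      }
      lo = r.hi + 1;
    }
    if (!covered)
      outside.push_back({lo, l.hi, l.succ});
  }
}

// Removes from m, and returns, all cases that jump to succ.
Mapping detachCasesFor(Mapping &m, const std::string &succ) {
  Mapping detached, kept;
  for (const Case &c : m)
    (c.succ == succ ? detached : kept).push_back(c);
  m = kept;
  return detached;
}

struct Folded {
  Mapping parent, child;
};

// Folds the child switch (reached through childBlock) into the parent.
Folded foldChild(Mapping parent, const Mapping &child,
                 const std::string &childBlock,
                 const std::string &childDefault) {
  Mapping toChild = detachCasesFor(parent, childBlock);
  Mapping ignored, toChildDef, newCasesInParent, newChild;
  split(toChild, child, ignored, toChildDef);        // ToChild exclude Child
  split(child, toChild, newCasesInParent, newChild); // intersect and exclude
  for (Case c : toChildDef) {
    c.succ = childDefault; // replace successors
    parent.push_back(c);
  }
  parent.insert(parent.end(), newCasesInParent.begin(),
                newCasesInParent.end()); // keep successors
  std::sort(parent.begin(), parent.end(),
            [](const Case &x, const Case &y) { return x.lo < y.lo; });
  return {parent, newChild};
}
```

For the DIGITS_SUBSET example the folded parent ends up with the cases
'0'..'1', '2'..'9', 'A'..'F' and '\\', and the returned child mapping is
empty, so the child switch can be removed.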

Here we need to perform three operations: two excludes and one
intersection. Intuitively we can see that the last two operations may
be combined:
  - They have the same arguments.
  - In the first operation we collect the items that are contained in
both subsets and discard all others.
  - In the second operation we collect the items that are only in
ChildSwitchMapping (the ones discarded by the previous operation).

So I implemented a diff operation that performs three operations in a
single pass:
"A exclude B" (l-exclude), "A intersect B" (intersection) and "B exclude
A" (r-exclude).

How the diff operation works.
On input we have two subsets, A and B, each holding sorted,
non-overlapping ranges. We then enumerate the points of the subsets in a
loop.
E.g. for A = { [0..5] } and B = { [3..7] } the points will come in the
following order: 0, 3, 5, 7.
A point may be one of 4 types:
- Point opens range in A
- Point closes range in A
- Point opens range in B
- Point closes range in B
Inside the loop we analyze the incoming point and our current state.
If we get a point that opens a range in A (in our example it is 0), and
no range in B is currently open, then we can "start" an interval for
"l-exclude".
Then (in my example) we get a point that opens a range in B (it is
number 3). A range in A is already open, so we close the range in
"l-exclude" (add range 0..2) and open a range in "intersection" at
point 3.
Then we get a point that closes a range in A (5). We close the range in
"intersection" (add range 3..5) and open a range in "r-exclude" at
point 6.
Then we get point 7, which closes a range in B. We close the range in
"r-exclude" (add range 6..7).

This algorithm is implemented as a for-loop in
IntegersSubsetMapping::diff. In this loop we enumerate the points of the
ranges in A (the "this" pointer) and B (RHS). We determine the type of
each point and send an event to the state machine
IntegersSubsetMapping::DiffStateMachine.

The generic algorithm has complexity O(N + M), where N is the size of
the left incoming mapping and M is the size of the right one.
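
To make the single-pass idea concrete, here is a simplified sketch on
plain subsets (no successors). It is not the DiffStateMachine from the
patch: it assumes long long bounds instead of APInt and uses hi + 1 as
the closing boundary, so each piece between two neighbouring boundaries
falls into exactly one result. Range, Subset, Event, DiffResult,
events, append and diff are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

struct Range {
  long long lo, hi;
  bool operator==(const Range &o) const { return lo == o.lo && hi == o.hi; }
};
using Subset = std::vector<Range>; // sorted, non-overlapping ranges

struct DiffResult {
  Subset lExclude, intersection, rExclude;
};

// A boundary at which membership in A or in B changes.
struct Event {
  long long pos;
  bool fromA, opens;
};

// Boundaries of one subset; already sorted because the ranges are.
std::vector<Event> events(const Subset &s, bool fromA) {
  std::vector<Event> ev;
  for (const Range &r : s) {
    assert(r.lo <= r.hi && "ranges must not be empty");
    ev.push_back({r.lo, fromA, true});
    ev.push_back({r.hi + 1, fromA, false});
  }
  return ev;
}

// Appends [lo..hi], gluing it to the previous range when they touch.
void append(Subset &out, long long lo, long long hi) {
  if (!out.empty() && out.back().hi + 1 == lo)
    out.back().hi = hi;
  else
    out.push_back({lo, hi});
}

DiffResult diff(const Subset &a, const Subset &b) {
  std::vector<Event> ea = events(a, true), eb = events(b, false), ev;
  // Linear merge of the two sorted boundary lists: N + M steps.
  std::merge(ea.begin(), ea.end(), eb.begin(), eb.end(),
             std::back_inserter(ev),
             [](const Event &x, const Event &y) { return x.pos < y.pos; });
  DiffResult res;
  bool inA = false, inB = false;
  for (std::size_t i = 0; i < ev.size(); ++i) {
    if (ev[i].fromA)
      inA = ev[i].opens;
    else
      inB = ev[i].opens;
    // Wait until every boundary at this position has been applied.
    if (i + 1 == ev.size() || ev[i + 1].pos == ev[i].pos)
      continue;
    long long lo = ev[i].pos, hi = ev[i + 1].pos - 1;
    if (inA && inB)
      append(res.intersection, lo, hi);
    else if (inA)
      append(res.lExclude, lo, hi);
    else if (inB)
      append(res.rExclude, lo, hi);
  }
  return res;
}
```

For A = { [0..5] } and B = { [3..7] } this yields l-exclude [0..2],
intersection [3..5] and r-exclude [6..7], the same ranges as in the
walkthrough.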

But the generic implementation is slow when the subsets contain single
numbers instead of ranges: each single number has to be converted to a
trivial range, which is expensive for the CPU. If we also take into
account that most switch instructions work with single numbers rather
than ranges, we can conclude that such cases need an optimization.

The patch contains an optimization for the case when LHS and RHS contain
single numbers only.
When I detect that, I invoke diffSingleNumbers. It is simpler than the
generic algorithm.
Here we also enumerate the points from both subsets in a single loop,
but our reaction is simpler:
- If the point came from A only - add it to l-exclude.
- If the point came from B only - add it to r-exclude.
- If the point came from both A and B - add it to intersection.
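
The three rules above amount to a merge of two sorted lists. A minimal
sketch, assuming the numbers are plain ints held in sorted vectors
without duplicates; the signature and the NumbersDiff struct are
hypothetical and not taken from the patch:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct NumbersDiff {
  std::vector<int> lExclude, intersection, rExclude;
};

// One pass over two sorted lists of single numbers.
NumbersDiff diffSingleNumbers(const std::vector<int> &a,
                              const std::vector<int> &b) {
  assert(std::is_sorted(a.begin(), a.end()) &&
         std::is_sorted(b.begin(), b.end()));
  NumbersDiff res;
  std::size_t i = 0, j = 0;
  while (i < a.size() && j < b.size()) {
    if (a[i] < b[j]) {
      res.lExclude.push_back(a[i++]); // point came from A only
    } else if (b[j] < a[i]) {
      res.rExclude.push_back(b[j++]); // point came from B only
    } else {
      res.intersection.push_back(a[i]); // point came from both
      ++i;
      ++j;
    }
  }
  // Whatever is left in one list has no partner in the other.
  res.lExclude.insert(res.lExclude.end(), a.begin() + i, a.end());
  res.rExclude.insert(res.rExclude.end(), b.begin() + j, b.end());
  return res;
}
```

No range is opened or closed here, so there is no state to track between
points.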

The main purpose of the "diff" operation is to update the methods in
SimplifyCFG. If we used the generic algorithm only, we would get big
regressions for the usual cases. The optimization for single numbers
keeps the performance of usual switches at the current level, and the
generic algorithm improves the performance of switches with
subset-based cases.

So that's all. Sorry for the long message. Any questions are welcome.

Thanks for your attention :-)

-Stepan.

Benjamin Kramer wrote:
>
> On 04.07.2012, at 08:15, Stepan Dyatkovskiy wrote:
>
>> A few words about diff operation.
>> Suppose we have two subsets of integers. In a single pass, the diff operation calculates
>> LHS exclude RHS
>> RHS exclude LHS
>> and
>> LHS intersect RHS.
>>
>> A subset is represented as a set of ranges and single numbers.
>> The generic algorithm may be greatly optimized for the case when both subsets contain single numbers only.
>> Some optimizations are possible when either LHS or RHS contains a large number of single numbers.
>
>
> Hi Stepan,
>
> I find it really hard to read this patch. Can you add some explanatory comments on what happens here?
> Also, if it is a performance optimization it would be really nice to have some numbers to justify the added complexity.
>
> On the code style side there are still some minor issues with this patch, some while loops are missing the space between "while" and the opening parenthesis.
>
> Last but not least an earlier iteration of this patch had caused PR13256, you should include a regression test for the bug so it doesn't crop up again.
>
> - Ben
>